
Decoding the Future with Artificial Intelligence


One technology that has dominated the news in recent times is Artificial Intelligence (AI). It has seen massive adoption across enterprises and startups, in both the private and public sectors. In fact, ChatGPT, which is now synonymous with Generative AI, reached over a million users in just five days!

With more innovations and advancements taking place, it makes us wonder how the inevitable AI-powered future would look in India. So, join us as we explore the impact of Artificial Intelligence on our future!

So, What Is Artificial Intelligence?


Artificial intelligence, better known as AI, essentially trains a computer to mimic human thinking and learn to perform certain tasks to solve problems. Yet this definition does not do the field justice, as AI is a diverse discipline with many
use cases in daily life. The technology is set to revolutionize how we work, communicate, collaborate, travel and go about our day-to-day activities. That’s why every artificial intelligence company in India is looking to create products and services that will make an AI-enabled future more feasible.

So, what does the future hold for this transformative technology?

 

What Is The Future With AI In India?

The Indian artificial intelligence market is expected to grow from USD 3.1 billion in 2020 to USD 7.8 billion by 2025, a compound annual growth rate of 20.2%. This shows great potential for artificial intelligence companies in India to address the growing demand for AI-driven offerings. Furthermore, according to NASSCOM, the ‘AI Skills Penetration Factor’ is highest for the Indian workforce, with Indian talent being three times more likely to possess AI skills among G20 and OECD nations. Naturally, the future of AI in India looks bright, with leading AI applications such as:

Virtual Assistants: AI-powered virtual assistants will understand and respond to natural language, spoken or written, making our interactions with devices more intuitive. Think of having Siri or Alexa embedded into the machines we interact with in our daily routines! In fact, Volkswagen has just introduced vehicles with in-built ChatGPT functionality. With strides in Natural Language Processing (NLP) technologies, virtual AI assistants are an inevitable outcome of man-machine interaction. Artificial intelligence companies in India such as Haptik and AI4Bharat are developing AI assistants that can converse in several Indian languages too.

Recommendation Systems: AI algorithms can make suggestions based on user preferences, as they can analyze vast volumes of data to find trends, patterns and relationships within them. This will enhance our everyday experiences in the future. From better recommendations on shopping sites to quicker routes on navigation apps, AI will deliver tailored recommendations across a wide spectrum of services. In India, Tata Elxsi offers AI-based recommendation systems for businesses in industries such as healthcare, banking and e-commerce.

Generative AI: Tools such as ChatGPT, DALL-E and Midjourney have taken the world by storm. As these technologies mature, generative AI models will get better at understanding contextual information and churning out tailored outputs. This will enable users to create text, video, images, code and other assets by simply entering a relevant prompt. For instance, FusionwaveAI is an Indian artificial intelligence company that provides AI solutions for Web 3.0 services, while SplineAI offers generative AI solutions for smart healthcare platforms.

Autonomous Driving: With improvements in AI coming thick and fast, self-driving cars are likely to become more sophisticated (and possibly the norm) in the future. Autonomous vehicles will boast improved safety and convenience, as AI models can make split-second decisions based on hundreds of inputs. In fact, the most promising outlook for AI-based computer vision is its increasing integration into vehicles, creating autonomous cars that can drive without human intervention. An AI company in India notable for its contribution to this domain is Minus Zero, a Bengaluru-based AI startup that unveiled India’s first fully-autonomous vehicle, zPod, in 2023.

Conclusion

While AI will have a major say in the future, every artificial intelligence developer must consider the risks and challenges. Today, AI development and research are largely confined to technology giants and research institutions. Yet the promise of democratized AI development will be critical in the future, where user-friendly platforms will make the power of AI accessible to everyone.

That’s what we aim to do at EaseMyAI – so, check out our AI products and be future-ready!

Blog

Loss function in Machine Learning

Loss function in AI and Deep learning

The loss function in AI and Deep Learning measures how far your AI model’s predictions deviate from the expected outcome.

We use different loss functions for regression analysis and classification analysis.

Let us understand through a simple example.

Consider you have created an AI model to predict whether the student fits in Grade A, A+, or A-. It calculates a few parameters like marks of all the subjects in theory, practical, and projects to predict the grades.

The system may not predict the correct grades on the first go, resulting in a loss. This loss is used in backpropagation to reduce faulty predictions.

Learn how to get better accuracy by using Activation functions – Click here 

Loss = Desired output – actual output (Expected – reality)

If your loss function value is low, your model will provide good results. Thus, we should try to get a minimum loss value (high accuracy).

After forward propagation, we find the loss and reduce it. Now, let us learn about different types of loss functions.

How to reduce loss? What is loss function in machine learning? What is cost function?

 

We calculate the loss using the loss function and cost function.

The terms cost function and loss function often refer to the same idea, but strictly speaking, the cost function measures the error over a group of examples, while the loss function deals with a single data instance.

 

Loss function: Refers to error for a single training example.
Cost function: Refers to an average loss over an entire training dataset.

Do you recollect the above example to grade the students? The loss function evaluates the model’s performance for one student; the cost function evaluates the performance of the entire class. Therefore, to optimize our model, we need to minimize the value of the cost function.
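To make the distinction concrete, here is a minimal sketch in plain Python (the squared-error loss and the grade values are illustrative, not from the article):

```python
# Squared-error loss for a single training example (one student).
def loss(expected, predicted):
    return (expected - predicted) ** 2

# Cost: the average loss over the entire training dataset (the whole class).
def cost(expected_grades, predicted_grades):
    losses = [loss(e, p) for e, p in zip(expected_grades, predicted_grades)]
    return sum(losses) / len(losses)

print(loss(90, 85))              # error for one student: 25
print(cost([90, 80], [85, 82]))  # average over the class: (25 + 4) / 2 = 14.5
```

Minimizing the cost function therefore improves the model on the whole class, not just one student.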

We use different types of loss functions for different types of Deep learning problems.

There are two types of problems in supervised Deep Learning

Regression and classification

Regression

The salary of an employee depends on the experience the employee has. Consequently, salary is a dependent variable known as the target, and experience is an independent variable known as the predictor.

Regression analysis explains to us how the value of the dependent variable changes based on the independent variable. It is a supervised learning technique that helps us to find the correlation between variables and predict the continuous output variable. We use it for prediction, forecasting, and determining the causal-effect relationship between variables.

The loss functions used for regression examples –

We use distance-based error as follows:

Error = y – y’
Where y = actual output & y’ = predicted output

The most used regression cost functions are below.

 

1. Mean error (ME):

Mean error = sum of all errors / number of training examples

ME = {(Y1 – Y1’) + (Y2 – Y2’) + (Y3 – Y3’) + … + (Yn – Yn’)} / n

For example, with two errors of +100 and –100:

ME = (+100 + (–100)) / 2 = 0 / 2 = 0

· Here, we calculate the error for each training data and derive their mean value.
· The errors can be positive or negative, sometimes resulting in a zero mean error even when individual predictions are far off.
· Therefore, this is used as a base for other cost functions.

2. Mean Squared Error (MSE)

 

We use the mean squared error to get rid of the zero mean error. MSE is also known as L2 loss. We consider the square of the difference between the predicted and the actual value.

MSE = {(Y1 – Y1’)² + (Y2 – Y2’)² + … + (Yn – Yn’)²} / n

Advantages

· It penalizes large deviations in predictions more heavily than MAE, since the errors are squared.
· It has only one minimum (its local minimum is also the global minimum); it converges faster and is differentiable.

Disadvantages

· It is not robust to outliers. These outliers contribute to higher prediction errors, and squaring them further magnifies the error.
· Because the equation is quadratic, outliers produce very large gradients that can dominate gradient descent.

Info byte

 

OUTLIERS:

 

Say we are predicting a candidate’s salary, which depends on the candidate’s experience, and we have a dataset of salaries and experience. In some cases, the salary might be very high or very low irrespective of experience. These exceptions are known as outliers in a dataset, and they contribute to higher prediction errors.

3. Mean Absolute Error (MAE)

 

MAE eliminates the ME shortcoming differently: here, we consider the absolute difference between the actual and predicted values and take their average. MAE is also known as L1 loss.
MAE = {|Y1 – Y1’| + |Y2 – Y2’| + … + |Yn – Yn’|} / n

Advantages

· Robust to outliers since we take the absolute value instead of squaring the errors.
· Gives better results even when the dataset has noise or outliers.

Disadvantages

· It is a more time-consuming function to compute and optimize.
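The three error measures above can be sketched in a few lines of plain Python (the example values are illustrative):

```python
def mean_error(y, y_pred):
    # Signed errors can cancel out, which is the weakness of ME.
    return sum(a - b for a, b in zip(y, y_pred)) / len(y)

def mse(y, y_pred):
    # Squaring removes the sign and magnifies large errors (outliers).
    return sum((a - b) ** 2 for a, b in zip(y, y_pred)) / len(y)

def mae(y, y_pred):
    # The absolute value removes the sign without magnifying outliers.
    return sum(abs(a - b) for a, b in zip(y, y_pred)) / len(y)

y, y_pred = [200, 100], [100, 200]   # errors: +100 and -100
print(mean_error(y, y_pred))  # 0.0 -> the errors cancel out
print(mse(y, y_pred))         # 10000.0
print(mae(y, y_pred))         # 100.0
```

Notice how the +100 and –100 errors cancel in ME but not in MSE or MAE.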

4. Huber Loss

 

Say we have the salary values of 100 employees, of which 12 have very high salaries and 12 have very low salaries. Though these extremes fit the definition of outliers, we cannot ignore them, as together they make up about a quarter of the dataset.
What do we do here? We use the Huber loss function.

The Huber loss function can be used to balance between MSE and MAE. It is a blend of mean squared error (MSE) and mean absolute error (MAE).

Huber loss = ½ · (y – y’)²  if |y – y’| ≤ δ
Huber loss = δ · (|y – y’| – ½ · δ)  otherwise

One thing we have to define here is the delta (δ) value. Delta is the hyperparameter that decides the boundary between the MSE and MAE behaviour: when the error is smaller than delta, the loss is quadratic; otherwise it is absolute. Finding the right delta value is usually an iterative process.
In other words, for errors smaller than delta, the Huber loss behaves like MSE (which is appropriate when no outliers are present); for errors greater than delta, it behaves like MAE.
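A minimal sketch of the Huber loss in Python (the delta and input values are illustrative):

```python
def huber(y, y_pred, delta=1.0):
    error = y - y_pred
    if abs(error) <= delta:
        return 0.5 * error ** 2                 # quadratic (MSE-like) near zero
    return delta * (abs(error) - 0.5 * delta)   # linear (MAE-like) for outliers

print(huber(10, 9.5, delta=1.0))  # small error -> quadratic branch: 0.125
print(huber(10, 5.0, delta=1.0))  # large error -> linear branch: 4.5
```

Outliers thus contribute linearly rather than quadratically, which keeps them from dominating the cost.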

Classification

Calculating the loss in classification is a little tricky.

Infobyte

 

What is entropy?

 

Entropy is a measure of the randomness, unpredictability, disorder, or impurity in a system.

Classification tasks are those in which the given data is classified into two or more categories.

For example, the classification of dogs and cats.

Types of Classification

In general, there are three main types/categories for Classification in machine learning:

A. Binary classification

This includes two target classes.
• Is it a cat in the picture? Yes or No

Object | Target | Prediction (probability)
Yes    | 1      | 0.8
No     | 0      | 0.2

• Is it a cat or a dog in the picture? Cat or Dog

B. Multi-Class Classification


Object | Target | Prediction (probability)
Dog    | 1      | 0.5
Cat    | 0      | 0.2
Wolf   | 0      | 0.3

The prediction of a model is a probability, and all these probabilities add up to 1. The target for multi-class classification is a one-hot vector, which has 1 in a single position and 0 everywhere else.

First, let us start calculating separately for each class and then sum it up.

Cross-entropy loss: CE = – Σ yi · log(pi), summed over all classes, where yi is the target for class i and pi is the predicted probability.

Binary cross-entropy is a particular case of cross-entropy used when our target is either 1 or 0. In a neural network, we obtain this prediction by using the sigmoid activation function –

How to create a classifier – Check here.

Say I want to predict whether the image contains a cat or not.

Binary cross-entropy: BCE = –[y · log(p) + (1 – y) · log(1 – p)]

This is as simple as saying that a prediction of 0.8 means the model is 80% sure it’s a cat and 20% sure it is not a cat.

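A minimal sketch of binary cross-entropy in Python, using the cat example above (the probabilities are illustrative):

```python
import math

def binary_cross_entropy(target, p):
    # target is 1 (cat) or 0 (not cat); p is the predicted probability of "cat".
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))

print(binary_cross_entropy(1, 0.8))  # confident and correct -> low loss (~0.223)
print(binary_cross_entropy(1, 0.2))  # confident and wrong   -> high loss (~1.609)
```

The loss grows quickly as the predicted probability moves away from the true label, which is exactly the feedback backpropagation needs.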

C. Multi-label classification

 

In multi-label classification, an image can contain more than one label. Therefore, our targets and predictions are no longer probability distributions that sum to 1. It is possible that all the classes are present in one image, or sometimes none at all.

Multi-label classification

Here, we look at this problem as a multiple-binary classification subtask. Let us first predict if there is a cat or not.

loss function for classification
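The multiple-binary-subtask idea can be sketched in Python (the labels and probabilities here are illustrative):

```python
import math

def bce(t, p):
    # Binary cross-entropy for one label.
    return -(t * math.log(p) + (1 - t) * math.log(1 - p))

def multilabel_loss(targets, preds):
    # One independent binary cross-entropy per label, then average.
    return sum(bce(t, p) for t, p in zip(targets, preds)) / len(targets)

# Image contains a cat and a dog but no wolf; note the targets need not sum to 1.
targets = [1, 1, 0]         # [cat, dog, wolf]
preds   = [0.9, 0.7, 0.2]   # independent sigmoid outputs, one per label
print(multilabel_loss(targets, preds))  # ~0.228
```

Each label gets its own sigmoid output and its own binary loss, so any combination of labels (including none) can be represented.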

 

Done! Great work, you managed to read this far. Want some more exciting examples? Check here.


 

 


Activation Function

Before we start with the activation function, let us quickly learn about a model. This article is a little longer because it includes examples that make the concepts easier to understand.

 

What is a model?

A model consists of 3 parts – input, actions on the input, and desired output.

We have input; we perform actions on it to get the desired output.

A model consists of input, action on input and output.

To know the basics of AI – click here

 

What is the activation function?


An activation function is an action we perform on the input to get output. Let us understand it more clearly.

We all know that deep learning (DL), a part of Artificial Intelligence (AI), is modeled on the neural networks of the human brain. For example, if you get a small burn, you may or may not scream; if you get a severe burn, you shout so loudly that the entire building knows.

 

Similarly, an activation function decides whether a neuron must be activated or not. It is a function used in DL which outputs a small value for small inputs and a large value if the input exceeds the threshold.

If the inputs are large enough (like a severe burn), the activation function activates, otherwise does nothing. In other words, an activation function is like a gate that checks that an incoming value is greater than a critical number (threshold).

 

Like the neuron-based model in our brain, the activation function decides what must be forwarded to the next neuron. The activation function takes the output from the previous cell (neuron) and converts it to input for the next cell.

 

Human Analogy

An old man giving chocolate

You see a senior citizen distributing free chocolates; your brain senses it as a tempting offer and passes a signal to the next neurons, telling your legs to start running towards him (the output of the preceding neuron).

Once you reach there, you extend your hand to get the chocolate. So the output of every neuron is the input for your next action.

 

 

 

Why is activation function important?

The activation function in a neural network decides whether or not a neuron is activated and its output passed on to the next layer. It determines whether the neuron’s input to the network is relevant for prediction, detection, and more.

 

It also adds non-linearity to neural networks and helps to learn powerful operations.

 

If we remove the activation function from a feedforward neural network, the network reduces to a simple linear operation or matrix transformation of its input; it would no longer be capable of performing complex tasks such as image recognition.

 

Now let us discuss some commonly used activation functions.

 

1. Sigmoid Activation Function

 

Mainly used to solve non-linear problems. A non-linear problem is where the output is not proportional to the change in the input. We can use the sigmoid activation function to solve binary classification problems.

Consider an example,

Students appear for an examination, and the faculty designs an AI model to declare the results. They set the criterion that students scoring more than 50 percent pass and those below 50 percent fail. So the inputs are the percentages, and the binary classification takes place using the sigmoid activation function.

 

If the percentage is 50 percent or above, it will give the output 1(pass)

Otherwise, it will give the output 0 (fail).

 

Output value – 0 to 1

If value >= 0.5, Output = 1

If value < 0.5, Output = 0

Derivative of sigmoid – 0 to 0.25.
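A minimal Python sketch of the sigmoid function and its derivative (the inputs are illustrative):

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1 - s)  # peaks at 0.25 when x = 0

print(sigmoid(0))             # 0.5 -> the pass/fail decision boundary
print(sigmoid(5))             # close to 1 (pass)
print(sigmoid_derivative(0))  # 0.25 -> the largest gradient sigmoid can give
```

The derivative never exceeding 0.25 is exactly what limits weight corrections in deep networks, as discussed below.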

 

 

 

 

What happens in the neural network?

 

A weight is assigned to each input in the neural network. Different inputs have different weights. Each weight is multiplied by its input, and at the next layer all the products (w·x) are added.

Neural Network

 

 

∑ wi·xi = w1·x1 + w2·x2 + … + wn·xn

 

Based on these weights and activation, we get an output. Naturally, the system might make some mistakes while learning. (It might consider 55% as fail). In this case, to teach the system better, we take the derivative of the function and send it back to change the weights for correction. (Like a feedback mechanism)

 

Glance at the formula for your understanding; skip it if it confuses you.

New weight = old weight – learning rate × derivative (gradient) of the loss

The derivative of the function is crucial for this feedback mechanism and the weight corrections. Its range is only 0–0.25, which limits the corrections. This feedback mechanism is backpropagation: the outputs are fed back as corrections to improve the accuracy.
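A minimal sketch of this weight-correction step in Python (the learning rate and gradient values are illustrative):

```python
def update_weight(weight, gradient, learning_rate=0.1):
    # New weight = old weight - learning rate * gradient of the loss.
    # This is the feedback step that nudges the network after a wrong
    # prediction (e.g. when it marked 55% as fail).
    return weight - learning_rate * gradient

w = 0.5
w = update_weight(w, gradient=0.2)
print(round(w, 4))  # 0.48 -> the weight moved slightly against the error
```

Repeating this update over many examples is how the network gradually corrects its mistakes.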

 

Pros

  • Gives a smooth gradient while converging, preventing jumps in output values.
  • One of the best normalizing functions: it squashes any input into the range 0 to 1.
  • Gives a clear prediction (classification) with 1 and 0, like pass/fail in the above example.

 

Cons

  • Prone to the vanishing gradient problem. The derivative ranges between 0 and 0.25, so in deep neural networks the gradients shrink layer after layer until the weights barely update. The more hidden layers a network has (the deeper it is), the more easily this problem occurs.
  • Not a zero-centered function (its outputs are always positive rather than centered around 0).
  • Computationally expensive function (exponential in nature).

 

2. Tanh Activation Function

Tanh Activation function

Tanh is the hyperbolic tangent function. We can also use the tanh function to solve binary classification problems. In the tanh activation function, the values range between –1 and 1, and the derivatives of tanh range between 0 and 1.

 

Note – To solve the binary classification problem, we can use tanh for the hidden layer (to improve the vanishing gradient problem) and sigmoid for the output layer. However, the chances of a vanishing gradient remain.

 

Pros

• It is a smooth gradient converging function.

• Zero-centric function, unlike Sigmoid.

  

Cons

• Derivatives of tanh function range between 0-1. It is better than the sigmoid activation function but does not solve the vanishing gradient problem in backpropagation for deep neural networks.

• Computationally expensive function (exponential in nature).
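A quick Python sketch comparing tanh and its derivative with sigmoid’s (the inputs are illustrative):

```python
import math

def tanh_derivative(x):
    # d/dx tanh(x) = 1 - tanh(x)^2, which ranges from 0 to 1.
    return 1 - math.tanh(x) ** 2

print(math.tanh(0))        # 0.0 -> zero-centered, unlike sigmoid's 0.5
print(tanh_derivative(0))  # 1.0 -> a larger peak gradient than sigmoid's 0.25
```

The larger peak gradient is why tanh mitigates (but does not solve) the vanishing gradient problem.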

 

3. ReLU Activation Function

 

Relu Activation Function  

ReLU is the Rectified Linear Unit, currently the most popular activation function. It is cheap to compute because the function itself is piecewise linear. Range of values of ReLU: 0 to infinity.

ReLU = max(0, x)

Derivative of ReLU: 0 for negative inputs, 1 for positive inputs.

 

Pros

• Deals with vanishing gradient problems.

• Computationally inexpensive function (linear in nature).

• Much faster calculation speed.

 

Cons

• If a neuron’s input is always negative, its gradient becomes 0 and the neuron is completely dead during backpropagation (the dying ReLU problem).

• Not a zero-centric function.

 

 

4. Leaky ReLU Activation Function

Leaky Relu Activation Function

Use leaky relu to solve the dead ReLU problem. In leaky relu, the negative values will not be zero. The derivative will have a small value when a negative number is entered.

Leaky ReLU = max(0.01x , x)

As for the ReLU activation function, the gradient is 0 for all the values of inputs less than zero, which would deactivate the neurons in that region and may cause a dying ReLU problem.

 

Leaky ReLU is defined to address this problem. Instead of defining the ReLU activation function as 0 for negative values of inputs(x), we define it as an extremely small linear component of x. Here is the formula for the Leaky ReLU activation function

 

f(x)=max(0.01*x , x)

 

This function returns x if it receives any positive input, but for any negative value of x, it returns a small value that is 0.01 times x. Thus it gives an output for negative values as well. The gradient on the left side of the graph is a non-zero value. We no longer encounter dead neurons in that region.

 

Pros

  • Solves the dead neuron problem.
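A minimal Python sketch of ReLU and Leaky ReLU side by side (the 0.01 slope follows the formula above):

```python
def relu(x):
    return max(0, x)

def leaky_relu(x, slope=0.01):
    # Negative inputs get a small non-zero output, so the neuron never "dies".
    return max(slope * x, x)

print(relu(-5), relu(3))              # 0 3
print(leaky_relu(-5), leaky_relu(3))  # ~-0.05 and 3
```

For negative inputs, ReLU’s gradient is exactly zero while Leaky ReLU’s stays at the small slope, which is what keeps those neurons trainable.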

 

5. Elu Activation Function

 

Elu Activation Function

ELU stands for Exponential Linear Units.

Whenever the x value is greater than 0, we use the x value itself; otherwise, we apply the exponential function below.

y = x ;  if x > 0

y = α · (e^x – 1) ;  if x ≤ 0

Pros

• Gives smoother convergence for any negative value.

 

Cons

• Slightly computationally expensive because of the exponential term.

 

 

 

6. PReLU Activation Function

Prelu Activation Function

PReLU stands for Parametric ReLU. It has a learnable parameter a that tunes the negative slope based on what the network learns (unlike the fixed zero in ReLU and 0.01 in Leaky ReLU):

f(x) = max(a·x, x)

If a = 0, this reduces to ReLU.

If a = 0.01, this is Leaky ReLU.

If a is a learnable parameter, this is PReLU.

 

Pros

  • The learnable parameter fine-tunes the activation function during training, instead of the fixed zero slope of ReLU or 0.01 of Leaky ReLU.

 

 

 

7. Swish Activation Function

 

Swish Activation Function

Swish is a smooth, continuous function, unlike ReLU, which is a piecewise linear function. Swish allows a small number of negative values to propagate, while ReLU thresholds all negative values to zero; this matters for deep neural networks. Swish(x) = x · sigmoid(x): it is a self-gating function, since it modulates the input by using its own sigmoid as a gate, a concept first introduced in Long Short-Term Memory (LSTM) networks. A trainable version of this gate tunes the activation function further and optimizes the neural network.

  

Pros

• Deals with vanishing gradient problem.

• Its output is a compromise between ReLU and the sigmoid function, which helps to normalize the output.

 

Cons

• Cannot find out derivatives of zero.

• Computationally expensive function (like sigmoid, it involves an exponential).

 

 

 

 

8. Softmax Activation Function

Softmax is used for solving multiclass classification problems. It finds out different probabilities for different classes.

It is used in the output layer, for neural networks that classify the input into multiple categories.

 

Softmax Activation Function
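A minimal Python sketch of softmax (the raw scores are illustrative):

```python
import math

def softmax(scores):
    # Subtract the max score for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]   # raw network outputs for, say, [dog, cat, wolf]
probs = softmax(scores)
print(probs)       # one probability per class, highest for the highest score
print(sum(probs))  # ~1.0 -> a valid probability distribution
```

Because the outputs always sum to 1, the class with the largest probability can be read off directly as the prediction.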

 

Tips for beginners

Q – Which activation function solves the binary classification problem?

A – For the hidden layers, use ReLU/PReLU/Leaky ReLU, and for the output layer, use the sigmoid activation function.

 

Q – Which activation function solves the multiclass classification problem?

A – For the hidden layers, use ReLU/PReLU/Leaky ReLU, and for the output layer, use the softmax activation function.

 

Well done! You made it all the way to the end.

For more activation functions – Click here

Authors: Ankita Gotarne & Janvi Bhanushali


Image Classification

What is Image Classification?

Image classification means assigning labels to an image based on a predefined set of classes.

Practically this means that our task is to analyze an input image and return a label that categorizes the image. That label is always from a predefined set of possible categories.

For example – Check here.

Let us understand image classification through an analogy.

Explanation of image classification through the body parts example

In a fourth-standard classroom, teacher Smita is teaching organs of the body to students. The teacher will show the children an image of each organ and give a title/label for it. She will show an image of a heart and point out to students that this is the heart. Similarly, she will show images of all the organs with their labels. The teacher will repeat this exercise and do revisions until it is clear to the students which organ looks like what.

In image classification, we teach the system by showing images and labels of predefined categories.

How do we create image classification models? How do we teach the systems to classify the images accurately? 

We need to follow some steps to create an Image Classifier. Technically, we need to follow a classification pipeline to train the system to classify images.

Classification Pipeline

Image classification block diagram

The basic idea is to build an image classification model with Convolutional Neural Networks. We use a data-driven approach instead of coding a rule-based algorithm to classify images. In a data-driven approach, we supply examples of what each category looks like and then teach our algorithm to recognize the differences between the categories using these examples.

We call these examples – the training dataset. It consists of images and labels associated with each image like {tom, jerry, spike}. 

It is crucial to give these examples to the system for supervised learning. These labels teach the system how to recognize each category. (Recall the organs of the body example – how the teacher points out which organ looks like what)

Now that we know what an image classifier model is, let us understand how to create a deep-learning image classifier model step by step.

Classification Pipeline:

Image Classification steps: 1. Collect Dataset 2. Split Dataset 3. Autotune 4. Train and Test

The classification pipeline has 5 steps: 

  1. Collect Data: collect and preprocess the raw data.  
  2. Split Data: split the preprocessed data into train, validation, and test data. 
  3. Autotune: find the best parameters on the validation data. 
  4. Train: train the final model with the best parameters on all the data. 
  5. Test: get metrics and predictions on test data. 

Step 1: Gather your Dataset

We need images and labels associated with each image. These images and their labels form our dataset. The labels should be from a finite and predefined set of categories like:

Categories – tom, jerry, spike.

Things to keep in mind:

  • The number of images from each category should be approximately uniform. Like 1000 images for Tom, 1000 for Jerry, and 1000 for Spike.
  • If we keep 2000 images for Jerry, our classifier will become naturally biased to this heavily represented category.
  • To prevent bias, avoid class imbalance and gather a uniform number of images for each category.

Step 2: Split Your Dataset

After gathering the initial data, we split it into two parts:

  1. A training set
  2. A testing set

A training set teaches our classifier what each category looks like. The classifier makes predictions on input data and then corrects itself if predictions are wrong.

After the classifier is trained, we evaluate the performance on a testing set.

You can split the training and testing set in common ratios, for example 80% for training and 20% for testing.

Validation Set :

This data is from the training data and used as “fake test data” so we can tune our hyperparameters (Autotuning). We generally allocate 10%-20% of the training dataset for validation.
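A minimal sketch of such a split in plain Python (the fractions, seed, and file names are illustrative):

```python
import random

def split_dataset(samples, train_frac=0.7, val_frac=0.15, seed=42):
    # Shuffle first so each split is representative of every category.
    data = samples[:]
    random.Random(seed).shuffle(data)
    n_train = int(len(data) * train_frac)
    n_val = int(len(data) * val_frac)
    train = data[:n_train]
    val = data[n_train:n_train + n_val]
    test = data[n_train + n_val:]   # whatever remains is the test set
    return train, val, test

images = [f"img_{i}.jpg" for i in range(100)]   # placeholder file names
train, val, test = split_dataset(images)
print(len(train), len(val), len(test))  # 70 15 15
```

The validation slice is used for autotuning hyperparameters, while the test slice is held out until the final evaluation.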

Step 3: Train Your Network

Once we are ready with all sets of the training data, we can start training our network. Our goal is to teach our neural network each category in our labeled data. When the model makes a mistake, it learns and improves itself.

Step 4: Evaluate

Last, we need to evaluate the performance of our trained network. We present each of the images in our testing dataset to the network and ask it to predict the label for that image. We tabulate the predictions of the trained model and compare them to the actual category of the image. Thus, we can determine the number of classifications our model got correct.

Image Classification Output

A deep-learning image classifier is ready, using a data-driven approach and supervised learning method. 

To create your own AI model – Click here



Challenges in Image Classification

What is image classification?

Image classification means assigning labels to an image based on a predefined set of classes.

Practically, this means that our task is to analyze an input image and return a label that categorizes the image. The label is always from a predefined set of possible categories.
For example, let’s assume that our set of possible categories includes:
categories = {tom, jerry}

Our classification system could assign multiple labels to the image via probabilities, such as:

jerry: 95%; tom: 5% for the image on the left side.

jerry: 10%; tom: 90% for the image on the right side.

To learn how to create an image classifier – Click here

What are the challenges in image classification?

Below are the challenges that we face while doing image classification :

  1. Semantic gap: We can clearly see the difference between an image that contains a dog and an image that contains a cat, but a computer only sees a big matrix of numbers. The difference between how we perceive an image and how the computer sees it (a matrix of numbers) is called the semantic gap.
Computer vision - how humans see and how computers see

Computer vision – difference between how humans see and computers see.

2. Viewpoint variation: Based on how the object is photographed and captured, it can be oriented in multiple dimensions.

Viewpoint Variation - a car captured from different angles

Viewpoint Variation – Same car from different angles

3. Scale variation: Have you ever ordered a small, medium, or large pack of fries at McDonald’s? They are all the same – a pack of fries – but of different sizes. Furthermore, the same pack of fries will look dramatically different when photographed up close versus from farther away. Image classification methods must be tolerant of these types of scale variations.
Scale variation is a challenge of Image Classification

Scale Variation – Same pack of fries captured from different distances

4. Deformation: For those of you familiar with the television series Popeye, we can see Olive Oyl in the image. As we all know, Olive is elastic, stretchable, and capable of contorting her body into many different poses. We can look at these images of Olive as a type of object deformation – all the images contain the Olive character, yet they are all dramatically different from each other.

Deformation is one of the challenges of Image Classification

  5. Occlusions: In the image on the left side, we have a picture of a cat. In the image on the right side, the same cat is resting underneath the covers, occluded from our view. The cat is still clearly in both images – she’s just more visible in one image than the other. Image classification algorithms should still be able to detect and label the presence of the cat in both images.
    Image Classification Challenge - Occlusion

    Occlusion – The same object is more visible in one image

  6. Illumination: The image on the left side was photographed with standard overhead lighting, while the image on the right side was captured with little lighting. We are still examining the same cupcake – but based on the lighting conditions, the cupcake looks dramatically different.


Challenge of Image Classification - Illumination

Illumination – objects look different in different lighting

7. Background clutter: Ever played a game of ‘spot the bird’? If so, you know the goal is to find the beautiful bird before the others do. These games are more than just entertainment for children – they are also a perfect representation of background clutter. You can clearly see the Himalayan black-lored tit in the image on the left side, but the image on the right side is very noisy and has a lot of background clutter. We are interested in one particular object in the image; however, due to all the “noise”, it is not easy to spot the bird. If it is not easy for us, imagine how hard it is for a computer with no semantic understanding.

Image Classification challenge - Background clutter

Background clutter – The background noise makes it difficult to spot the bird in the right-side image

  8. Intra-class variation: The canonical example of intra-class variation in computer vision is the diversity of dog breeds. There are many different breeds – some used by the military, some as pets, some as guards – but a dog is still a dog. Our image classification algorithms must be able to categorize all these variations correctly.
    Intra-class variation is one of the challenges on Image Classification

    Intra-class variation – Different breeds of dogs

Watch this tutorial to create your own AI model – Click here

Authors:

 

Blog

History of Artificial Intelligence

Artificial Intelligence is not a concept of our times alone, but one that dates back to ancient Greece. The idea that an inanimate object can come to life is not just a sci-fi plot; it is much older than you might imagine. Myths of mechanical men and robots appear in ancient Greek and Egyptian lore. However, the term "Artificial Intelligence" was coined by John McCarthy only in 1955. Let us glance through a brief history of AI:

History of AI - Alan Turing Test, AI program, John McCarthy, Eliza, Wabot, Boom of AI, World Chess Champion, 1st AI vacuum cleaner, AI in Netflix, Chatbot, Google Duplex.

What is AI? Check out this blog with exciting examples – Click here

Infobytes

Alan Turing Test
Turing Test Description Artificial Intelligence

It is a test to determine whether a computer can think like a human being or not. 

It consists of three participants – a human evaluator (X) on one side, and a human (A) and a computer (B) on the other. The evaluator asks both candidates a series of questions. If the evaluator (X) cannot tell which candidate is the human and which is the computer – that is, if the computer successfully mimics the human – the computer passes the Turing Test.

To date, no AI has passed the Turing test, but some came pretty close.  

 
ELIZA – First Chatbot
Eliza - First Chatbot using Artificial Intelligence

The first chatbot

Bots are able to hold human-like interactions because they are powered by two technologies: artificial intelligence and natural language processing.

ELIZA aimed to trick its users into believing that they were having a conversation with a real human being.

ELIZA operates by recognizing keywords or phrases in the input and reproducing a response from a set of pre-programmed replies built around those keywords. For instance, if a human says, "My mother cooks good food", ELIZA would pick up the word "mother" and respond with an open-ended question: "Tell me more about your family". This created an illusion of understanding and of interacting with a real human being, even though the process was entirely mechanized.
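The keyword-matching mechanism described above can be sketched in a few lines of Python. This is a toy illustration, not Weizenbaum's original program; the rule table is a hypothetical sample.

```python
# A toy ELIZA-style responder (illustrative sketch, not Weizenbaum's original).
# It scans the input for a known keyword and returns the matching canned,
# open-ended reply; the rule table below is a hypothetical sample.

RULES = {
    "mother": "Tell me more about your family.",
    "father": "Tell me more about your family.",
    "sad": "Why do you feel sad?",
    "dream": "What does that dream suggest to you?",
}
DEFAULT = "Please go on."

def respond(sentence: str) -> str:
    """Return the reply for the first recognized keyword, else a default."""
    for word in sentence.lower().split():
        reply = RULES.get(word.strip(".,!?'\""))
        if reply:
            return reply
    return DEFAULT

print(respond("My mother cooks good food"))  # Tell me more about your family.
```

With no keyword match, the responder falls back to a generic prompt ("Please go on."), which is exactly how ELIZA kept the conversation moving.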

 
WABOT
Robot - Wabot using Artificial Intelligence

First Robot – WABOT

This robot had hands and limbs that could extend and grab objects, as well as legs that allowed it to walk in a rudimentary fashion. WABOT-1 also had semi-functional ears, eyes, and a mouth. The robot used these sensory devices to communicate with a person in Japanese and to estimate distances. Experts estimated that WABOT-1 had the mental faculty of a one-and-a-half-year-old child.

Computer beats the World Champion
World History - AI defeats human chess and becomes a champion

AI defeats the human opponent to become world chess champion.

On May 11, 1997, an IBM computer called Deep Blue beat the world chess champion after a six-game match: two wins for Deep Blue, one for the champion, and three draws. The match lasted several days and attracted massive media coverage around the world – the classic plot line of man vs. machine. It pushed forward the ability of computers to handle the complex calculations needed to discover new medical drugs; perform broad financial modeling, trend identification, and risk analysis; handle large database searches; and carry out the massive computations required in many fields of science.

This experiment formed the base for the parallel computing and Artificial Intelligence technologies that followed.

 
Roomba – vacuum cleaner
First Vacuum Cleaner using Artificial Intelligence

First AI Vacuum Cleaner – Roomba

The battery-operated Roomba rolls on wheels and responds to its environment with the help of sensors and onboard computer processing. When it bumps into an obstacle or detects an infrared beam marking a boundary line, the robot changes direction randomly.
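That reactive strategy can be sketched as a tiny function: keep the current heading until a bump or boundary beam is sensed, then turn by a random angle. The function name and the 90-270 degree turn range are assumptions for illustration.

```python
import random

# Toy sketch of the reactive strategy described above: keep the current
# heading until a bump or boundary beam is sensed, then turn by a random
# angle. The 90-270 degree turn range is an assumption for illustration.

def next_heading(heading_deg, bumped, beam_detected, rng):
    """Return the new heading in degrees; unchanged unless a sensor fired."""
    if bumped or beam_detected:
        return (heading_deg + rng.uniform(90, 270)) % 360
    return heading_deg

rng = random.Random(0)
print(next_heading(45.0, bumped=False, beam_detected=False, rng=rng))  # 45.0
print(next_heading(45.0, bumped=True, beam_detected=False, rng=rng))
```

This random-walk behaviour is why early Roombas eventually cover a room without ever building a map of it.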

 
Duplex
Google Duplex - AI assistant booking an appointment for haircut at a salon

Google Duplex

Duplex is a technology that sounds natural on the phone, making the customer experience comfortable. It can place a call and arrange a salon appointment for the user – without the receptionist ever realizing that s/he is having a conversation with a machine.

The peak of AI is yet to come…

What are you waiting for? It is time to make your own AI model now – Click here

Author:

Blog

What is Artificial Intelligence

How many times have you asked yourself: what is artificial intelligence? Let's begin with a very simple analogy to understand it.

Working with AI is much like parenting a child. Consider a baby girl born to a couple. When she is born, she knows nothing. Her parents teach her: "Dear, this is 'a', 'b', 'c'. This is a fruit. You should stop when the signal shows red. When there is an obstacle in front of you, change your direction or move the obstacle."

Artificial Intelligence is like parents teaching a child.

AI and robots are ancient Greek concepts – See here

Over time, she learns new things – sometimes with her parents' help, sometimes on her own (machine learning). She begins anticipating and completing their statements based on past experiences. She knows she will be punished for doing something wrong.

A computer system/device is the child, and we humans are its parents. We teach the skills of human beings to machines. We train the machines and give them their artificial intelligence.

Now, machines can identify things, learn, predict and mimic humans. These thinking machines have Artificial Intelligence. I think now you are ready for the technical definition of artificial intelligence.

Artificial Intelligence (AI) is the ability of computers to do things that are considered attributes of intelligence i.e. computers possess the ability to process language, understand pictures, perceive the environment, learn from the past, reason, etc.

Categories of AI

 

Weak AI

 

Narrow AI is also known as Weak AI. Weak AI can only perform specific tasks rather than possess full cognitive abilities like humans. Only Narrow AI exists today.

Some examples are – digital voice assistants, recommendation engines, search engines, chatbots, etc.

Artificial Intelligence used in chatbot

Strong AI

 

Strong AI is also known as Artificial General Intelligence (AGI). It is currently only theoretical. Strong AI means machines will have minds of their own: they will not need programming inputs, and they will be able to complete any task using their own decision-making skills and full human-like cognitive abilities. Such a machine could feel, think, reason, remember, and act on its own – behaving like a human. It would solve problems in a generalized way using its intelligence, instead of performing only specific tasks.

Super AI

 

There are no clear examples of Strong AI yet, but the field is innovating rapidly. A further theoretical stage has been proposed, known as artificial superintelligence (ASI), superintelligence, or Super AI. This type would surpass even Strong AI in intelligence and ability. However, Super AI remains purely speculative, since we have yet to achieve Strong AI.

To create your own AI models – Click here

Author:

 

 

In-House AI models

Model Name – Satellite Aeroplane Detection

What does the model detect?
This model detects aeroplanes on the ground from satellite imagery.

What is the use of this model?
This model could serve in military applications and for better surveillance of airports.

Approach to creating a model in Vredefort

Step 1 – Dataset Collection
We collected satellite images of aeroplanes from the Google Earth website. In Google Earth, we kept a 2D view at a fixed height (80 m). We selected the 100 busiest airports and collected 1463 images of aeroplanes. There is only one class – Aeroplane.

Step 2 – Data Cleaning
After collecting the dataset, we uploaded it on Vredefort. Vredefort automatically cleans the data by removing the corrupt images and resizing them to a suitable resolution.

Step 3 – Data Annotation
The computer learns to detect objects from images through a process of labeling. Thus, we drew boxes around the concerned objects and labeled them as Aeroplane (only one object to detect).
We annotated 1463 images using the inbuilt Vredefort tool.

Annotation Rules – (Keep them in mind for better detection)
⦁ Skip the object if it is in motion or blurred.
⦁ Precisely draw the bounding box around the object.
⦁ Bounding boxes should not be too large.
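Annotations like these are typically stored as one label line per box. Vredefort's internal label format is not documented in this post, so as an illustration here is the widely used YOLO convention: class id, then box centre and size normalised to [0, 1] by the image width and height.

```python
# Hypothetical illustration of how one bounding-box annotation might be
# stored. Vredefort's internal format is not documented in this post, so
# this uses the common YOLO convention: class id, then box centre and size
# normalised to [0, 1] by the image dimensions.

def to_yolo(box, img_w, img_h, class_id=0):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    xc = (x_min + x_max) / 2 / img_w
    yc = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {xc:.4f} {yc:.4f} {w:.4f} {h:.4f}"

# One aeroplane box in a 1280x720 frame:
print(to_yolo((400, 200, 560, 320), 1280, 720))  # 0 0.3750 0.3611 0.1250 0.1667
```

Normalised coordinates keep the labels valid even after images are resized during data cleaning, which is one reason this convention is popular.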

[Optional] Step 4 – Tuning Parameters
If you register as a developer and developer mode is on, you can modify the number of epochs, batch size per GPU, neural network model, etc. If no inputs are given, the default settings are used.

Step 5 – Training
The training process takes place automatically with a single click.

Evaluation of the model
After training, we can evaluate the model.
In evaluation, there are two parts: the first is accuracy, and the second is playing the inference video. Vredefort reports both total model accuracy and class-wise accuracy. In this case, only one class is present. We achieved 76% model accuracy.

A new video for inference
We recorded a video with SimpleScreenRecorder using the same 2D view and fixed height, and used that video to check the inference. If developer mode is on, it will ask you to set a confidence threshold. You can set it as per your convenience; here we set it to 0.1 (10%).
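The confidence value acts as a simple cut-off on detection scores: anything scoring below it is discarded before display. A minimal sketch (the detection dictionaries are hypothetical examples, not Vredefort's actual output format):

```python
# Sketch of what the confidence setting does at inference time: detections
# scoring below the threshold are simply discarded. The detection dicts
# here are hypothetical examples, not Vredefort's actual output format.

def filter_detections(detections, confidence=0.1):
    """Keep only detections whose score meets the threshold."""
    return [d for d in detections if d["score"] >= confidence]

raw = [
    {"label": "aeroplane", "score": 0.92},
    {"label": "aeroplane", "score": 0.40},
    {"label": "aeroplane", "score": 0.05},  # dropped at 0.1 confidence
]
print(len(filter_detections(raw, confidence=0.1)))  # 2
```

A low threshold like 0.1 shows more (possibly spurious) boxes; raising it trades recall for precision.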

Model download and transfer learning from unpruned model
Vredefort provides one more feature beyond accuracy evaluation: it allows you to download the model and dataset for further applications (like adding logic to your model). Once you have the downloaded model files, you can use the unpruned model (click here to know more about the unpruned model) on different datasets and save training time through transfer learning. You can also generate alerts and write use-cases with that model.

Any challenges faced
None

Limitations
⦁ The model will work best on satellite imagery captured from a height of 80 m.
⦁ The model is trained on satellite imagery and hence will work best on those images or video feeds.
⦁ It will struggle to detect aeroplanes from other sources such as mobile camera videos.

Improvements
More datasets can be collected to detect aeroplanes from different heights and sources to improve the model accuracy.

Model Details

Model Name – Satellite Aeroplane Detection
Dataset Images – 1463
Number of Labels – 1
Label name and count – aeroplanes (6605)
Accuracy – 76%

Download Links

Dataset Download – Download here

Model Download Link – Download here

Inference Video Link – Download here

Author:

In-House AI models

Model Name – Car parking occupancy Detection

What does the model detect?
This model detects occupancy for car parking in images/videos.

What is the use of this model?
Techniques for car parking occupancy detection are significant for managing parking lots. Knowing the real-time availability of free parking spaces and communicating it to users helps reduce queues, improve scalability, and minimize the time needed to find a spot. In many parking lots, ground sensors determine the status of each space, but these require expensive installation and maintenance in every parking space, especially in larger lots. This model can be used to overcome that problem.

Approach to creating a model in Vredefort

Step 1 – Dataset Collection
We collected 1497 images from an open-source dataset. These images were captured by a camera mounted on a building rooftop facing a car parking lot. We then split the dataset into train and test sets. There were two classes – Car and Vacant.
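The train/test split step might look like the sketch below. The 80/20 ratio and the fixed shuffle seed are assumptions; the post does not say which split Vredefort uses.

```python
import random

# Sketch of the train/test split step. The 80/20 ratio and the fixed
# shuffle seed are assumptions; the post does not state which split
# Vredefort uses internally.

def split_dataset(filenames, train_frac=0.8, seed=42):
    """Shuffle the file list reproducibly and cut it into train/test parts."""
    files = list(filenames)
    random.Random(seed).shuffle(files)
    cut = int(len(files) * train_frac)
    return files[:cut], files[cut:]

images = [f"img_{i:04d}.jpg" for i in range(1497)]
train, test = split_dataset(images)
print(len(train), len(test))  # 1197 300
```

Shuffling before cutting matters here: consecutive frames from a fixed camera are highly similar, so an unshuffled split would make the test set misleadingly easy or hard.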

Step 2 – Data Cleaning
After collecting the dataset, we uploaded it on Vredefort. Vredefort automatically cleans the data by removing the corrupt images and resizing them to a suitable resolution.

Step 3 – Data Annotation
The computer learns to detect objects from images through a process of labeling. Thus, we drew boxes around the concerned objects and labeled them as car and vacant accordingly. We annotated 1497 images using the inbuilt Vredefort tool.
Annotation Rules – (Keep them in mind for better detection)
    ⦁ Skip the object if it is in motion or blurred.
    ⦁ Precisely draw the bounding box around the object.
    ⦁ Bounding boxes should not be too large.

[Optional] Step 4 – Tuning Parameters
If you register as a developer and developer mode is on, you can modify the number of epochs, batch size per GPU, neural network model, etc. If no inputs are given, the default settings are used.

Step 5 – Training
The training process takes place automatically with a single click.

Evaluation of the model
After training, we can evaluate the model.
In evaluation, there are two parts: the first is accuracy, and the second is playing the inference video. Vredefort reports both total model accuracy and class-wise accuracy. In this case, two classes are present. We achieved 40% model accuracy; individual class accuracy is 64% for car and 16% for vacant.
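The overall figure can be related to the class-wise ones: the unweighted mean of the two per-class accuracies reproduces the 40% reported above. (That Vredefort averages this way is an assumption; it is shown only as a plausible reading of the numbers.)

```python
# Sketch relating the reported overall figure to the class-wise ones: the
# unweighted mean of the per-class accuracies reproduces the 40% above.
# That Vredefort averages this way is an assumption.

def mean_class_accuracy(class_accuracies):
    """Average the per-class accuracy values."""
    return sum(class_accuracies.values()) / len(class_accuracies)

class_acc = {"car": 0.64, "vacant": 0.16}
print(f"{mean_class_accuracy(class_acc):.0%}")  # 40%
```

Note that an unweighted mean hides how much worse the vacant class performs; checking the per-class numbers, as done above, is what reveals the imbalance.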

A new video for inference
We made a video from the test dataset images and used it for inference. If developer mode is on, it will ask you to set a confidence threshold. You can set it as per your convenience; here we set it to 0.1 (10%).

Model download and transfer learning from unpruned model
Vredefort provides one more feature beyond accuracy evaluation: it allows you to download the model and dataset for further applications (like adding logic to your model). Once you have the downloaded model files, you can use the unpruned model (click here to know more about the unpruned model) on different datasets and save training time through transfer learning. You can also generate alerts and write use-cases with that model.

Any challenges faced
The model was not working well because the white parking lines were not clearly visible. The vacant spots were not detected precisely, so we annotated more images to reduce the errors.

Limitations
     ⦁ The model is trained on an outdoor camera and hence will work best on those images or video feed.
     ⦁ It will struggle to detect parking spots if the white lines are not visible.

Improvements
For more accuracy, collect the dataset from different angles, including complex environments, and balance the dataset across classes by reducing the mismatch in the number of annotations per class. A balanced dataset avoids the class-imbalance problems seen here.
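A quick balance check before training can reveal such a mismatch. The helper below is illustrative; the counts are the ones reported for this model.

```python
from collections import Counter

# Sketch of a quick label-balance check before training; the helper is
# illustrative. The annotation counts below are the ones reported for
# this model (Car: 7740, Vacant: 8745).

def class_balance(labels):
    """Return each class's share of the total annotation count."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}

labels = ["car"] * 7740 + ["vacant"] * 8745
for cls, share in class_balance(labels).items():
    print(f"{cls}: {share:.1%}")
```

Here the raw counts are actually close to balanced (about 47% vs 53%), which suggests the vacant class's poor accuracy stems from hard-to-see parking lines rather than from a shortage of labels.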

Model Details
Model Name – Car parking occupancy Detection
Dataset Images – 1497
Number of Labels – 2
Label name and count – Car (7740), Vacant (8745)
Accuracy – 40%
Class Accuracy – Car (64%), Vacant (16%)

Download Links

Dataset Download –  Download here

Model Download Link – Download here 

Inference Video Link – Download here 

Author:

In-House AI models

Model Name – Satellite Ship Detection

What does the model detect?
This model detects ships in the sea from satellite imagery.

What is the use of this model?
This model empowers government institutions to carry out stricter and finer maritime security surveillance. It helps manage marine traffic at busy ports, and the detections enable the concerned authorities to take quick decisions and reduce pirate threats.

Approach to creating a model in Vredefort

Step 1 – Dataset Collection
We collected satellite images of ships from the Google Earth website. In Google Earth, we kept a 2D view at a fixed height – 100 m for big ships and 60 m for small ships. We selected the 50 busiest ports and collected 645 ship images. There is only one class – Ship.

Step 2 – Data Cleaning
After collecting the dataset, we uploaded it on Vredefort. Vredefort automatically cleans the data by removing the corrupt images and resizing them to a suitable resolution.

Step 3 – Data Annotation
The computer learns to detect objects from images through a process of labeling. Thus, we drew boxes around the concerned objects and labeled them as ship (only one object to detect).
We annotated 645 images using the inbuilt Vredefort tool.

Annotation Rules – (Keep them in mind for better detection)
    ⦁ Skip the object if it is in motion or blurred.
    ⦁ Precisely draw the bounding box around the object.
    ⦁ Bounding boxes should not be too large.

[Optional] Step 4 – Tuning Parameters
If you register as a developer and developer mode is on, you can modify the number of epochs, batch size per GPU, neural network model, etc. If no inputs are given, the default settings are used.

Step 5 – Training
The training process takes place automatically with a single click.

Evaluation of the model
After training, we can evaluate the model.
In evaluation, there are two parts: the first is accuracy, and the second is playing the inference video. Vredefort reports both total model accuracy and class-wise accuracy. In this case, only one class is present. We achieved 55% model accuracy.

A new video for inference
We recorded a video of the 2D view at fixed height using SimpleScreenRecorder to check the inference. If developer mode is on, it will ask you to set a confidence threshold. You can set it as per your convenience; here we set it to 0.1 (10%).

Model download and transfer learning from unpruned model
Vredefort provides one more feature beyond accuracy evaluation: it allows you to download the model and dataset for further applications (like adding logic to your model). Once you have the downloaded model files, you can use the unpruned model (click here to know more about the unpruned model) on different datasets and save training time through transfer learning. You can also generate alerts and write use-cases with that model.

Any challenges faced
Collecting the images was challenging due to security reasons at certain ports.

Limitations
    ⦁ The model will work best on satellite imagery captured from a height of 100 m for big ships and 60 m for small ships.
    ⦁ The model is trained on satellite imagery and hence will work best on those images or video feeds.
    ⦁ It will struggle to detect ships from other sources such as mobile camera videos.

Improvements
More data can be collected from different heights and varied sources to detect ships and improve the model accuracy.

Model Details
Model Name – Satellite Ship Detection
Dataset Images – 645
Number of Labels – 1
Label name and count – ship (1399)
Accuracy – 55%

Download Links

Dataset Download – Download here

Model Download Link – Download here

Inference Video Link – Download here

Author: