Machine Learning: Definition, Types, Applications
The term “machine learning” has been in wide use for a long time, yet many people still don’t know what it means. In this post, I will introduce the topic of machine learning and explain its four main types.
Put simply, machine learning means that a machine learns from data or experience how to solve a problem, instead of following rules written out for every case.
In 1959, Arthur Samuel published his work on a checkers-playing program that eventually played better than Samuel himself, who was not a strong player. He gave the machine the rules of the game and then let it play, gain experience, and learn on its own. The machine learned how each move affected the ones that followed, and it soon became very good at the game.
Arthur Samuel is commonly credited with defining machine learning as the field of study that gives computers the ability to learn without being explicitly programmed.
Here are the four types into which machine learning is classified.
Supervised Machine Learning
Before diving into the details of supervised machine learning, here is a brief overview. In supervised machine learning, you give the system the expected output along with each input, which means the machine already knows the correct answers for its training examples before it starts learning.
A basic example of this concept is a student learning a course from an instructor. Here, the student knows what he or she is learning from the course.
Supervised learning is the type of system in which both the input and the output data are provided, and the data is labeled. The term “supervised learning” comes from the idea that an algorithm learns from a training data set, which plays a role similar to a teacher.
In supervised machine learning, both the raw input data and its labels are available for the machine to learn from, which makes it easier for the machine to learn concepts.
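To make the idea of learning from labeled data concrete, here is a minimal sketch of one very simple supervised method, a nearest-neighbor classifier. The feature values, the labels, and the `predict` helper are all made up for illustration; real projects would normally rely on an established library.

```python
import math

# Labeled training data: (features, label) pairs.
# The numbers and labels are invented for this illustration.
training_data = [
    ((1.0, 1.2), "cat"),
    ((0.8, 1.0), "cat"),
    ((3.0, 3.2), "dog"),
    ((3.3, 2.9), "dog"),
]

def predict(features, examples):
    """Return the label of the labeled example closest to `features`."""
    _, nearest_label = min(
        examples, key=lambda example: math.dist(example[0], features)
    )
    return nearest_label

print(predict((0.9, 1.1), training_data))  # cat
print(predict((3.1, 3.0), training_data))  # dog
```

The "learning" here is nothing more than remembering the labeled examples, but it shows the key point: every prediction is derived from outputs that were supplied in advance.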
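Some of the advantages of supervised machine learning are:
- It can optimize performance criteria using experience. In other words, accuracy can be very high when enough good labeled data is available.
- We already know what kind of output to expect.
- It can solve various types of real-world computation problems.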
Some of the disadvantages of supervised machine learning are:
- It can struggle with complex tasks, because we have to provide both the algorithm and enough labeled data.
- If the set of classes is very broad, classifying the data becomes a real challenge.
- It cannot cluster or classify data on its own, which means it cannot learn without labels.
- It cannot give you information beyond the labels it was trained on.
Supervised machine learning has various real-world applications, such as classification problems, people analytics, time-series forecasting, marketing, and sales.
Let’s take marketing and sales as an example.
Digital marketing and online-driven sales are among the first application fields that come to mind for machine learning adoption. People interact with the web and leave pieces of information behind, and we can analyze this information.
If you’ve been tracking most of your customers and accurately documenting their in-funnel and later purchase behavior, you have enough data to predict early which customers are the most promising and to target sales efforts toward them. This kind of prediction is often called lifetime value prediction.
The churn rate is the share of customers who stop completing target actions (e.g. adding to cart, leaving a comment, checking out) during a given period. As with lifetime value predictions, separating “likely-to-churn-soon” customers from engaged ones will allow you to:
1) Analyze the reasons for such behavior,
2) Refocus and personalize offerings for different groups of churning customers.
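As a rough illustration, the churn rate for one period can be computed by comparing who was active at the start with who is still active at the end. This is only a sketch: the `churn_rate` function and the customer names are hypothetical, and what counts as "active" depends on the target actions you choose to track.

```python
def churn_rate(active_at_start, active_at_end):
    """Share of customers active at the start of a period who are gone by its end."""
    if not active_at_start:
        raise ValueError("need at least one customer at the start of the period")
    churned = active_at_start - active_at_end  # set difference
    return len(churned) / len(active_at_start)

# Hypothetical customer IDs; "eli" joined during the period and is not counted.
start = {"ana", "ben", "cho", "dev"}
end = {"ana", "cho", "eli"}

print(churn_rate(start, end))  # 0.5
```

A supervised model for churn prediction would go one step further and use past behavior as input, with "churned" or "stayed" as the label.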
Recommendation sections are something we can’t imagine modern eCommerce or media without. The common practice is to recommend other popular products or the ones you want to sell the most, which doesn’t require machine learning algorithms at all. But if you want to engage customers with deep personalization, you can apply machine learning techniques to identify the products a given customer is most likely to buy next and put them at the top of the recommendation list. Netflix, YouTube, and other video streaming services operate in a similar way, tailoring their recommendations to a viewer’s behavior over time.
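One simple way to personalize recommendations, shown here only as a sketch, is to look at what customers with similar purchase histories have bought. The `purchase_history` data and the `recommend` function are hypothetical, and production recommenders use far richer signals and models than this overlap count.

```python
from collections import Counter

# Hypothetical purchase histories: customer -> set of products bought.
purchase_history = {
    "ana": {"laptop", "mouse", "keyboard"},
    "ben": {"laptop", "mouse"},
    "cho": {"laptop", "monitor"},
    "dev": {"phone", "case"},
}

def recommend(customer, history, top_n=2):
    """Rank products the customer lacks by how similar their buyers are to the customer."""
    owned = history[customer]
    scores = Counter()
    for other, items in history.items():
        if other == customer:
            continue
        overlap = len(owned & items)  # number of shared purchases
        if overlap == 0:
            continue  # ignore customers with nothing in common
        for item in items - owned:
            scores[item] += overlap
    return [item for item, _ in scores.most_common(top_n)]

print(recommend("ben", purchase_history))  # ['keyboard', 'monitor']
```

Here "ben" shares two purchases with "ana" and one with "cho", so the keyboard ranks above the monitor.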
Unsupervised Machine Learning
Unsupervised machine learning is a machine learning technique in which you do not need to supervise the model. The machine is trained on information that is neither classified nor labeled, and the algorithm acts on that information without guidance. The task of the machine is to group unsorted information according to similarities, patterns, and differences. Unlike in supervised machine learning, no labeled examples are given to the machine, so it has to find the hidden structure in the unlabeled data by itself. For instance, if we give the machine a set of pictures of dogs and cats, it has no prior information about what a dog or a cat is, but it can still pick out distinguishing features and sort the pictures into groups.
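Some of the advantages of unsupervised machine learning are:
- The machine can determine different features of the input.
- The machine needs no prior knowledge.
- It makes it easier to learn from larger and more complex data sets.
- It can be a fast process, since the data does not have to be labeled first.
Some of the disadvantages of unsupervised machine learning are:
- The output is not always what is required.
- The accuracy of the output is usually lower than with supervised learning.
- The output is difficult to interpret.
- The output is unlabeled.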
Clustering is the assignment of a set of observations into subsets (known as clusters) so that observations in the same cluster are similar in some sense. It is a method of unsupervised learning and a common technique for statistical data analysis used in many fields.
K-means clustering is an algorithm that groups objects, based on their attributes or features, into K groups, where K is a positive integer.
The grouping happens by minimizing the sum of squared distances between each data point and its cluster centroid. Thus the purpose of K-means clustering is to group the data.
If the number of data points is less than the number of clusters, we assign each data point as the centroid of a cluster. Each centroid gets a cluster number.
If the number of data points is greater than the number of clusters, then for each data point we calculate the distance to every centroid and find the minimum. The data point belongs to the cluster whose centroid is nearest to it. Each centroid is then recomputed as the mean of the points assigned to it, and the two steps repeat until the assignments stop changing.
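The following is a minimal sketch of the procedure described above: assign each point to its nearest centroid, then apply the usual K-means update step, in which each centroid is recomputed as the mean of its points, and repeat until the assignments stop changing. Taking the first K points as starting centroids is an assumption made to keep the example deterministic; practical implementations usually pick starting centroids more carefully.

```python
import math

def k_means(points, k, max_iterations=100):
    """Group `points` (tuples of numbers) into k clusters.

    Returns (centroids, assignments), where assignments[i] is the
    cluster number of points[i].
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    if len(points) <= k:
        # Fewer points than clusters (or exactly as many):
        # each point becomes the centroid of its own cluster.
        return list(points), list(range(len(points)))

    centroids = list(points[:k])  # assumption: first k points as starting centroids
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to the nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # nothing moved, so the grouping is stable
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, assignments

data = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, assignments = k_means(data, k=2)
print(assignments)  # [0, 0, 1, 1, 0, 1]
```

The cluster numbers carry no meaning of their own: the algorithm separates the points near (1, 1) from the points near (8.5, 9), but it is up to a person to decide what each group represents.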
Semi-Supervised Machine Learning
The most basic disadvantage of any supervised learning algorithm is that the data set has to be hand-labeled, typically by a machine learning engineer or a data scientist. Semi-supervised learning comes into the picture to overcome this problem.
Semi-supervised learning is a combination of supervised and unsupervised machine learning. It tackles a supervised learning problem using labeled data augmented by unlabeled data; the number of unlabeled or partially labeled samples is often larger than the number of labeled samples. One may picture the three types of learning as follows: supervised learning is a student under the supervision of a teacher both at home and at school; unsupervised learning is a student who has to figure out a concept alone; and semi-supervised learning is a teacher who teaches a few concepts in class and then gives homework questions based on similar concepts.
Some of the advantages of semi-supervised machine learning are:
- Affordable: acquiring unlabeled data is relatively cheap, while labeling that data is very expensive.
- Easier to obtain: unlabeled data is easy to obtain compared to labeled data.
Some applications of semi-supervised machine learning are:
- Medical applications: labeling requires experts’ opinions, which may differ from one expert to another.
- Speech analysis: labeling audio files is a very labor-intensive task, so semi-supervised learning is a natural approach to this problem.
Some of the disadvantages of semi-supervised machine learning are:
- The quality of the results can be low.
- Lower accuracy: basic methods are very simple to implement, but the accuracy gain they give is limited (up to 5-10%).
- Old data may mislead: if learning runs in production, be aware that data patterns change, and old assumptions about labels may lead to wrong labels on new unlabeled data.
- The algorithm may misbehave: while running semi-supervised learning in a production environment, one must keep an eye on the algorithm.
Face recognition is one of the most successful applications of semi-supervised learning. Traditional supervised learning methods require a large number of labeled face images to achieve good performance. One proposed semi-supervised face recognition method integrates semi-supervised linear discriminant analysis (SDA) and affinity propagation (AP) into a self-training framework. SDA is used to compute the face subspace from both labeled and unlabeled images, while AP identifies the exemplars of the different face classes in that subspace.
The unlabeled data can then be classified according to the exemplars, and the newly labeled data with the highest confidence is added to the labeled data. The whole procedure iterates until convergence. The authors of the method report experiments on four face data sets in which it outperforms the other unsupervised, semi-supervised, and supervised methods they compared against.
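The self-training loop at the heart of this approach can be sketched in a few lines. Note that this is not the SDA and AP method described above: as a stand-in, it uses a plain nearest-centroid classifier on 2-D points, and it measures confidence as the gap between the distances to the two nearest class centroids. The function names, the `min_margin` threshold, and the data are all hypothetical.

```python
import math

def class_centroids(labeled):
    """Mean point of each class in a list of (point, label) pairs."""
    groups = {}
    for point, label in labeled:
        groups.setdefault(label, []).append(point)
    return {
        label: tuple(sum(axis) / len(members) for axis in zip(*members))
        for label, members in groups.items()
    }

def self_train(labeled, unlabeled, min_margin=1.0):
    """Repeatedly label the most confident unlabeled point and retrain.

    Assumes at least two classes are present in `labeled`.
    Stops when no points are left or none is confident enough.
    """
    labeled = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        centroids = class_centroids(labeled)  # "retrain" on the current labels
        best = None  # (margin, point, predicted label)
        for point in remaining:
            ranked = sorted(
                centroids, key=lambda label: math.dist(point, centroids[label])
            )
            nearest, runner_up = ranked[0], ranked[1]
            margin = (math.dist(point, centroids[runner_up])
                      - math.dist(point, centroids[nearest]))
            if best is None or margin > best[0]:
                best = (margin, point, nearest)
        if best[0] < min_margin:
            break  # nothing left that the model is confident about
        labeled.append((best[1], best[2]))
        remaining.remove(best[1])
    return labeled, remaining

seed_labels = [((1.0, 1.0), "a"), ((9.0, 9.0), "b")]
pool = [(1.5, 1.2), (8.5, 9.2), (2.0, 2.5), (5.0, 5.0)]
final_labels, still_unlabeled = self_train(seed_labels, pool)
print(still_unlabeled)  # [(5.0, 5.0)]
```

Starting from only two labeled points, the loop labels three more. The point halfway between the classes stays unlabeled because its margin never reaches the threshold, which is the safeguard against the mislabeling risk mentioned in the disadvantages.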
Reinforcement Machine Learning
Reinforcement learning is an area of machine learning that is all about taking suitable actions to maximize reward in a particular situation. Various software systems and machines employ it to find the best possible behavior or path to take in a specific situation. Reinforcement learning differs from supervised learning in that, in supervised learning, the training data comes with an answer key, so the model is trained on the correct answers, whereas in reinforcement learning there is no answer key. Instead, the reinforcement agent decides what to do to perform the given task and, in the absence of a training data set, is bound to learn from its own experience.
Some of the advantages of reinforcement machine learning are:
- The model can correct errors that occur during the training process. Once the model has corrected an error, the chance of the same error occurring again is much lower.
- It does not need a training data set; the machine learns from its own experience.
- It can produce a model that is well suited to a particular problem. Reinforcement learning is intended to achieve the ideal behavior of a model within a specific context, to maximize its performance.
- Reinforcement learning algorithms maintain a balance between exploration and exploitation. Exploration is the process of trying different things to see if they are better than all the previous tries. Exploitation is the process of repeating the things that have worked best in the past. Supervised and unsupervised learning algorithms do not typically have to strike this balance.
Some of the disadvantages of reinforcement machine learning are:
- Because reinforcement learning is a self-learning process, there is no predefined time period in which a machine learns to perform a particular task.
- Reinforcement learning needs a lot of data and a lot of computation. It is data-hungry.
- Because training is complex and time-consuming, reinforcement learning is not a good choice for solving simple problems.
- The credit assignment problem (CAP) is the problem of determining which actions led to a certain outcome. For example, in football, each player takes an action at each second, such as “pass the ball”, “run”, or “shoot the ball”. At the end of the match, the outcome is a victory, a loss, or a tie. Determining the contribution of each player to the result of the match is the (temporal) credit assignment problem.
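To show reward-driven learning and the exploration and exploitation balance in action, here is a minimal sketch of tabular Q-learning, one common reinforcement learning algorithm. The environment is a made-up corridor of five cells in which the agent starts at the left end and is rewarded only on reaching the right end; the function name and all parameter values are assumptions chosen for illustration.

```python
import random

def train_corridor(n_states=5, episodes=300, alpha=0.5, gamma=0.9,
                   epsilon=0.2, max_steps=100, seed=0):
    """Learn action values for a corridor whose only reward is at the right end."""
    rng = random.Random(seed)  # fixed seed keeps the run repeatable
    goal = n_states - 1
    # q[state][action] estimates future reward; action 0 = left, 1 = right.
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)  # explore (also breaks ties at random)
            else:
                action = 0 if q[state][0] > q[state][1] else 1  # exploit
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # Move the estimate toward the reward plus the discounted
            # value of the best action available in the next state.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if state == goal:
                break
    return q

q_table = train_corridor()
policy = ["right" if right > left else "left" for left, right in q_table[:-1]]
print(policy)  # ['right', 'right', 'right', 'right']
```

No training data set is involved: the agent only ever sees the reward at the end of the corridor. The discounted update passes part of that reward back to earlier moves, which is one simple way of dealing with temporal credit assignment.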
Robotics researchers are testing reinforcement learning as a way to simplify and speed up the programming of robots that do factory work.
Industrial robots are capable of extreme precision and speed, but they normally have to be programmed very carefully in order to do something like grasp an object. This is difficult and time-consuming, and it means that such robots can usually work only in tightly controlled environments. Fanuc, a Japanese company, has a robot that uses reinforcement learning to train itself, over time, to perform a new task. It tries picking up objects while capturing video footage of the process. Each time it succeeds or fails, it remembers how the object looked. This knowledge is used to refine a large neural network that controls its actions. Over the past few years, this learning technique has proved to be a powerful way to speed up various industrial tasks.