Supervised And Unsupervised Learning Introduction

Message from the Writer

Here you will learn about Supervised and Unsupervised Learning. I hope you will enjoy it. If you have any questions, please feel free to ask me in the comment section. I will try to answer your questions as soon as possible. Thank you for reading this article. Have a nice day.

Are you excited to learn about machine learning?

a. Supervised learning

What is supervised learning?

⚠️

In simple terms: Supervised learning is a machine learning technique where the model is trained using labeled data. The model learns from the labeled data and then predicts the output for new data. Labeled data means that each input is already tagged with the correct output.

Supervised learning is the process of providing both input data and the correct output data to the machine learning model. The aim of a supervised learning algorithm is to find a mapping function that maps the input variable (x) to the output variable (y).

In the real world, supervised learning can be used for the following tasks:

Image classification
Speech recognition
Text classification
Medical diagnosis
Stock market prediction
Weather forecasting

How Supervised Learning Works

In supervised learning, models are trained using a labeled dataset, where the input data is already tagged with the correct output. Once the training process is completed, the model is tested using the test dataset, which is data the model has never seen before. The accuracy of the model is measured by comparing the predicted output with the actual output on this test dataset.
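To make the accuracy check concrete, here is a minimal sketch in plain Python. The label lists are made-up values for illustration only, and `accuracy` is a hypothetical helper name, not part of any library.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical test-set labels and model predictions (illustration only).
actual_labels = ["square", "triangle", "rectangle", "triangle"]
predicted_labels = ["square", "triangle", "square", "triangle"]

test_accuracy = accuracy(predicted_labels, actual_labels)
print(test_accuracy)  # 3 of the 4 predictions match the actual labels
```

The same comparison works for any kind of label, as long as the predicted and actual lists are in the same order.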

Suppose we have a dataset of different types of shapes, which includes squares, rectangles, triangles, and polygons. The first step is to train the model on each shape. To train the model, we provide both the input data and the correct output data: the input is the image of the shape, and the correct output is the name of the shape.

If the given shape has four sides, and all the sides are equal, then it will be labeled as a Square.
If the given shape has four sides, and not all the sides are equal, then it will be labeled as a Rectangle.
If the given shape has three sides, then it will be labeled as a Triangle.
If the given shape has six equal sides and six equal angles, then it will be labeled as a Hexagon.
If the given shape has more than four sides and is not a Hexagon, then it will be labeled as a Polygon.
If the given shape has fewer than three sides, then it will be labeled as None.

Once the machine has been trained on all types of shapes, when it finds a new shape, it classifies the shape on the basis of its number of sides and predicts the output.
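The labeling rules above can be written down as a small function. This is only a sketch of how the training labels might be produced by hand, not a trained model: a real supervised model would learn such rules from labeled examples. It assumes a shape is described by a list of side lengths, and because angles are not modeled here, six equal sides stand in for the Hexagon rule. The function name `label_shape` is hypothetical.

```python
def label_shape(side_lengths):
    """Return a label for a shape described by a list of its side lengths."""
    n = len(side_lengths)
    all_equal = len(set(side_lengths)) <= 1
    if n < 3:
        return "None"
    if n == 3:
        return "Triangle"
    if n == 4:
        return "Square" if all_equal else "Rectangle"
    # Assumption: angles are not available, so equal sides alone mark a Hexagon.
    if n == 6 and all_equal:
        return "Hexagon"
    return "Polygon"


for sides in ([2, 2, 2, 2], [2, 4, 2, 4], [3, 4, 5], [1, 1, 1, 1, 1, 1], [1, 2, 1, 2, 1]):
    print(sides, "->", label_shape(sides))
```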

Steps Involved

→ First, determine the type of training dataset
→ Collect the labeled training data
→ Split the dataset into training, validation, and test datasets
→ Determine the suitable algorithm for the model, such as support vector machine, decision tree, etc.
→ Train the model using the training dataset
→ Execute the algorithm on the training dataset. Sometimes we also need a validation set, a subset held out from the training data, to tune the control parameters.
→ Evaluate the accuracy of the model on the test set. If the model predicts the correct outputs, the model is accurate.
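The splitting step can be sketched with the Python standard library. The 60/20/20 proportions, the fixed seed, and the name `split_dataset` are assumptions chosen for illustration; the right proportions depend on how much data is available.

```python
import random


def split_dataset(examples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the examples and split them into train, validation, and test lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # everything that is left over
    return train, val, test


# Hypothetical dataset: ten examples identified by the numbers 0..9.
examples = list(range(10))
train_set, val_set, test_set = split_dataset(examples)
print(len(train_set), len(val_set), len(test_set))
```

Shuffling before splitting helps each subset reflect the whole dataset, and no example ends up in more than one subset.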

Types

1. Regression

Regression algorithms are used when there is a relationship between the input variable and a continuous output variable. They are used to predict continuous values. For example, if we want to predict the price of a house, then we can use regression algorithms, because the output is a continuous value.

Below are some examples of regression algorithms:

→ Linear Regression
→ Regression Trees
→ Non-Linear Regression
→ Bayesian Linear Regression
→ Polynomial Regression
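As a small illustration of the first algorithm in the list, here is a sketch of simple linear regression with one input variable, fitted by ordinary least squares in plain Python. The house sizes and prices are invented numbers that lie exactly on a straight line, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: house size in square metres -> price in thousands.
sizes = [50, 60, 80, 100]
prices = [110, 130, 170, 210]

slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 70 + intercept  # continuous prediction for an unseen size
print(slope, intercept, predicted_price)
```

The prediction for a 70 square metre house is a number on a continuous scale, not a category, which is what distinguishes regression from classification.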

2. Classification

Classification algorithms are used when the output variable is categorical, which means the output belongs to one of a fixed set of classes, such as Yes-No, Male-Female, True-False, etc. For example, if we want to predict whether the customer will buy the product or not, then we can use classification algorithms. The output of the classification algorithm is a discrete value.

Spam filtering is a common example of classification. Below are some examples of classification algorithms:

→ Random Forest
→ Decision Trees
→ Logistic Regression
→ Support Vector Machines
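To show what a discrete output looks like in practice, here is a minimal sketch of logistic regression with one input variable, trained by plain gradient descent. The data (number of store visits and whether the customer bought the product), the learning rate, and the number of epochs are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit weight w and bias b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return the discrete class: 1 (will buy) or 0 (will not buy)."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Hypothetical labeled data: store visits -> bought the product (1) or not (0).
visits = [0, 1, 2, 3, 6, 7, 8, 9]
bought = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic(visits, bought)
print(predict(w, b, 1), predict(w, b, 8))
```

The model computes a probability internally, but the final prediction is one of two classes, which is the discrete output described above.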

Advantages

With the help of supervised learning algorithms, we can predict outcomes for new, unseen data based on past labeled examples.
In supervised learning, we can have an exact idea about the classes of objects and the relationship between the input and output variables.
Supervised learning models help us to solve various real-world problems such as fraud detection, spam filtering, etc.
Many supervised learning algorithms, such as linear regression and decision trees, are relatively easy to implement and understand.
Supervised learning algorithms can be used for both classification and regression problems.

Disadvantages

Supervised learning models can struggle with very complex problems, especially when labeled data is limited.
Supervised learning often predicts poorly if the test dataset is not similar to the training dataset.
Training can require a lot of computation power and time.
In supervised learning, we need enough knowledge about the classes in the dataset to label it and train the model.
Many supervised learning algorithms cannot handle missing values directly, so these usually need to be filled in or removed first.
Noisy data, such as incorrect labels, can reduce the accuracy of supervised learning models.
Some supervised learning algorithms, such as linear regression, are sensitive to outliers in the dataset.

b. Unsupervised Learning

What is Unsupervised learning?

⚠️

In simple terms: In unsupervised learning, models are trained on unlabeled data and allowed to act on that data without supervision.

Unlike supervised learning, unsupervised learning has only input data and no corresponding output data. The goal of unsupervised learning is to find the underlying structure of a dataset, group the data based on similarity, and represent the dataset in a compressed format.

Example: Suppose we provide a dataset of cat and dog images to our unsupervised learning model. The algorithm is never trained on labeled examples, which means it has no prior idea about the features of the dataset. Instead, it identifies the images' features on its own and clusters the image dataset into groups based on the similarities between images.

Why Do We Use It?

Unsupervised learning is used to find the hidden patterns in the data.
Unsupervised learning is similar to how a human learns to think from their own experiences, which is why it is sometimes described as closer to real AI.
Unsupervised learning works on unlabeled and uncategorized data, which is usually easier to obtain than labeled data.
In the real world, we do not always have output data corresponding to the input data, so we need unsupervised learning to deal with such cases.

How Unsupervised Learning Works

The working of unsupervised learning can be understood from the diagram below:

Here we take an unlabeled dataset and train the model on it. First, the model interprets the raw data to find hidden patterns, and then a suitable algorithm is applied, such as k-means clustering or hierarchical clustering. The algorithm then divides the data objects into groups according to the similarities and differences between the objects.
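The grouping step can be illustrated with a minimal k-means sketch in plain Python. The 2-D points are invented for illustration, and the centroids are initialised with the first k points, which is an assumption made to keep the example deterministic; practical implementations usually pick starting centroids more carefully.

```python
def nearest(point, centroids):
    """Return the index of the centroid closest to the point."""
    distances = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return distances.index(min(distances))


def kmeans(points, k, iterations=10):
    """Group points into k clusters; returns (centroids, cluster index per point)."""
    # Assumption: start from the first k points so the result is repeatable.
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if no point was assigned to it
                centroids[i] = tuple(sum(c) / len(cluster) for c in zip(*cluster))
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical unlabeled data: two loose groups of 2-D points.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]

centroids, labels = kmeans(points, k=2)
print(labels)
```

No labels are given to the algorithm. The cluster numbers it returns only say which points belong together, not what each group means.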