Support Vector Machine in Machine Learning

Support Vector Machine (SVM) is a powerful way of classifying data, including higher-dimensional data.

Before diving into SVM, let us look at our agenda :

(i) Introduction to Machine Learning

(ii) What is SVM?

(iii) What are SVM Kernels?

(iv) SVM Use Cases

(v) Real Life Example

(i) Introduction to Machine Learning

Machine Learning is the process of feeding a machine enough data to train it & predict a possible outcome using different algorithms. In general, the more good-quality data we feed to the machine, the more accurate it tends to become.

Basically, there are 3 types of Machine Learning.

(i) Supervised Learning

(ii) Unsupervised Learning

(iii) Reinforcement Learning

(i) Supervised Learning :

In Supervised Learning, the machine is trained on labeled data, i.e. every example comes with the correct answer, so the outcome is overseen in a controlled way. In short, in supervised learning the machine learns what users want it to learn.

(ii) Unsupervised Learning :

In this type, the machine simply explores the data which is given to it. The data is unlabeled & uncategorized, so the machine finds possible patterns & groupings without any supervision.

(iii) Reinforcement Learning :

It basically means reinforcing a pattern of behavior: the machine tries actions, gets rewards or penalties as feedback & gradually establishes a systematic approach that earns more reward.

So these were the types of Machine Learning. Now let’s move on to the next topic.

(ii) What is the Support Vector Machine (SVM)?

The SVM was first introduced in the 1960s & later refined in the 1990s.

An SVM is a supervised Machine Learning classification algorithm which has become extremely popular because of its strong results.

An SVM is implemented in a slightly different way than other machine learning algorithms.

It is capable of performing classification, regression & outlier detection as well.

Now let's understand SVM with an example :

We have a labeled data set which contains multiple Quadrilaterals & Circles.

Now we are going to provide this data for training because we want to build a model.

Now our model will get trained on this data set. After training, our model goes into the testing phase, where it predicts the output for the test data which we provide, so we can check how accurate it is.

So now we understand the concept of providing data to a model & getting output.

Now let us see this in a graphical representation.

So as you can see in the graph, we have 2 classes: one of Quadrilaterals & the second of Circles. Now we have to separate them. The simplest way of separating 2 things is to draw a line between them. The line which we draw between the 2 classes is known as the Decision boundary, also known as the Hyperplane (in 2D a hyperplane is just a line; in 3D it is a plane).

Why is the Hyperplane called a Decision boundary?

- Because it is a boundary between the 2 classes which decides whether the test data we provide belongs to the circle class or the quadrilateral class.
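To make the "deciding" part concrete, here is a minimal sketch of how a hyperplane labels a point. The weights w & the bias b below are hypothetical values picked only for illustration; a trained SVM would learn them from the data.

```python
# Hypothetical hyperplane parameters, chosen only for illustration.
w = (1.0, -1.0)   # normal vector of the hyperplane
b = 0.0           # offset of the hyperplane

def decision_value(point):
    """Signed score w.x + b; its sign tells which side of the hyperplane the point is on."""
    return w[0] * point[0] + w[1] * point[1] + b

def predict(point):
    """Label a point by the side of the decision boundary it falls on."""
    return "circle" if decision_value(point) >= 0 else "quadrilateral"

print(predict((3.0, 1.0)))   # score is 3 - 1 = 2, so this is a circle
print(predict((1.0, 4.0)))   # score is 1 - 4 = -3, so this is a quadrilateral
```

Which class gets the positive side is just a convention; what matters is that the sign of the score decides the class.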

Now you must be thinking: why does that line have a specific location & angle? Why isn't its angle different? Like any of these :

Before diving into that answer, let’s discuss something more basic, i.e.

(i) Linearly Separable Data

(ii) Non-Linearly Separable Data

(i) Linearly Separable Data :

So we have a graph which has 2 data classes, which are Circles & Quadrilaterals. Now we have to separate those classes, so we draw a decision boundary. Classes which can be separated into 2 parts by such a straight decision boundary are known as Linearly Separable Classes.

Data whose classes can be divided by just a single straight line is known as Linearly Separable Data. On Linearly Separable Data, we can apply a Linear Support Vector Machine.
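Here is a small sketch of a Linear Support Vector Machine on linearly separable data, using scikit-learn's SVC with a linear kernel. The 6 points are hand-made toy data (an assumption for illustration, not a real data set).

```python
from sklearn import svm

# Hand-made toy points: class 0 sits bottom-left, class 1 sits top-right.
X = [[1, 1], [1, 2], [2, 1], [5, 5], [5, 6], [6, 5]]
y = [0, 0, 0, 1, 1, 1]

linear_clf = svm.SVC(kernel="linear")
linear_clf.fit(X, y)

# For the linear kernel, coef_ & intercept_ describe the learned line w.x + b = 0.
print("w =", linear_clf.coef_[0], "b =", linear_clf.intercept_[0])

# Points far on either side of the line should land in the matching class.
print(linear_clf.predict([[0, 0], [7, 7]]))
```

Because a single straight line can split these 2 groups, the linear kernel is enough here.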

(ii) Non-Linear Separable Data :

A given sample of data that cannot be divided by a single straight line is known as Non-Linear Separable Data.

Here we can see that a straight line misclassifies many points, so the model's accuracy will be poor, possibly close to guessing. That’s why we say the data has a non-linear nature.

On such type of data, we apply something known as a Non-Linear Support Vector Machine.

What is a Non-Linear Support Vector Machine?

A Non-Linear Support Vector Machine is used when the data set cannot be divided into 2 classes by simply drawing a straight line.

Here, for example, we took a data set which we had plotted in 2D.

Now as we know, we cannot divide that data set into 2 classes with a straight line. So to make it possible we will use Kernel Functions.
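To see the difference in practice, here is a rough sketch that compares a linear kernel with an RBF kernel on data that is not linearly separable. It assumes scikit-learn's make_circles generator (one ring of points inside another) as a stand-in data set; the parameter values are arbitrary choices for illustration.

```python
from sklearn import svm
from sklearn.datasets import make_circles

# Synthetic data: a small ring of points inside a bigger ring (illustrative stand-in).
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# Fit both models on the same data & measure accuracy on that same data.
linear_acc = svm.SVC(kernel="linear").fit(X, y).score(X, y)
rbf_acc = svm.SVC(kernel="rbf").fit(X, y).score(X, y)

print("Linear kernel accuracy :", linear_acc)
print("RBF kernel accuracy :", rbf_acc)
```

No straight line can split one ring from the other, so the linear kernel struggles, while the RBF kernel should separate the rings much better.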

(iii) Kernel Function :

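Basically, a Kernel takes non-separable Low Dimensional Feature Space as input & gives High Dimensional Feature Space as output. (In practice the kernel only computes how similar 2 points are in that higher-dimensional space, so the SVM never has to build the new coordinates explicitly.)

Let’s take an example to understand this concept in an easier way.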

Example :

So here we took certain sample points which are Circles & Quadrilaterals.

If you have noticed, we cannot separate them into a separate quadrilateral class & a separate circle class with a single cut.

So what we can do is provide this Low Dimensional Feature Space data to the kernel, which will, in turn, transform that data into High Dimensional Feature Space, i.e. our data which was of 1 dimension gets converted into 2 dimensions.

So now if we look at that 2-dimensional graph, we get data which is easily separable with a straight line into 2 classes.
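The 1-dimension to 2-dimension idea can be sketched in a few lines. The points & the mapping x -> (x, x squared) below are assumptions picked for illustration; they are not taken from the graph in this post.

```python
# Hypothetical 1D points: circles sit near zero, quadrilaterals sit far away on both sides.
circles = [-1.0, -0.5, 0.0, 0.5, 1.0]
quadrilaterals = [-3.0, -2.5, 2.5, 3.0]

def lift(x):
    """Map a 1D point to 2D by adding x squared as a second coordinate."""
    return (x, x * x)

# In 1D no single cut-off separates the classes, because the quadrilaterals are on both sides.
# In 2D the horizontal line "second coordinate = 3" does (threshold picked by eye).
threshold = 3.0
circles_below = all(lift(x)[1] < threshold for x in circles)
quads_above = all(lift(x)[1] > threshold for x in quadrilaterals)

print(circles_below, quads_above)
```

Both checks come out true, which means a straight line in the new 2D space separates what a single cut could not separate in 1D.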

Let’s make this a bit more complicated. We take the 2-dimensional graph from above & give that to the kernel as input, & what the kernel gives us as output is a 3-dimensional graph.

Now you can easily see how the hyperplane is separating all the data points into 2 classes, i.e. quadrilateral & circle.

This process can be done for 3D, 4D & so on… Just give your non-separable low-dimensional feature space data to the kernel & it will give you a high-dimensional feature space.

Types of Kernels :

Basically, there are 3 commonly used types of Kernels.

(i) Linear Kernel

(ii) Polynomial Kernel

(iii) Radial Basis Function Kernel

(i) Linear Kernel :

A linear kernel is just the normal dot product between any 2 given observations, i.e. the sum of the products of each pair of input values of the 2 vectors.
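As a quick sketch, the linear kernel can be written by hand like this (the function name linear_kernel is just a label used here):

```python
def linear_kernel(a, b):
    """Dot product: multiply each pair of input values & add the results."""
    return sum(x * y for x, y in zip(a, b))

# 1*4 + 2*5 + 3*6 = 32
print(linear_kernel([1, 2, 3], [4, 5, 6]))
```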

(ii) Polynomial Kernel :

The Polynomial Kernel is a generalized form of the Linear Kernel. It can distinguish curved & non-linear input space as well.
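A common way to write the polynomial kernel is (gamma * a.b + coef0) raised to the power degree. The sketch below assumes that form & those parameter names. It also checks that, for degree 2 on 2D inputs, the kernel gives the same number as a plain dot product in a higher-dimensional space (the helper phi is a hand-written feature map used only for this check).

```python
import math

def dot(a, b):
    """Plain dot product of 2 vectors."""
    return sum(x * y for x, y in zip(a, b))

def polynomial_kernel(a, b, degree=2, gamma=1.0, coef0=0.0):
    """Polynomial kernel; with degree=1, gamma=1, coef0=0 it is the linear kernel."""
    return (gamma * dot(a, b) + coef0) ** degree

def phi(v):
    """Explicit 3D feature map matching the degree-2 kernel (gamma=1, coef0=0) for 2D inputs."""
    x1, x2 = v
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

a, b = (1.0, 2.0), (3.0, 4.0)
print(polynomial_kernel(a, b))   # (1*3 + 2*4) squared = 121
print(dot(phi(a), phi(b)))       # same value, up to floating-point rounding
```

So the kernel gets the "higher-dimensional" answer while working only with the original 2D points.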

(iii) Radial Basis Function Kernel

The Radial Basis Function (RBF) kernel is also commonly used in SVM classification. It can map the input space into infinite dimensions.
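The RBF kernel is usually written as exp(-gamma * squared distance between the 2 points). Here is a minimal hand-written sketch; the gamma value of 0.5 is an arbitrary choice for illustration.

```python
import math

def rbf_kernel(a, b, gamma=0.5):
    """RBF kernel: 1 for identical points, falling towards 0 as the points move apart."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)

print(rbf_kernel((1.0, 2.0), (1.0, 2.0)))   # identical points give 1.0
print(rbf_kernel((0.0, 0.0), (3.0, 4.0)))   # squared distance 25, so exp(-12.5), very close to 0
```

In other words, the RBF kernel behaves like a similarity score: nearby points score close to 1 & far-away points score close to 0.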

Now let's come back to those questions again.

Why does that line have a specific location & angle? Why isn't its angle different? Like any of these :

So the answer to that question is simple: the margin which is generated should have the highest value, i.e. the width of the margin should be maximum.

Now to simplify this, have you noticed that there are 2 lines which are parallel to that hyperplane? You must be wondering where they come from & why they are important. Let me explain…

So we have 2 classes: one is of Circles & the second is of Quadrilaterals.

Let’s take the circle class for now. Look carefully & pick the single circle that is closest to the opposite class, i.e. the quadrilateral class. Now through that circle draw a straight line which is parallel to our hyperplane.

Now let’s take the quadrilateral class. Look carefully & pick the single quadrilateral which is closest to the opposite class, i.e. the circle class. Now through that quadrilateral draw a straight line which is parallel to our hyperplane.

After drawing the 2 parallel lines, we now have 2 distances, i.e. D- & D+, where D- is the distance from the hyperplane to the closest negative point & D+ is the distance from the hyperplane to the closest positive point.

If we sum them up, we get the value of the margin.

Formula :

Margin = (D-) + (D+)
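Here is a small sketch of that formula. It assumes a hypothetical hyperplane x1 + x2 = 6 & a few made-up points, & uses the standard point-to-line distance |w.x + b| / ||w||.

```python
import math

# Hypothetical hyperplane w.x + b = 0, i.e. the line x1 + x2 = 6.
w = (1.0, 1.0)
b = -6.0

# Made-up points: circles are the negative class, quadrilaterals the positive class.
circles = [(1.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
quadrilaterals = [(4.0, 4.0), (5.0, 4.0), (4.0, 5.0)]

def distance_to_hyperplane(point):
    """Perpendicular distance from a point to the hyperplane."""
    return abs(w[0] * point[0] + w[1] * point[1] + b) / math.hypot(w[0], w[1])

# D- & D+ are the distances to the closest point of each class.
d_minus = min(distance_to_hyperplane(p) for p in circles)
d_plus = min(distance_to_hyperplane(p) for p in quadrilaterals)
margin = d_minus + d_plus

print("D- :", d_minus)
print("D+ :", d_plus)
print("Margin :", margin)
```

Here the closest circle is (2, 2) & the closest quadrilateral is (4, 4). Each is sqrt(2) away from the line, so the margin is 2 * sqrt(2), roughly 2.83. An SVM searches for the hyperplane that makes this value as large as possible.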

Now, have you wondered what the term Support Vector means?

Then here is your answer. The circle & quadrilateral which we considered as closest to the opposite class, & through which we drew the parallel lines, are known as the Support Vectors.
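If you want to see the support vectors of a trained model, scikit-learn's SVC exposes them after fitting through the support_vectors_ attribute. A minimal sketch on hand-made toy points (assumed data, only for illustration):

```python
from sklearn import svm

# Hand-made toy points: class 0 sits bottom-left, class 1 sits top-right.
X = [[1, 1], [1, 2], [2, 1], [5, 5], [5, 6], [6, 5]]
y = [0, 0, 0, 1, 1, 1]

clf = svm.SVC(kernel="linear")
clf.fit(X, y)

# The training points that ended up defining the margin.
print("Support vectors :")
print(clf.support_vectors_)

# How many support vectors each class contributed.
print("Support vectors per class :", clf.n_support_)
```

Only the points closest to the other class show up here; the points far from the boundary do not affect where the hyperplane ends up.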

(iv) SVM Use Cases :

Support Vector Machine can be used for many things, but a few of them are :

- Face Detection

- Image Classification

- Handwriting Detection

- Remote Homology Detection

- Text & Hypertext Categorization

(v) SVM Example :

SVM on Breast Cancer Dataset

# Import all libraries

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
# Jupyter-only magic; remove this line when running as a plain Python script
%matplotlib inline

# Import Dataset

from sklearn import datasets

cancer_data = datasets.load_breast_cancer()

# Creating Train & Test Data

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(cancer_data.data, cancer_data.target, test_size=0.2)

# Import Support Vector Machine

from sklearn import svm

cls = svm.SVC(kernel='linear')

# Training the model

cls.fit(X_train,y_train)

# Predict on the test data & check accuracy

from sklearn import metrics
pred = cls.predict(X_test)
print("Accuracy :", metrics.accuracy_score(y_test, y_pred=pred))

Output (your numbers will vary from run to run because the train/test split is random) :
Accuracy : 0.9473684210526315

# Precision score

print("Precision :", metrics.precision_score(y_test,pred))

Output :
Precision : 0.9506172839506173

# Recall score

print("Recall :", metrics.recall_score(y_test,pred))

Output :
Recall : 0.9746835443037974

# Classification Report

from sklearn.metrics import classification_report
# Start the report on a new line so its table columns stay aligned
print("Classification Report :\n", classification_report(y_test, y_pred=pred))

Hey Guys! So this was Support Vector Machine for you. If you have any doubts please leave a comment below & make sure you share this with your friends.

InfinityCodeX
