Maximal Margin Classifier in SVM

In this blog, we will discuss the concept of the Maximal Margin Classifier in SVM.

Before getting into the Maximal Margin Classifier, it is important to understand the concept of a hyperplane, which is central to SVM. A hyperplane is a boundary that separates a dataset into different classes. For example, below are the data points of an email dataset whose labels are 'spam' or 'ham'.


A line like the one below can be drawn to divide the data into spam and ham. In 2D, this line is the hyperplane.


The equation of a line is

ax + by + c = 0

where a and b are coefficients.
We can generalize this to the hyperplane equation:

w_1 x_1 + w_2 x_2 + b = 0, or in vector form, w^T x + b = 0

where w_1 and w_2 are weights, and x_1 and x_2 are attributes.

Data points on the positive side satisfy w^T x + b > 0, and points on the other side of the hyperplane satisfy w^T x + b < 0.
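As a quick sketch of this sign rule (using hypothetical weights and points, not values from the dataset above), a point can be classified by the sign of w^T x + b:

```python
import numpy as np

# Hypothetical hyperplane: w1*x1 + w2*x2 + b = 0
w = np.array([2.0, -1.0])
b = -3.0

def side_of_hyperplane(x):
    """Return +1 for the positive side of w.x + b = 0, -1 for the other side."""
    return 1 if np.dot(w, x) + b > 0 else -1

print(side_of_hyperplane(np.array([3.0, 1.0])))   # 2*3 - 1*1 - 3 = 2 > 0 -> +1
print(side_of_hyperplane(np.array([0.0, 0.0])))   # 0 + 0 - 3 = -3 < 0 -> -1
```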


The concept of a hyperplane can be extended to n dimensions. In three dimensions, the hyperplane is a plane, as shown below.


Now that you understand hyperplanes, the next question is: which line or plane is the best hyperplane? As the diagram below shows, there could be many hyperplanes that separate the classes.


To find the best hyperplane, we should consider two criteria:

  1. The plane should discriminate between the classes.
  2. It should be equidistant from the closest data points of the different classes.

That is, the hyperplane should maintain an equal margin from the closest points of both classes, as shown below.


Let’s dive deeper into the details of the Maximal Margin Classifier in SVM.

The math behind the distance between a point X and the hyperplane:

A hyperplane is defined as:

w^T x + b = 0

Consider a point X, and let d be the perpendicular distance from X to the hyperplane.

Let x_p be the point on the hyperplane at the foot of the perpendicular, so the distance between X and x_p is d = ||X - x_p||.

Since the perpendicular is parallel to w (w is the normal vector of the hyperplane), we can write:

X - x_p = λw, where λ is a constant.

Since x_p lies on the plane, w^T x_p + b = 0. Substituting x_p = X - λw:

w^T (X - λw) + b = 0

λ (w^T w) = w^T X + b

λ = (w^T X + b) / ||w||^2     ——> Equation-1


Since X - x_p = λw:

d = ||X - x_p|| = ||λw|| = |λ| ||w||

Replacing λ using Equation-1:

d = |w^T X + b| / ||w||

The same distance can also be found using the distance rule.

The distance from a point (x_0, y_0) to a line ax + by + c = 0 is:

|a x_0 + b y_0 + c| / sqrt(a^2 + b^2)

Following this rule, the distance from a point x to the hyperplane is:

d = |w^T x + b| / ||w||
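Both derivations can be checked numerically. The sketch below (with made-up values for w, b, and X) computes the foot of the perpendicular x_p via λ from Equation-1 and confirms that ||X - x_p|| equals |w^T X + b| / ||w||:

```python
import numpy as np

w = np.array([3.0, 4.0])   # hypothetical normal vector, ||w|| = 5
b = -10.0
X = np.array([6.0, 7.0])   # hypothetical point off the plane

# Equation-1: lambda = (w.X + b) / ||w||^2
lam = (np.dot(w, X) + b) / np.dot(w, w)
x_p = X - lam * w                       # foot of the perpendicular

d_geometric = np.linalg.norm(X - x_p)   # d = |lambda| * ||w||
d_formula = abs(np.dot(w, X) + b) / np.linalg.norm(w)

print(np.isclose(np.dot(w, x_p) + b, 0.0))  # x_p lies on the hyperplane
print(np.isclose(d_geometric, d_formula))   # both distances agree
```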
Now let’s maximize the margin such that each data point can be classified correctly.

We know that data points on either side of the hyperplane should satisfy the following criteria:

w^T x_i + b ≥ +1  where  y_i = +1

w^T x_i + b ≤ -1  where  y_i = -1

Both can be combined into:

y_i (w^T x_i + b) ≥ 1     ——–> Equation-2
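A small sketch (with hypothetical weights, bias, and labeled points) of checking the combined constraint in Equation-2:

```python
import numpy as np

w = np.array([1.0, 1.0])   # hypothetical weights
b = -3.0

# (point, label) pairs; labels are +1 / -1
samples = [
    (np.array([3.0, 2.0]), +1),   # w.x + b = 2.0, on or beyond the +1 margin
    (np.array([1.0, 0.0]), -1),   # w.x + b = -2.0, beyond the -1 margin
    (np.array([2.0, 1.5]), +1),   # w.x + b = 0.5, falls inside the margin
]

def satisfies_margin(x, y):
    """Equation-2: y * (w.x + b) >= 1."""
    return y * (np.dot(w, x) + b) >= 1

print([satisfies_margin(x, y) for x, y in samples])  # [True, True, False]
```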

We want to maximize the margin d = |w^T x + b| / ||w||.

Based on Equation-2, the closest points (the support vectors) satisfy the constraint with equality, so for them |w^T x + b| = 1.

Substituting this value, we get

d = 1 / ||w||

so the total margin between the two classes is 2 / ||w||.

Maximizing 2 / ||w|| is equivalent to the following, so we can rewrite the problem as:

minimize (1/2) ||w||^2   subject to   y_i (w^T x_i + b) ≥ 1 for all i

This is a constrained optimization problem and can be solved with the Lagrangian multiplier method. Upon solving it, we get:

w = Σ_i α_i y_i x_i   and   Σ_i α_i y_i = 0

where the α_i are the Lagrange multipliers, non-zero only for the support vectors.
Once we find w, we can compute the distance d, which is nothing but the margin.
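As an illustration, scikit-learn's SVC with a linear kernel solves this optimization in practice. The sketch below fits it on made-up, linearly separable toy points (the SVC API is real; the data and the large-C hard-margin approximation are illustrative assumptions) and recovers w and the margin 2 / ||w||:

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable toy classes (made-up data)
X = np.array([[1, 1], [2, 1], [1, 2], [4, 4], [5, 4], [4, 5]])
y = np.array([-1, -1, -1, 1, 1, 1])

# A very large C approximates the hard-margin maximal margin classifier
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w = clf.coef_[0]           # w = sum_i alpha_i * y_i * x_i
b = clf.intercept_[0]
margin = 2.0 / np.linalg.norm(w)

print("w =", w, "b =", b)
print("margin =", margin)
```

For this symmetric toy set the support vectors are (2, 1), (1, 2), and (4, 4), giving w ≈ (0.4, 0.4) and a margin of about 3.54.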

In this blog, you learned about the Maximal Margin Classifier in SVM. Tune in to other pages to find more information about Data Science related stuff.
