Maximal Margin Classifier in SVM

In this blog, we will discuss the concept of the Maximal Margin Classifier in SVM.

Before getting into the Maximal Margin Classifier, it is important to understand the concept of a hyperplane, which is central to SVM. A hyperplane is basically a boundary that separates the dataset into different classes. For example, below are data points for an email dataset in which each email is labeled 'spam' or 'ham'.

[Figure 1: scatter plot of 'spam' and 'ham' data points]

A line like the one below can be drawn to divide the data into spam and ham. This line is called a hyperplane; in 2D, a hyperplane is simply a line.

[Figure 2: a line separating the spam and ham points]

The equation of a line is

$ax + by + c = 0$

where $a$, $b$ are coefficients and $c$ is a constant.
We can generalize the hyperplane as below:

$w_1 x_1 + w_2 x_2 + b = 0$

where

$w_1$ and $w_2$ are weights.

$x_1$ and $x_2$ are attributes.

Data points on the positive side of the hyperplane strictly satisfy $w^T x + b > 0$, and points on the other side of the hyperplane satisfy $w^T x + b < 0$.

[Figure 3: points on the positive and negative sides of the hyperplane]

The concept of a hyperplane extends to n dimensions. In the 3-dimensional case, the hyperplane is a plane, as shown below.

[Figure 4: a plane separating the classes in 3D]
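To make the sign rule concrete, here is a minimal Python sketch; the weights, bias, and points are hypothetical, made up purely for illustration:

```python
import numpy as np

# Hypothetical weights and bias, chosen only for illustration
w = np.array([2.0, -1.0])  # weights w1, w2
b = -3.0                   # bias

points = np.array([[4.0, 1.0],   # 2*4 - 1*1 - 3 = +4  -> positive side
                   [1.0, 2.0]])  # 2*1 - 1*2 - 3 = -3  -> negative side

scores = points @ w + b  # w^T x + b for each point
for p, s in zip(points, scores):
    side = "positive" if s > 0 else "negative"
    print(p, f"w.x + b = {s:+.1f} -> {side} side")
```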

Now that you understand hyperplanes, the next question is which line or plane to choose as the actual hyperplane. As the diagram below shows, there can be many hyperplanes that separate the classes.

[Figure 5: multiple candidate hyperplanes separating the same classes]

To find the best hyperplane, we should consider two criteria:

  1. The plane should discriminate between the classes.
  2. It should be equidistant from the closest data points of each class.

The hyperplane should maintain an equal margin from the closest point of each class, as shown below.

[Figure 6: the maximal margin hyperplane, equidistant from the closest points of both classes]

Let's dive deeper into the details of the Maximal Margin Classifier in SVM.

The math behind the distance between a point $x$ and the hyperplane

A hyperplane is defined as:

$w^T x + b = 0$

Consider a point $x$, and let $d$ be its distance from the hyperplane, measured along the perpendicular to the plane.

[Figure 7: point $x$, its projection $x_p$ onto the hyperplane, and the perpendicular distance $d$]

$x_p$ is the foot of the perpendicular on the hyperplane, and the distance between $x$ and $x_p$ is

$d = \|x - x_p\|$

As the perpendicular from $x$ is parallel to $w$, we can rewrite it as below.

$x = x_p + \lambda w$, where $\lambda$ is a constant.

As $x_p$ lies on the plane $w^T x + b = 0$,

therefore

$w^T x_p + b = 0$

$w^T (x - \lambda w) + b = 0$

$\lambda = \dfrac{w^T x + b}{\|w\|^2}$     ——> Equation-1

Now, the distance is

$d = \|x - x_p\|$

As $x = x_p + \lambda w$, we have $x - x_p = \lambda w$, so

$d = \|\lambda w\|$

$d = |\lambda| \, \|w\|$

Replacing $\lambda$ using Equation-1:

$d = \dfrac{|w^T x + b|}{\|w\|^2} \, \|w\|$

$d = \dfrac{|w^T x + b|}{\|w\|}$
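As a quick numerical sanity check of the derivation above, the sketch below picks an arbitrary hyperplane and point, computes $\lambda$ from Equation-1, projects $x$ onto the plane, and confirms that $\|x - x_p\|$ matches $|w^T x + b| / \|w\|$:

```python
import numpy as np

# Hypothetical hyperplane and point, chosen only to check the algebra
w = np.array([3.0, 4.0])
b = -5.0
x = np.array([3.0, 4.0])

lam = (w @ x + b) / (w @ w)   # Equation-1: lambda = (w^T x + b) / ||w||^2
x_p = x - lam * w             # the foot of the perpendicular on the plane
print(np.isclose(w @ x_p + b, 0.0))             # True: x_p lies on the plane

d_projection = np.linalg.norm(x - x_p)          # d = ||x - x_p||
d_formula = abs(w @ x + b) / np.linalg.norm(w)  # d = |w^T x + b| / ||w||
print(d_projection, d_formula)                  # both print 4.0
```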

The same distance can also be found using the distance rule: the distance from any point $(x_0, y_0)$ to a line $ax + by + c = 0$ is

$d = \dfrac{|a x_0 + b y_0 + c|}{\sqrt{a^2 + b^2}}$

Following the above rule, the distance from a point to the hyperplane will be

$d = \dfrac{|w^T x + b|}{\|w\|}$
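For example, take the line $3x + 4y - 5 = 0$ and the point $(3, 4)$ (the same values used in the sketch above):

$d = \dfrac{|3 \cdot 3 + 4 \cdot 4 - 5|}{\sqrt{3^2 + 4^2}} = \dfrac{20}{5} = 4$

which agrees with the distance computed via the projection.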

Now let’s maximize the margin such that each data point can be classified correctly.

We know that data points on either side of the hyperplane should follow the criteria below. Here the scale of $w$ and $b$ is chosen so that the closest points of each class lie on $w^T x + b = \pm 1$.

$w^T x_i + b \geq +1$  where  $y_i = +1$

$w^T x_i + b \leq -1$  where  $y_i = -1$

Both can be combined into:

$y_i (w^T x_i + b) \geq 1$     ——–> Equation-2

To maximize the margin $d$:

$\max\limits_{w, b} \dfrac{|w^T x + b|}{\|w\|}$

Based on Equation-2, for the closest points (the support vectors),

$|w^T x_i + b| = y_i (w^T x_i + b) = 1$

Substituting the above value, we get

$\max\limits_{w, b} \dfrac{1}{\|w\|}$

$= \min\limits_{w, b} \|w\|$

We can rewrite this as below.

$\min\limits_{w, b} \dfrac{1}{2} \|w\|^2$  subject to  $y_i (w^T x_i + b) \geq 1$
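As a sketch of how this optimization behaves in practice, the snippet below hands it to scikit-learn's SVC with a very large C, which approximates the hard-margin problem; the toy data set is invented purely for illustration:

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data, invented purely for illustration
X = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 0.5],   # class -1
              [4.0, 4.0], [5.0, 4.5], [4.5, 5.5]])  # class +1
y = np.array([-1, -1, -1, 1, 1, 1])

# A very large C makes the soft-margin SVC approximate the hard-margin
# problem: minimize (1/2)||w||^2 subject to y_i (w^T x_i + b) >= 1
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]
print("Equation-2 holds:", np.all(y * (X @ w + b) >= 1 - 1e-6))
print("margin width 2/||w|| =", 2 / np.linalg.norm(w))
```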

This is a constrained optimization problem, and it can be solved using the Lagrange multiplier method. Forming the Lagrangian and setting its derivatives with respect to $w$ and $b$ to zero, we get the following.

$L(w, b, \alpha) = \dfrac{1}{2} \|w\|^2 - \sum\limits_i \alpha_i \left[ y_i (w^T x_i + b) - 1 \right]$

$w = \sum\limits_i \alpha_i y_i x_i$

$\sum\limits_i \alpha_i y_i = 0$

Once we find $w$, we can find the distance $d = 1/\|w\|$, which is nothing but the margin.
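The sketch below (reusing the toy data from the previous snippet) checks these results numerically: scikit-learn's dual_coef_ stores $\alpha_i y_i$ for the support vectors, so $w = \sum_i \alpha_i y_i x_i$ reduces to a dot product with support_vectors_:

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 0.5],
              [4.0, 4.0], [5.0, 4.5], [4.5, 5.5]])
y = np.array([-1, -1, -1, 1, 1, 1])
clf = SVC(kernel="linear", C=1e6).fit(X, y)

# dual_coef_ holds alpha_i * y_i for the support vectors, so
# w = sum_i alpha_i y_i x_i is a dot product with support_vectors_
w_dual = clf.dual_coef_ @ clf.support_vectors_
print(np.allclose(w_dual, clf.coef_))       # True: both give the same w

# The second condition: sum_i alpha_i y_i = 0
print(np.isclose(clf.dual_coef_.sum(), 0))  # True

# The margin: distance from the hyperplane to the closest points
w = clf.coef_[0]
print("d = 1/||w|| =", 1 / np.linalg.norm(w))
```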

In this blog, you learned about the Maximal Margin Classifier in SVM. Tune in to other pages to find more information about Data Science related stuff.
