# Maximal Margin Classifier in SVM

In this blog, we will discuss the concept of the Maximal Margin Classifier in SVM.

To understand the Maximal Margin Classifier, and SVM in general, we first need to understand the concept of a hyperplane. A hyperplane is a boundary that separates a dataset into different classes. For example, consider an email dataset in which each data point is either ‘spam’ or ‘ham’.

In two dimensions, a line can be drawn that divides the data into spam and ham. This separating line is the hyperplane in 2D.

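The equation of a line is

$$y = ax + b$$

where $a$ and $b$ are coefficients.
We can generalize the hyperplane as below:

$$w_1 x_1 + w_2 x_2 + b = 0$$

where $w_1$ and $w_2$ are weights, $x_1$ and $x_2$ are attributes, and $b$ is the intercept (bias) term. Setting $x_1 = x$, $x_2 = y$, $w_1 = a$ and $w_2 = -1$ recovers the line above.

Data points on the positive side of the hyperplane strictly satisfy $w_1 x_1 + w_2 x_2 + b > 0$, and points on the other side of the hyperplane satisfy $w_1 x_1 + w_2 x_2 + b < 0$.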
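To make the sign rule concrete, here is a minimal Python sketch. The function name `side_of_hyperplane` and the example hyperplane $x_1 + x_2 - 3 = 0$ are hypothetical choices for illustration; the sketch assumes the hyperplane is written as weights `w` plus a bias `b`.

```python
def side_of_hyperplane(w, b, x):
    """Return +1, -1, or 0 depending on where x lies relative to w.x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    if score > 0:
        return 1    # positive side
    if score < 0:
        return -1   # negative side
    return 0        # exactly on the hyperplane


# Hypothetical 2D hyperplane: x1 + x2 - 3 = 0
w = [1.0, 1.0]
b = -3.0

print(side_of_hyperplane(w, b, [3.0, 2.0]))  # score = 2.0  -> 1
print(side_of_hyperplane(w, b, [0.5, 1.0]))  # score = -1.5 -> -1
print(side_of_hyperplane(w, b, [1.0, 2.0]))  # score = 0.0  -> 0
```

Only the sign of the score decides the class; its magnitude becomes important once we start measuring distances.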

The concept of a hyperplane extends to n dimensions. In three dimensions, for example, the hyperplane is a plane.

Now that you understand what a hyperplane is, the next question is which line or plane we should actually choose. In general, many different hyperplanes can separate the classes.

The best hyperplane should satisfy two criteria:

1. It should separate the classes.
2. It should be equidistant from the closest data points of the different classes.

In other words, the hyperplane should keep an equal margin from the closest point of each class, and among all such hyperplanes we want the one with the largest margin.
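The two criteria can be illustrated with a brute-force search in 2D. This is only a sketch under stated assumptions, not how SVM libraries work: the function `best_direction`, the grid of 3600 candidate directions, and the four toy points are all hypothetical. For every candidate direction, the line is placed midway between the closest projected points of the two classes, and the direction with the widest margin is kept.

```python
import math


def best_direction(points, labels, steps=3600):
    """Brute-force search over unit normals u = (cos t, sin t) in 2D.

    Returns (margin, u, offset) for the widest margin found, where the
    separating line is u.x = offset, or None if no direction separates.
    """
    best = None
    for k in range(steps):
        t = 2.0 * math.pi * k / steps
        u = (math.cos(t), math.sin(t))
        proj = [u[0] * x[0] + u[1] * x[1] for x in points]
        lo_pos = min(p for p, y in zip(proj, labels) if y == 1)
        hi_neg = max(p for p, y in zip(proj, labels) if y == -1)
        gap = lo_pos - hi_neg
        if gap <= 0:
            continue  # criterion 1 fails: classes are not separated
        margin = gap / 2.0                # criterion 2: equal distance to both
        offset = (lo_pos + hi_neg) / 2.0  # line sits midway between classes
        if best is None or margin > best[0]:
            best = (margin, u, offset)
    return best


# Toy dataset (hypothetical): two points per class
points = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
labels = [1, 1, -1, -1]

margin, u, offset = best_direction(points, labels)
print(round(margin, 3))  # 1.414
```

For this toy data the closest pair across classes is (2, 2) and (0, 0), so the best line is perpendicular to the segment joining them and sits halfway along it.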

Let’s dive deeper into the details of the Maximal Margin Classifier in SVM.

#### The math behind the distance between a point (X) and the hyperplane

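A hyperplane is defined as:

$$w^T x + b = 0$$

Consider a point $X$, and let $d$ be the distance of the point from the hyperplane, measured along the perpendicular to the plane.

Let $X_0$ be the point on the hyperplane closest to $X$. The distance between $X$ and $X_0$ is

$$d = \|X - X_0\|$$

As the perpendicular line is parallel to $w$, we can rewrite $X - X_0$ as below.

$$X - X_0 = \alpha w$$

where $\alpha$ is a constant.

As $X_0$ lies on the plane,

$$w^T X_0 + b = 0$$

Therefore, substituting $X_0 = X - \alpha w$,

$$w^T (X - \alpha w) + b = 0$$

$$\alpha = \frac{w^T X + b}{\|w\|^2} \quad \text{(Equation-1)}$$

As

$$d = \|X - X_0\| = \|\alpha w\| = |\alpha| \, \|w\|$$

replacing $\alpha$ with Equation-1 gives

$$d = \frac{|w^T X + b|}{\|w\|}$$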
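The derivation can be checked numerically. The sketch below assumes the hyperplane form $w^T x + b = 0$ and the standard results $\alpha = (w^T X + b) / \|w\|^2$ and $d = |w^T X + b| / \|w\|$; the helper names and the 3D example values are hypothetical. It builds the foot of the perpendicular $X_0 = X - \alpha w$, confirms it lies on the plane, and compares the geometric distance with the closed-form one.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def project_onto_hyperplane(w, b, x):
    """Return (alpha, x0) where x0 = x - alpha * w lies on w.x + b = 0."""
    alpha = (dot(w, x) + b) / dot(w, w)          # Equation-1
    x0 = [xi - alpha * wi for xi, wi in zip(x, w)]
    return alpha, x0


# Hypothetical 3D example: ||w|| = 3
w = [1.0, 2.0, 2.0]
b = -3.0
x = [4.0, 4.0, 4.0]

alpha, x0 = project_onto_hyperplane(w, b, x)

on_plane = dot(w, x0) + b  # should be (numerically) zero
d_geometric = math.sqrt(sum((xi - x0i) ** 2 for xi, x0i in zip(x, x0)))
d_formula = abs(dot(w, x) + b) / math.sqrt(dot(w, w))

print(abs(on_plane) < 1e-9)                    # True
print(abs(d_geometric - d_formula) < 1e-9)     # True
```

Here $w^T X + b = 17$ and $\|w\| = 3$, so both routes give a distance of $17/3$.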

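The same distance can also be found using the familiar rule for the distance from a point to a line.

For a line $Ax + By + C = 0$ and a point $(x_p, y_p)$, the distance from the point to the line is

$$d = \frac{|A x_p + B y_p + C|}{\sqrt{A^2 + B^2}}$$

Following the above rule, the distance from a point $X$ to the hyperplane $w^T x + b = 0$ will be

$$d = \frac{|w^T X + b|}{\|w\|}$$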
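In 2D the two rules are the same formula with $w = (A, B)$ and $b = C$. A small sketch with a hypothetical line $3x + 4y - 10 = 0$ and point $(5, 5)$ shows that they agree; both function names are made up for this example.

```python
import math


def point_line_distance(A, B, C, px, py):
    """Distance from (px, py) to the line A*x + B*y + C = 0."""
    return abs(A * px + B * py + C) / math.hypot(A, B)


def point_hyperplane_distance(w, b, x):
    """Distance from x to the hyperplane w.x + b = 0 in any dimension."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm_w


d_line = point_line_distance(3.0, 4.0, -10.0, 5.0, 5.0)
d_plane = point_hyperplane_distance([3.0, 4.0], -10.0, [5.0, 5.0])
print(d_line, d_plane)  # 5.0 5.0
```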

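Now let’s maximize the margin while making sure that each data point is classified correctly.

Let $y_i \in \{+1, -1\}$ be the class label of data point $x_i$. Because $w$ and $b$ can be rescaled together without moving the hyperplane, we can fix the scale so that the closest points satisfy $|w^T x_i + b| = 1$. Data points on either side of the hyperplane should then follow the criteria below.

$$w^T x_i + b \ge 1 \quad \text{where } y_i = +1$$

$$w^T x_i + b \le -1 \quad \text{where } y_i = -1$$

Both can be combined into:

$$y_i (w^T x_i + b) \ge 1 \quad \text{(Equation-2)}$$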
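As a quick illustration, the combined constraint $y_i (w^T x_i + b) \ge 1$ can be checked point by point. The sketch below uses a hypothetical toy dataset and two hypothetical candidate hyperplanes. Both candidates classify every point on the correct side, but only the first is scaled so that every point also meets the margin constraint.

```python
def satisfies_margin_constraints(w, b, points, labels, tol=1e-9):
    """True if y_i * (w.x_i + b) >= 1 holds for every data point."""
    return all(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) >= 1.0 - tol
        for x, y in zip(points, labels)
    )


# Toy dataset (hypothetical): labels are +1 / -1
points = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
labels = [1, 1, -1, -1]

print(satisfies_margin_constraints([0.5, 0.5], -1.0, points, labels))  # True
print(satisfies_margin_constraints([0.1, 0.1], -0.2, points, labels))  # False
```

The second candidate describes the same line as the first (it is the first one scaled by 0.2), which shows that the constraint pins down the scale of $w$ and $b$, not just the position of the hyperplane.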

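To maximize the margin, we maximize the distance from the hyperplane to the closest data points:

$$\max_{w, b} \; d = \max_{w, b} \; \frac{|w^T x_i + b|}{\|w\|}$$

Based on Equation-2, the closest points satisfy

$$y_i (w^T x_i + b) = 1, \quad \text{that is,} \quad |w^T x_i + b| = 1$$

Substituting the above value, we get

$$\max_{w, b} \; d = \max_{w, b} \; \frac{1}{\|w\|}$$

Maximizing $1 / \|w\|$ is equivalent to minimizing $\|w\|$, so we can rewrite this as below.

$$\min_{w, b} \; \frac{1}{2} \|w\|^2 \quad \text{subject to} \quad y_i (w^T x_i + b) \ge 1 \text{ for all } i$$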

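This is a constrained optimization problem, and it can be solved with the method of Lagrange multipliers. Introducing a multiplier $\lambda_i \ge 0$ for each constraint and solving, we get the conditions below.

$$w = \sum_i \lambda_i y_i x_i, \qquad \sum_i \lambda_i y_i = 0$$

Once we find $w$, we can find the distance $d = 1 / \|w\|$ from the hyperplane to the closest points, which is nothing but the margin.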
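For the smallest possible case, one data point per class, the Lagrangian conditions can be solved by hand, which makes a handy sanity check. The sketch below assumes the standard conditions $w = \sum_i \lambda_i y_i x_i$ and $\sum_i \lambda_i y_i = 0$ with both constraints active; this gives $\lambda = 2 / \|x_+ - x_-\|^2$ for both points. The function name and the example points are hypothetical, and a general dataset would need a quadratic programming solver instead of this closed form.

```python
import math


def two_point_max_margin(x_pos, x_neg):
    """Closed-form maximal margin hyperplane for one point per class."""
    diff = [p - n for p, n in zip(x_pos, x_neg)]
    # Both multipliers are equal because sum_i lambda_i * y_i = 0
    lam = 2.0 / sum(c * c for c in diff)
    # w = sum_i lambda_i * y_i * x_i with y = +1 for x_pos and y = -1 for x_neg
    w = [lam * p - lam * n for p, n in zip(x_pos, x_neg)]
    # b from the active constraint w.x_pos + b = 1
    b = 1.0 - sum(wi * xi for wi, xi in zip(w, x_pos))
    margin = 1.0 / math.sqrt(sum(wi * wi for wi in w))
    return lam, w, b, margin


lam, w, b, margin = two_point_max_margin([2.0, 2.0], [0.0, 0.0])
print(lam, w, b)         # 0.25 [0.5, 0.5] -1.0
print(round(margin, 3))  # 1.414
```

The margin of about 1.414 is exactly half the distance between the two points, as expected for a hyperplane that sits midway between them.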