Support Vector Machine (SVM)

Intuition of the math behind SVM

If the support vectors on either side of the separating hyperplane are x1 and x2 respectively, then we need to find the hyperplane that maximizes the gap (x2 − x1). If the equation of the hyperplane is y = Wx + b, then at the support vectors the equations are

Wx1 + b = −1 ....(i)
Wx2 + b = +1 ....(ii)

Subtracting (i) from (ii), we get,

W(x2 − x1) = 2
(W/||W||)(x2 − x1) = 2/||W||
x2 − x1 = 2/||W||

As we need to maximize (x2 − x1), this means maximizing 2/||W||, or equivalently minimizing ||W||
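The margin formula can be sanity-checked numerically. A minimal sketch, assuming NumPy, with a hand-picked 1-D separator where the support vectors are x1 = −1 and x2 = +1:

```python
import numpy as np

# Hyperplane W*x + b = 0 with W = 1, b = 0 on a 1-D toy problem
W, b = 1.0, 0.0
x1, x2 = -1.0, 1.0          # the support vectors on either side

# They sit exactly on the margin boundaries, equations (i) and (ii)
assert W * x1 + b == -1
assert W * x2 + b == +1

# The gap between them equals 2 / ||W||, as derived above
margin = x2 - x1
print(margin, 2 / np.linalg.norm([W]))  # 2.0 2.0
```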

Equation of SVM

Wxi − b ≥ +1 if yi = +1
Wxi − b ≤ −1 if yi = −1

In combination,

yi(Wxi − b) ≥ 1

Minimize the Euclidean norm ||W|| subject to yi(Wxi − b) ≥ 1

Finally, we need to find (W, b) that minimize ||W|| such that yi(Wxi − b) ≥ 1 for every training point i
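This constrained minimization can be approximated in practice. A minimal sketch, assuming scikit-learn: a linear SVC with a very large C behaves like the hard-margin problem on separable data. (Note scikit-learn writes the hyperplane as W·x + b rather than W·x − b; same idea, opposite sign convention on b.)

```python
import numpy as np
from sklearn.svm import SVC

# Linearly separable 1-D toy data
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([-1, -1, 1, 1])

# Very large C ~ hard margin: errors are effectively forbidden
clf = SVC(kernel="linear", C=1e6).fit(X, y)
W, b = clf.coef_[0], clf.intercept_[0]

# Every training point satisfies the margin constraint yi(W·xi + b) >= 1
print(np.all(y * (X @ W + b) >= 1 - 1e-6))  # True
```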

Equation of Soft SVM

Soft SVM is used to make the model more robust and generalized. We allow some amount of error so that the model doesn't overfit the training data.

min ||W|| + C Σi |yi − ŷi|

where C is the penalty weight on the errors: a larger C tolerates fewer margin violations.
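The effect of the penalty can be seen directly. A sketch assuming scikit-learn: on overlapping classes, a small C accepts some training errors and keeps ||W|| small (a wide margin), while a large C pushes toward fitting every point.

```python
import numpy as np
from sklearn.svm import SVC

# Two overlapping Gaussian classes, so some training error is unavoidable
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

soft = SVC(kernel="linear", C=0.01).fit(X, y)   # tolerant of errors
hard = SVC(kernel="linear", C=100.0).fit(X, y)  # punishes every error

# The softer model keeps a smaller ||W||, i.e. a wider margin
print(np.linalg.norm(soft.coef_) < np.linalg.norm(hard.coef_))
```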

Basic Assumption of SVM

SVM makes no assumption about the underlying distribution of the data

Advantages of SVM

  1. SVM is effective for high-dimensional data
  2. Can be used for unstructured data such as text and images
  3. With a proper kernel function, a wide range of non-linear problems can be solved
  4. With soft SVM there is less chance of overfitting
  5. Very memory efficient: a linear SVM needs only (W, b) during inference (a kernel SVM keeps just its support vectors)
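Advantage 3 can be illustrated with the classic XOR pattern, which no straight line can separate. A sketch assuming scikit-learn, using an RBF kernel (the gamma and C values are arbitrary choices for this toy example):

```python
import numpy as np
from sklearn.svm import SVC

# XOR labels: opposite corners share a class, so no linear separator exists
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, 1, 1, -1])

# An RBF kernel maps the points into a space where they become separable
clf = SVC(kernel="rbf", gamma=2.0, C=10.0).fit(X, y)
print((clf.predict(X) == y).all())  # True
```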

Disadvantages of SVM

  1. Long training time, especially on large datasets
  2. It is difficult to choose a good kernel function
  3. Bad at handling outliers
  4. Bad at handling missing data
  5. Bad at handling imbalanced datasets
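Disadvantage 5 can be partly mitigated. A sketch assuming scikit-learn, whose SVC accepts class_weight='balanced' to scale the penalty C inversely to class frequency, so minority-class errors cost more:

```python
import numpy as np
from sklearn.svm import SVC

# A 95:5 imbalanced two-class problem
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (95, 2)), rng.normal(1, 1, (5, 2))])
y = np.array([-1] * 95 + [1] * 5)

plain = SVC(kernel="linear").fit(X, y)
balanced = SVC(kernel="linear", class_weight="balanced").fit(X, y)

# The weighted model recovers at least as many minority-class points
print((balanced.predict(X[y == 1]) == 1).sum()
      >= (plain.predict(X[y == 1]) == 1).sum())
```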

