Naive Bayes
- Naive Bayes is derived from Bayes' theorem
- Naive Bayes is called "naive" because it assumes all the features are independent of each other, which is rarely the case in real life
For example,
Given msg = "Dear Friends", predict if it is Spam or Not-Spam:
P(Spam | "Dear Friends") = P("Dear Friends" | Spam) · P(Spam) / P("Dear Friends")
∝ P("Dear" | Spam) · P("Friends" | Spam) · P(Spam)
On the 2nd line, the denominator P("Dear Friends") is ignored because it is the same for every class, so it does not change which class scores highest
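The scoring above can be sketched in a few lines. All the word counts and priors below are made-up illustration data, not from the notes; the point is only the shape of the computation (prior times product of per-word likelihoods, denominator skipped):

```python
# Toy spam example: score a message under each class using Bayes'
# theorem with the naive independence assumption.
# Counts and priors are invented for illustration.

train_counts = {
    # class -> {word: count of that word in the class's training messages}
    "Spam":     {"dear": 8, "friends": 6, "offer": 10},
    "Not-Spam": {"dear": 5, "friends": 9, "meeting": 7},
}
priors = {"Spam": 0.4, "Not-Spam": 0.6}  # assumed P(class)

def score(msg_words, cls):
    """Unnormalized posterior: P(cls) * prod over words of P(word | cls).
    The denominator P(msg) is skipped because it is identical for
    every class and does not change the argmax."""
    counts = train_counts[cls]
    total = sum(counts.values())
    p = priors[cls]
    for w in msg_words:
        p *= counts.get(w, 0) / total
    return p

msg = ["dear", "friends"]
scores = {c: score(msg, c) for c in priors}
prediction = max(scores, key=scores.get)
```

Note that with `counts.get(w, 0)`, an unseen word zeroes the whole product, which is exactly the outlier problem discussed later in these notes.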
Advantages of Naive Bayes
- Works well with a large number of features
- Works well with large training datasets
- Converges faster than many other classifiers
- Less prone to overfitting
- Good at handling outliers (once smoothing is applied)
- Good at handling missing data
Disadvantages of Naive Bayes
- Performs poorly when features are correlated, since the independence assumption is violated
Impact of missing value on Naive Bayes
Naive Bayes handles missing data well: when estimating a feature's conditional probability, it simply ignores the rows where that feature is missing, so missing values have no impact on the probability estimates and hence no impact on the model
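A minimal sketch of that per-feature behavior, assuming missing values are stored as `None` (the rows and feature name are invented for illustration):

```python
# Estimating P(feature=value | class) while ignoring rows where the
# feature is missing (None). Data is made up for illustration.

rows = [
    # (weather, label)
    ("sunny", "play"), ("rainy", "play"), (None, "play"),
    ("sunny", "no"),   (None, "no"),      ("rainy", "no"),
]

def cond_prob(value, label):
    # Only rows of this class where the feature is observed count;
    # missing entries drop out of numerator and denominator alike.
    observed = [w for w, y in rows if y == label and w is not None]
    return observed.count(value) / len(observed)

p = cond_prob("sunny", "play")  # 1 of the 2 observed "play" rows
```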
Impact of outliers on Naive Bayes
Basic Naive Bayes is not good at handling outliers: if a feature value appears at test time that was never seen in the training set, its conditional probability is 0, which drives the whole product to 0. In practice this is handled by adding an artificial count to every feature value (Laplace / add-one smoothing).
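The zero-frequency fix can be sketched as add-one smoothing; the word counts and vocabulary size below are assumptions for illustration:

```python
# Laplace (add-one) smoothing: a word unseen in a class's training
# data gets a small nonzero probability instead of zeroing out the
# whole product. Counts and vocabulary size are made up.

counts = {"dear": 8, "friends": 6, "offer": 10}  # word counts for one class
vocab_size = 5   # assumed size of the full training vocabulary
total = sum(counts.values())

def p_word(word, alpha=1):
    # (count + alpha) / (total + alpha * |V|)
    return (counts.get(word, 0) + alpha) / (total + alpha * vocab_size)

unsmoothed = counts.get("lottery", 0) / total   # 0.0 -> kills the product
smoothed = p_word("lottery")                    # small but nonzero
```

Setting `alpha` to values other than 1 gives the more general add-alpha smoothing.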
Problems that can be solved using Naive Bayes
- Sentiment Analysis
- Spam Classification
- Twitter Sentiment Classification
- Document Categorization
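For the spam-classification use case above, an end-to-end sketch with scikit-learn's `MultinomialNB` (which applies add-alpha smoothing by default); the tiny dataset here is invented:

```python
# Bag-of-words spam classifier with scikit-learn. The training
# texts and labels are made-up toy data.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = [
    "win a free lottery offer now",   # spam
    "free offer claim your prize",    # spam
    "meeting scheduled for monday",   # ham
    "see you at lunch tomorrow",      # ham
]
labels = ["spam", "spam", "ham", "ham"]

vec = CountVectorizer()              # word-count features
X = vec.fit_transform(texts)
clf = MultinomialNB(alpha=1.0)       # alpha=1.0 -> Laplace smoothing
clf.fit(X, labels)

pred = clf.predict(vec.transform(["free lottery prize offer"]))[0]
```

The same pipeline shape (vectorizer + `MultinomialNB`) covers the other listed problems, such as sentiment analysis and document categorization, with different labels.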