Gini Impurity

Gini impurity measures the impurity of a leaf node in a decision tree, that is, how mixed the class labels of the samples in that node are.

Equation of Gini Impurity

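$$
\text{Gini impurity of leaf} = 1 - \sum_{c \in C} \left(\text{probability of class } c\right)^2
$$

$$
\text{Total Gini impurity of tree} = \sum_{i} \frac{\#\text{ of elements in branch}_i}{\text{total } \#\text{ of elements in all branches}} \times \text{Gini impurity of branch}_i
$$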
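The two formulas can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names `gini_impurity` and `weighted_gini` are made up for this note, each node is assumed to be described by a list of per-class sample counts, and an empty node is assumed to have impurity 0.

```python
from typing import Sequence


def gini_impurity(class_counts: Sequence[int]) -> float:
    """Gini impurity of one node, given the number of samples in each class."""
    total = sum(class_counts)
    if total == 0:
        return 0.0  # assumption: an empty node is treated as pure
    # 1 minus the sum of squared class probabilities
    return 1.0 - sum((count / total) ** 2 for count in class_counts)


def weighted_gini(branches: Sequence[Sequence[int]]) -> float:
    """Size-weighted sum of the Gini impurity of each branch of a split."""
    total = sum(sum(branch) for branch in branches)
    if total == 0:
        return 0.0  # assumption: a split with no samples has impurity 0
    return sum(sum(branch) / total * gini_impurity(branch) for branch in branches)


print(gini_impurity([5, 5]))             # evenly mixed two-class node
print(gini_impurity([10, 0]))            # pure node
print(weighted_gini([[5, 5], [10, 0]]))  # split into one mixed and one pure branch
```

A pure node gives 0, and an evenly mixed two-class node gives 0.5, which is the maximum for two classes.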

Advantages of Gini Impurity

  1. Simple to calculate
  2. Interpretable
  3. Robust to overfitting

Disadvantages of Gini Impurity

  1. Not as effective as Entropy and Information Gain when classes are imbalanced
  2. Less sensitive to noise

Example

For the left tree

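$$
\begin{aligned}
GI(\text{Humidity} == \text{high}) &= 1 - \left(\frac{3}{7}\right)^2 - \left(\frac{4}{7}\right)^2 \\
GI(\text{Humidity} == \text{normal}) &= 1 - \left(\frac{1}{7}\right)^2 - \left(\frac{6}{7}\right)^2 \\
GI(\text{Humidity}) &= \frac{7}{14}\, GI(\text{Humidity} == \text{high}) + \frac{7}{14}\, GI(\text{Humidity} == \text{normal})
\end{aligned}
$$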
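Carrying the arithmetic through, with each branch holding 7 of the 14 samples and class proportions of 3/7 and 4/7 (high) and 1/7 and 6/7 (normal), the values work out as follows. The decimals are rounded to three places.

$$
\begin{aligned}
GI(\text{Humidity} == \text{high}) &= 1 - \frac{9}{49} - \frac{16}{49} = \frac{24}{49} \approx 0.490 \\
GI(\text{Humidity} == \text{normal}) &= 1 - \frac{1}{49} - \frac{36}{49} = \frac{12}{49} \approx 0.245 \\
GI(\text{Humidity}) &= \frac{7}{14} \cdot \frac{24}{49} + \frac{7}{14} \cdot \frac{12}{49} = \frac{18}{49} \approx 0.367
\end{aligned}
$$

The normal-humidity branch is purer than the high-humidity branch, and the split's overall impurity is the average of the two because the branches are the same size.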

Related Notes