Decision Tree
- A decision tree can be used for Feature Selection: features that never improve a split are simply not used to grow the tree, so they end up with zero importance (see the sketch below)
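A minimal sketch of this idea with scikit-learn; the dataset and the 0.05 cutoff are illustrative, not prescribed:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
feature_names = load_iris().feature_names

clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Features the tree never split on get an importance of 0 and can be dropped.
for name, importance in zip(feature_names, clf.feature_importances_):
    print(f"{name}: {importance:.3f}")

# 0.05 is an arbitrary illustrative threshold for keeping a feature.
selected = [n for n, imp in zip(feature_names, clf.feature_importances_) if imp > 0.05]
print("kept:", selected)
```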
There are 2 types of decision tree depending on the task: classification trees, which predict a class label, and regression trees, which predict a continuous value (illustrated below)
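A hedged illustration of both tree types, assuming scikit-learn and its built-in toy datasets:

```python
from sklearn.datasets import load_diabetes, load_iris
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Classification tree: predicts a discrete class label.
X_c, y_c = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_c, y_c)
print(clf.predict(X_c[:1]))   # e.g. class 0

# Regression tree: predicts a continuous value (the mean of the leaf's targets).
X_r, y_r = load_diabetes(return_X_y=True)
reg = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X_r, y_r)
print(reg.predict(X_r[:1]))   # a continuous prediction
```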
Advantages of Decision Tree
- Easy to interpret
- Explainable
- Good at Handling Outliers
  - Splits depend on the ordering of feature values rather than their magnitude, so extreme values rarely distort the chosen splits
- Supports both classification and regression
- Supports both continuous and categorical variables
- Automatically good at Handling Missing Data, using simple per-split strategies (see the first sketch after this list)
  - Ignore missing values and split on the available values only
  - Send samples with missing values to the majority child
  - Send them to a randomly chosen child
  - Distribute their weight evenly across all children
- Less training time compared to Random Forest as it only has to generate one tree rather than a forest of trees
- No feature scaling is needed, as each split uses one feature at a time and depends only on the ordering of its values (demonstrated in the second sketch below)
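The missing-value strategies above can be sketched in plain Python. The `route` function below is hypothetical, not any library's API; `threshold`, `majority_child`, and the strategy names are illustrative labels for the bullets above:

```python
import random

def route(value, threshold, majority_child, strategy="majority"):
    """Decide which child(ren) a sample goes to at one split node.

    Returns a list of (child, weight) pairs so that fractional
    distribution ("even") can be expressed alongside hard routing.
    """
    if value is not None:                      # value present: normal split
        return [("left" if value <= threshold else "right", 1.0)]
    if strategy == "ignore":                   # drop the sample at this node
        return []
    if strategy == "majority":                 # send it to the larger child
        return [(majority_child, 1.0)]
    if strategy == "random":                   # pick a child at random
        return [(random.choice(["left", "right"]), 1.0)]
    if strategy == "even":                     # split its weight evenly
        return [("left", 0.5), ("right", 0.5)]
    raise ValueError(strategy)

print(route(None, threshold=2.5, majority_child="left", strategy="even"))
```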
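And a small check of the no-scaling claim, assuming scikit-learn: standardization is a monotone transform of each feature, so the trees should choose equivalent splits and make identical predictions:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)

raw = DecisionTreeClassifier(random_state=0).fit(X, y).predict(X)
scaled = DecisionTreeClassifier(random_state=0).fit(X_scaled, y).predict(X_scaled)

# Expected True: scaling shifts the thresholds but not the partitions.
print(np.array_equal(raw, scaled))
```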
Disadvantages of Decision Tree
- Overfitting
  - Can be mitigated with tree pruning or by limiting tree depth (see the pruning sketch at the end of this section)
- Retraining
  - Adding even a single new sample or feature can change the whole tree structure, so the entire tree has to be retrained
- Not suitable for large datasets
  - The tree grows very complex to fit all the data; one alternative is a Random Forest (compared in the last sketch below)
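A minimal sketch of cost-complexity pruning with scikit-learn's `ccp_alpha`; the dataset and the alpha grid are illustrative:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# An unpruned tree typically overfits: perfect on train, worse on test.
full = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
print("unpruned:", full.score(X_tr, y_tr), full.score(X_te, y_te))

# Larger ccp_alpha prunes more aggressively; in practice, pick the value
# that maximizes held-out accuracy (validation set or cross-validation).
path = full.cost_complexity_pruning_path(X_tr, y_tr)
for alpha in path.ccp_alphas[::5]:
    pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha).fit(X_tr, y_tr)
    print(f"alpha={alpha:.4f} test acc={pruned.score(X_te, y_te):.3f}")
```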
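And a rough comparison of a single tree against a Random Forest on the same split, assuming scikit-learn; the hyperparameters are illustrative:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# Averaging many decorrelated trees usually generalizes better than one tree.
print("tree:  ", tree.score(X_te, y_te))
print("forest:", forest.score(X_te, y_te))
```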