Feature Selection

Sometimes more is not better: in machine learning, feeding the model many unnecessary features gives it a good chance to overfit on them. Training also takes longer to converge, because the model has to work out which features matter and which do not.

Feature Selection algorithms:

Depending on Data:

  1. Percentage of missing values
    1. Remove the feature if most of its values are missing
    2. Otherwise impute the gaps (refer to Handling Missing Data)
  2. Drop variables with zero variance, since a constant column carries no information (see the sketch after this list)
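
A minimal sketch of both data-driven filters, assuming the candidate features sit in a pandas DataFrame; the column names, toy values, and the 50% missing-value cutoff are illustrative assumptions, not part of these notes:

```python
import pandas as pd

# Toy data: the column names and values are made up for illustration.
df = pd.DataFrame({
    "age":    [25, 32, None, 41, 29],
    "income": [50_000, None, None, None, 62_000],
    "const":  [1, 1, 1, 1, 1],            # zero-variance column
})

# 1. Percentage of missing values per feature; drop mostly-missing columns.
missing_ratio = df.isna().mean()
df = df.drop(columns=missing_ratio[missing_ratio > 0.5].index)

# 2. Zero variance: a column with a single distinct value carries no signal.
constant_cols = df.columns[df.nunique(dropna=True) <= 1]
df = df.drop(columns=constant_cols)

print(df.columns.tolist())   # -> ['age'] with these toy values
```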

Depending on Redundancy:

  1. Pairwise correlation: when two features are highly correlated, keep only one of them (refer to Pearson Correlation)
  2. Multicollinearity checks (e.g., variance inflation factor) to remove features that are jointly correlated
  3. Use Principal Component Analysis (PCA) to reduce the feature set by projecting it onto linear combinations of the original features (see the sketch after this list)
    1. good for removing redundancy among correlated features
    2. bad for interpretability, since each component mixes many original features
  4. Use cluster analysis to find out which features are related (refer to Hierarchical Clustering)
    1. good if the dataset has multicollinearity
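
A rough sketch of redundancy-based filtering with pandas and scikit-learn; the 0.9 correlation threshold, the toy columns, and the choice of 2 principal components are assumptions for illustration:

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Toy data: x2 is almost an exact copy of x1, so the pair is redundant.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
df = pd.DataFrame({
    "x1": x1,
    "x2": 2 * x1 + rng.normal(scale=0.01, size=200),
    "x3": rng.normal(size=200),
})

# Pairwise Pearson correlation: drop one feature from each highly correlated pair.
corr = df.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]
reduced = df.drop(columns=to_drop)            # drops "x2" here

# PCA: replace what is left with uncorrelated linear combinations of the features
# (removes redundancy cheaply, but the components are hard to interpret).
components = PCA(n_components=2).fit_transform(reduced)
print(reduced.columns.tolist(), components.shape)
```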

Greedy:

  1. Correlation with the target
  2. Forward Feature Selection
  3. Backward Feature Elimination (forward and backward variants are sketched after this list)
  4. Stepwise Selection
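
A sketch of greedy wrapper selection using scikit-learn's SequentialFeatureSelector; the linear-regression estimator, the diabetes toy dataset, and keeping 5 features are placeholder choices:

```python
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)

# Forward selection: start from no features and repeatedly add the one that
# improves the cross-validated score the most.
forward = SequentialFeatureSelector(
    LinearRegression(), n_features_to_select=5, direction="forward", cv=5
).fit(X, y)

# Backward elimination: start from all features and repeatedly drop the one
# whose removal hurts the score the least.
backward = SequentialFeatureSelector(
    LinearRegression(), n_features_to_select=5, direction="backward", cv=5
).fit(X, y)

print("forward keeps: ", forward.get_support())
print("backward keeps:", backward.get_support())
```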

Embedded Methods:

  1. Random Forest feature importances (see the sketch after this list)
  2. Feature selection using a Decision Tree
  3. L1 regularization (Lasso Regression)
  4. Elastic Net Regression
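
A sketch of embedded selection, assuming a regression task on the scikit-learn diabetes data; the Lasso alpha, the forest size, and the "top 5" cutoff are arbitrary illustration values:

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso

X, y = load_diabetes(return_X_y=True)

# L1 (Lasso): coefficients of unhelpful features shrink to zero,
# so SelectFromModel keeps only the features with non-zero weights.
lasso_selector = SelectFromModel(Lasso(alpha=0.1)).fit(X, y)
print("Lasso keeps:", lasso_selector.get_support())

# Random forest: rank features by impurity-based importance, keep the top ones.
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
top = np.argsort(forest.feature_importances_)[::-1][:5]
print("Top forest features:", top)
```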

Pros:

  1. Improved model performance
  2. Reduced overfitting
  3. Increased interpretability

Cons:

  1. Additional computational cost for the selection step itself (wrapper methods in particular retrain the model many times)

Related Notes