Feature Selection
While feature engineering rectifies quality and data issues, feature selection determines the right subset of features for improving the performance of the model. Feature selection techniques identify the features that contribute most to the model's predictive ability. Features with little importance add noise and can inhibit the model's ability to learn from the informative independent variables.
Feature selection offers benefits such as:
Reducing overfitting
Improving accuracy
Reducing the time to train the model
Univariate Feature Selection
A statistical test such as the chi-squared test is a popular method for selecting the features that have the strongest relationship with the dependent (target) variable. The chi-squared test applies to categorical features in classification problems, so to use it with a numerical feature, one must first convert the feature into a categorical one through discretization (for example, binning).
In its most general form, the chi-squared statistic is computed as

\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}

where O_i is the observed frequency in cell i of the feature-target contingency table and E_i is the expected frequency under independence (the corresponding row total times the column total, divided by the total number of observations).
This tests whether the observed frequencies deviate significantly from the frequencies expected if the feature and the target were independent; features with the largest chi-squared scores show the strongest dependence on the target and are the ones retained.
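As a concrete sketch of this idea, the example below uses scikit-learn's SelectKBest with the chi2 scoring function on the Iris dataset, first discretizing the numerical features as described above. The dataset and the parameter choices (n_bins=5, k=2) are illustrative assumptions, not recommendations.

```python
# A minimal sketch of univariate chi-squared feature selection,
# assuming scikit-learn is installed. Dataset and parameters
# (Iris, n_bins=5, k=2) are illustrative choices.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.preprocessing import KBinsDiscretizer

X, y = load_iris(return_X_y=True)

# The chi-squared test expects categorical (non-negative) inputs,
# so the numerical Iris features are discretized into ordinal bins.
discretizer = KBinsDiscretizer(n_bins=5, encode="ordinal", strategy="uniform")
X_binned = discretizer.fit_transform(X)

# Score each feature against the target and keep the top k.
selector = SelectKBest(score_func=chi2, k=2)
X_selected = selector.fit_transform(X_binned, y)

print("chi-squared score per feature:", selector.scores_)
print("selected feature indices:", selector.get_support(indices=True))
```

Note that scikit-learn's chi2 requires non-negative inputs and treats feature values as frequencies; one-hot encoding the bins (encode="onehot") is an alternative that maps more directly onto the contingency-table formula above.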