Selecting features for classification models
The most straightforward feature selection methods are based on each feature's relationship with a target variable. The next two sections examine techniques for determining the k best features based on their linear or non-linear relationship with the target. These are known as filter methods. They are also sometimes called univariate methods, since they evaluate the relationship between each feature and the target independently of the other features.
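To make the idea of a filter method concrete, here is a minimal sketch that scores each feature on its own against the target and keeps the k highest-scoring ones. The helper names (`class_mean_gap`, `select_k_best`), the toy data, and the scoring rule are all assumptions made for illustration. The score used here is a deliberately simple stand-in, not one of the statistics discussed in this chapter.

```python
from statistics import mean


def class_mean_gap(values, target):
    """Toy univariate score (hypothetical): absolute gap between the
    feature's mean in class 1 and its mean in class 0."""
    ones = [v for v, t in zip(values, target) if t == 1]
    zeros = [v for v, t in zip(values, target) if t == 0]
    return abs(mean(ones) - mean(zeros))


def select_k_best(features, target, score_fn, k):
    """Score every feature independently and return the k best names
    along with all of the scores."""
    scores = {name: score_fn(values, target) for name, values in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Made-up data: three features and a binary target.
target = [0, 0, 0, 1, 1, 1]
features = {
    "strong": [1.0, 1.2, 0.9, 3.1, 2.9, 3.0],
    "weak": [2.0, 2.1, 1.9, 2.2, 2.0, 2.1],
    "noise": [5.0, 1.0, 3.0, 3.0, 1.0, 5.0],
}

selected, scores = select_k_best(features, target, class_mean_gap, k=2)
print(selected)
```

Because each score is computed without reference to the other features, two highly correlated features can both be selected. That is the trade-off of a univariate approach. This toy score also depends on the scale of each feature, which is one reason the statistics used in practice are preferable.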
We use somewhat different strategies when the target is categorical than when it is continuous. We'll go over the former in this section and the latter in the next.
Mutual information classification for feature selection with a categorical target
We can use mutual information classification or analysis of variance (ANOVA) tests to select features when we have a categorical target. We will try mutual information classification first, and then ANOVA for comparison.
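As a rough sketch of what this looks like in practice, the following uses scikit-learn's `SelectKBest` with `mutual_info_classif` as the scoring function. The synthetic data from `make_classification` and the choice of `k=3` are assumptions made for illustration only, not the data or settings used in this section.

```python
from functools import partial

from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif

# Synthetic stand-in data (an assumption): 10 features, 3 of them informative.
X, y = make_classification(
    n_samples=500,
    n_features=10,
    n_informative=3,
    n_redundant=0,
    random_state=0,
)

# mutual_info_classif adds a small amount of random noise to continuous
# features, so we fix random_state to make the scores reproducible.
score_func = partial(mutual_info_classif, random_state=0)

# Keep the 3 features with the highest mutual information with the target.
selector = SelectKBest(score_func=score_func, k=3)
X_selected = selector.fit_transform(X, y)

# Column indices of the retained features and the score of every feature.
selected_idx = selector.get_support(indices=True)
print(selected_idx)
print(selector.scores_.round(3))
```

To compare against ANOVA, `f_classif` from the same module can be passed as `score_func` in place of the mutual information scorer. The rest of the code stays the same.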
...