Eliminating features recursively in a classification model
RFE can also be a good choice for classification problems. We can use it to select features for a model of bachelor's degree completion. You may recall that we used exhaustive feature selection for that model earlier in this chapter. Let's see whether RFE gives us better accuracy or a model that is easier to train:
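Before we work with the NLS data, it may help to see the general shape of RFE on a classification problem. The following is only a minimal sketch under stated assumptions: the data is synthetic (generated with make_classification as a stand-in, not the NLS data), and the choices of 10 features, 5 features to keep, and the random forest settings are illustrative rather than the values used in this chapter.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption for illustration; not the NLS data)
X, y = make_classification(n_samples=500, n_features=10, n_informative=4,
                           n_redundant=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# RFE fits the estimator, drops the least important feature,
# and repeats until n_features_to_select features remain
rfc = RandomForestClassifier(n_estimators=50, max_depth=3, random_state=0)
rfe = RFE(estimator=rfc, n_features_to_select=5)
rfe.fit(X_train, y_train)

selected = rfe.support_   # boolean mask, True for each retained feature
ranking = rfe.ranking_    # rank 1 marks a retained feature

# Refit on the retained features only and score on the held-out data
X_train_sel = rfe.transform(X_train)
X_test_sel = rfe.transform(X_test)
rfc.fit(X_train_sel, y_train)
accuracy = accuracy_score(y_test, rfc.predict(X_test_sel))
```

The support_ and ranking_ attributes tell us which features survived the elimination rounds, and the transform method reduces both the training and testing data to those features so that we can evaluate the smaller model.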
- We import the same libraries we have been working with so far in this chapter:
import pandas as pd
from feature_engine.encoding import OneHotEncoder
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score
- Next, we create training and testing data from the NLS educational attainment data:
nls97compba = pd.read_csv("data/nls97compba.csv")
feature_cols = ['satverbal','satmath...