Using regularization and other embedded methods
Regularization methods are embedded methods. Like wrapper methods, embedded methods evaluate features relative to a specific algorithm, but they are far less computationally expensive: feature selection is built into the algorithm itself, so it happens as the model is being trained.
Embedded methods use the following process:
- Train a model.
- Estimate each feature's importance to the model's predictions.
- Remove features with low importance.
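The steps above can be sketched as follows. This is a minimal illustration on synthetic data; the text names no particular model, so a random forest stands in here because it exposes per-feature importance scores, and the mean-importance cutoff is an arbitrary choice for the example:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # only features 0 and 1 carry signal

# 1. Train a model.
model = RandomForestClassifier(random_state=0).fit(X, y)

# 2. Estimate each feature's importance to the model's predictions.
importances = model.feature_importances_

# 3. Remove features with low importance (here: below the mean importance).
keep = importances > importances.mean()
X_selected = X[:, keep]
```

In practice the cutoff is a tuning decision; scikit-learn's `SelectFromModel` wraps this same train-score-drop loop behind a `threshold` parameter.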
Regularization accomplishes this by adding a penalty term to the model's loss function that constrains the size of its parameters. L1 regularization, also referred to as lasso regularization, shrinks some of the coefficients in a regression model to exactly 0, effectively eliminating those features.
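To see the shrinkage concretely, here is a small sketch on synthetic data (the data and the penalty strength `alpha=0.5` are assumptions for illustration): with a strong enough L1 penalty, the coefficients of the uninformative features are driven to exactly 0.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3 * X[:, 0] + rng.normal(scale=0.1, size=200)  # only feature 0 matters

# L1-penalized linear regression; alpha controls the penalty strength
lasso = Lasso(alpha=0.5).fit(X, y)
print(lasso.coef_)  # coefficients for features 1-3 are exactly 0
```

Note that the surviving coefficient is also shrunk toward 0 relative to the true value of 3; that bias is the price of the sparsity.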
Using L1 regularization
We will use L1 regularization with logistic regression to select features for a bachelor's degree attainment model. We first need to import the required libraries...
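A hedged sketch of this workflow, assuming synthetic data in place of the degree-attainment dataset the text goes on to use: an L1-penalized `LogisticRegression` is fit, and `SelectFromModel` keeps the features whose coefficients the penalty did not shrink to 0.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import SelectFromModel

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))
y = (X[:, 0] - 2 * X[:, 3] > 0).astype(int)  # features 0 and 3 are informative

# penalty="l1" requires a solver that supports it, e.g. liblinear;
# smaller C means a stronger penalty and therefore fewer surviving features
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
selector = SelectFromModel(clf).fit(X, y)

print(selector.get_support())      # boolean mask of kept features
X_selected = selector.transform(X)
```

Tuning `C` trades off sparsity against fit; cross-validating it is the usual way to pick how aggressively features are pruned.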