Feature selection on L1 norms

We're going to work with some ideas that are similar to those we saw in the recipe on LASSO regression. In that recipe, we looked at the number of features that had zero coefficients. Now we're going to take this a step further and use the sparseness associated with L1 norms to preprocess the features.
Getting ready
We'll use the diabetes dataset to fit a regression. First, we'll fit a basic linear regression model with ShuffleSplit cross-validation. After we do that, we'll use LASSO regression to find the coefficients that are zero when using an L1 penalty. Hopefully, this will help us avoid overfitting (when the model is too specific to the data it was trained on).
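The steps just described can be sketched as follows. This is a minimal illustration, not the recipe's exact code; the `alpha` value and split settings are assumptions chosen for demonstration:

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression, Lasso
from sklearn.model_selection import ShuffleSplit, cross_val_score

# Load the diabetes dataset (10 standardized features)
X, y = load_diabetes(return_X_y=True)

# Baseline: ordinary linear regression scored with ShuffleSplit CV
cv = ShuffleSplit(n_splits=10, test_size=0.25, random_state=0)
lr = LinearRegression()
scores = cross_val_score(lr, X, y, cv=cv)
print("Mean R^2, linear regression:", scores.mean())

# LASSO: the L1 penalty drives some coefficients exactly to zero,
# effectively selecting a subset of the features
lasso = Lasso()  # alpha=1.0 by default; an assumed setting here
lasso.fit(X, y)
zeroed = np.sum(lasso.coef_ == 0)
print("Features zeroed out by the L1 penalty:", zeroed)
```

The features whose coefficients survive can then be kept for downstream models, which is the preprocessing idea this recipe builds on.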