Summary
In this chapter, we introduced the final details of logistic regression and continued to learn how to use scikit-learn to fit logistic regression models. We gained more visibility into the model fitting process by learning about the concept of a cost function, which the gradient descent procedure minimizes in order to estimate the model parameters.
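To make this concrete, here is a minimal NumPy sketch of gradient descent on the logistic regression cost function. This is an illustration of the idea only, not the optimizer scikit-learn actually uses; the synthetic data, learning rate, and iteration count are arbitrary choices for the example:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    # The logistic regression cost function: average negative log-likelihood
    p = sigmoid(X @ theta)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def gradient_descent(X, y, learning_rate=0.1, n_iter=1000):
    # Start from all-zero parameters and repeatedly step
    # in the direction that decreases the cost function
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (sigmoid(X @ theta) - y) / len(y)
        theta -= learning_rate * grad
    return theta

# Tiny synthetic example (hypothetical data, for illustration only)
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=200)])  # intercept + 1 feature
y = (X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(float)

theta = gradient_descent(X, y)
print('Estimated parameters:', theta)
print('Final cost:', cost(theta, X, y))
```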
We also saw the need for regularization by introducing the concepts of underfitting and overfitting. To reduce overfitting, we saw how to adjust the cost function so that it regularizes the coefficients of a logistic regression model with an L1 or L2 penalty, and we used cross-validation to tune the regularization hyperparameter and thereby select the amount of regularization. To reduce underfitting, we saw how to do some simple feature engineering with interaction features for the case study data.
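The sketch below pulls these pieces together in scikit-learn. It is one possible arrangement, not the chapter's exact code: make_classification stands in for the case study data, and the grid of C values is an arbitrary example. PolynomialFeatures with interaction_only=True creates the interaction features, and LogisticRegressionCV selects the regularization strength C by cross-validation:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Hypothetical stand-in for the case study data
X, y = make_classification(n_samples=500, n_features=6, random_state=1)

# Interaction features (pairwise products) can reduce underfitting;
# scaling keeps the penalty applied evenly across coefficients.
# An L1 penalty could be used instead by passing penalty='l1' with
# a compatible solver such as 'saga'.
model = make_pipeline(
    PolynomialFeatures(interaction_only=True, include_bias=False),
    StandardScaler(),
    LogisticRegressionCV(Cs=np.logspace(-3, 3, 7), penalty='l2', cv=4),
)
model.fit(X, y)
print('Selected C:', model[-1].C_[0])
```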
We are now familiar with some of the most important concepts in machine learning. We have, so far, only used...