Summary
In this chapter we learned the basics of regression. We mainly covered linear regression, which is one of the simpler models to use and interpret. The assumptions of linear regression were discussed: a linear relationship between the features and the target, normally distributed residuals, no multicollinearity among the features, no autocorrelation of the residuals, and homoscedasticity (a uniform spread of residuals across target values). We saw how regularization can be used with linear regression to select features with L1 (Lasso) regularization, since it drives some coefficients to exactly 0. We also saw how to apply L2 (Ridge) regularization, as well as a combination of L1 and L2 with ElasticNet.
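As a minimal sketch of this workflow (not the chapter's exact code; the synthetic dataset and alpha values here are illustrative assumptions), we might compare the three penalties like so:

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, Ridge

# Synthetic data where only 4 of 10 features are informative (an
# assumption for illustration, not the chapter's dataset)
X, y = make_regression(n_samples=200, n_features=10, n_informative=4,
                       noise=10.0, random_state=42)

lasso = Lasso(alpha=1.0).fit(X, y)   # L1: drives some coefficients to exactly 0
ridge = Ridge(alpha=1.0).fit(X, y)   # L2: shrinks coefficients toward 0
enet = ElasticNet(alpha=1.0, l1_ratio=0.5).fit(X, y)  # blend of L1 and L2

# Features whose Lasso coefficients land at exactly 0 are candidates to drop
print("Lasso:", np.round(lasso.coef_, 2))
print("Ridge:", np.round(ridge.coef_, 2))
print("ElasticNet:", np.round(enet.coef_, 2))

With a large enough alpha, the Lasso coefficients for the uninformative features end up at exactly 0, which is what makes L1 regularization useful for feature selection; Ridge only shrinks them toward 0.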
We saw how other sklearn models can be used for regression as well and demonstrated this with the KNN model.
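A minimal sketch of KNN regression, again assuming a synthetic dataset rather than the chapter's data:

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Each prediction averages the targets of the 5 nearest training points
knn = KNeighborsRegressor(n_neighbors=5).fit(X_train, y_train)
print("Test R^2:", knn.score(X_test, y_test))  # score() returns R^2 for regressors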
Additionally, the statsmodels package was used for linear regression so that we could obtain p-values for the statistical significance of our coefficients. Metrics for evaluating...
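A minimal sketch of obtaining coefficient p-values from statsmodels, assuming a small synthetic dataset where only the first two features affect the target:

import numpy as np
import statsmodels.api as sm

# Synthetic data: y depends on the first two features only (illustrative)
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.5, size=200)

X_const = sm.add_constant(X)   # statsmodels does not add an intercept by default
results = sm.OLS(y, X_const).fit()
print(results.pvalues)         # p-value for the intercept and each coefficient
print(results.summary())       # full table with coefficients and significance

Here we would expect small p-values for the first two coefficients and a large one for the third, since the third feature has no effect on y in this setup.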