Summary
In this chapter, we fitted and interpreted multiple linear and logistic regression models. We learned how to calculate the RMSE and MAE metrics and checked how differently they respond to outliers. We generated model formulas and cross-validated them with the cvms package. To check whether our models perform better than random guessing or than always making the same prediction, we created baseline evaluations for both linear regression and binary classification tasks. When multiple metrics (such as the F1 score and balanced accuracy) disagree on the ranking of models, we learned to identify the set of nondominated models, also known as the Pareto front. Finally, we trained two random forest models and compared them to the best-performing linear and logistic regression models.
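As a quick refresher on why the two metrics respond so differently to outliers, here is a minimal sketch in base R (the rmse() and mae() helpers and the example values are illustrative, not the cvms implementations): squaring the residuals lets a single large error dominate the RMSE, while the MAE only grows linearly with it.

```r
# Root Mean Square Error and Mean Absolute Error from residuals
rmse <- function(targets, predictions) {
  sqrt(mean((targets - predictions)^2))
}

mae <- function(targets, predictions) {
  mean(abs(targets - predictions))
}

# Made-up predictions that are off by exactly 1 everywhere
targets     <- c(10, 12, 14, 16, 18)
predictions <- c(11, 11, 15, 15, 19)

rmse(targets, predictions)  # 1
mae(targets, predictions)   # 1

# The same predictions, except for one large outlier error (off by 10)
predictions_outlier <- c(11, 11, 15, 15, 28)

rmse(targets, predictions_outlier)  # ~4.56 - squaring amplifies the outlier
mae(targets, predictions_outlier)   # 2.8  - grows only linearly
```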
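Similarly, the idea of nondominated models can be made concrete with a short sketch. The pareto_front() helper and the results data frame below are hypothetical, assuming two higher-is-better metrics: a model is kept only when no other model matches or beats it on both metrics while being strictly better on at least one.

```r
# Keep the models that are not dominated by any other model
# on two higher-is-better metrics (e.g., F1 and balanced accuracy)
pareto_front <- function(results, metric_1, metric_2) {
  m1 <- results[[metric_1]]
  m2 <- results[[metric_2]]
  is_dominated <- vapply(seq_len(nrow(results)), function(i) {
    # Model i is dominated if some model is at least as good on
    # both metrics and strictly better on at least one
    any(m1 >= m1[i] & m2 >= m2[i] & (m1 > m1[i] | m2 > m2[i]))
  }, logical(1))
  results[!is_dominated, ]
}

# Hypothetical cross-validation results for four models
results <- data.frame(
  model             = c("glm_1", "glm_2", "glm_3", "glm_4"),
  F1                = c(0.80, 0.78, 0.85, 0.76),
  balanced_accuracy = c(0.75, 0.79, 0.72, 0.70)
)

pareto_front(results, "F1", "balanced_accuracy")
# glm_1, glm_2, and glm_3 form the Pareto front;
# glm_4 is dominated (worse than glm_1 on both metrics)
```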
In the next chapter, you will learn about unsupervised learning.