Summary
This chapter explained the two kinds of task that supervised learning algorithms can solve: classification and regression. Both aim to approximate a function that maps a set of features to an output. They differ in the kind of output: a classification task predicts one of a finite set of discrete classes, while a regression task predicts a continuous value that can take infinitely many possible values.
When developing machine learning models to solve supervised learning problems, one of the main goals is for the model to generalize, so that it performs well on future unseen data rather than learning the training instances very well and performing poorly on new data. Accordingly, this chapter explained a methodology for validation and testing that splits the data into three sets: a training set, a dev set, and a test set. This approach does not eliminate bias altogether, but it reduces the risk of an overly optimistic performance estimate, because the test set plays no part in training or in choosing between models.
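The three-way split can be illustrated with a minimal sketch that uses only the Python standard library. The function name `three_way_split`, the 60/20/20 proportions, and the fixed seed are assumptions made for illustration; they are not prescribed by this chapter, and in practice a library helper may be more convenient.

```python
import random


def three_way_split(rows, dev_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle rows and split them into train, dev and test lists.

    Hypothetical helper: the fractions and seed are illustrative defaults.
    """
    if not 0 < dev_frac + test_frac < 1:
        raise ValueError("dev_frac + test_frac must be between 0 and 1")

    # Copy before shuffling so the caller's data is left untouched.
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)

    n_test = int(len(shuffled) * test_frac)
    n_dev = int(len(shuffled) * dev_frac)

    # Slice three disjoint pieces: test first, then dev, the rest is train.
    test = shuffled[:n_test]
    dev = shuffled[n_test:n_test + n_dev]
    train = shuffled[n_test + n_dev:]
    return train, dev, test


data = list(range(100))
train, dev, test = three_way_split(data)
print(len(train), len(dev), len(test))
```

The training set is used to fit the model, the dev set to compare candidate models or hyperparameters, and the test set only once at the end, so that the final score is measured on data that influenced none of the earlier decisions.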
After this, we covered how to evaluate the performance of a model for both classification and regression...