We have covered a lot of ground in this chapter and introduced many important machine learning concepts. The first step in tackling a supervised learning problem is to collect and preprocess the data, making sure that it is normalized and split into training and validation sets. We covered a range of algorithms for both classification and regression. Each example had two phases: training the algorithm, followed by inference, that is, using the trained model to make predictions from new input data. Whenever you try a new machine learning technique on your data, keep track of its performance on both the training and validation datasets. This serves two main purposes: it helps you diagnose underfitting or overfitting, and it indicates how well your model is likely to work on data it has not seen.
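The workflow summarized above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the data is synthetic, the helper names (`train_val_split`, `fit_normalizer`, `train_linear`, and so on) are hypothetical, and a closed-form one-variable linear regression stands in for whichever algorithm you actually use.

```python
import random


def train_val_split(xs, ys, val_fraction=0.2, seed=0):
    """Shuffle the indices and hold out a fraction of the data for validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    n_val = int(len(idx) * val_fraction)
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    return ([xs[i] for i in train_idx], [ys[i] for i in train_idx],
            [xs[i] for i in val_idx], [ys[i] for i in val_idx])


def fit_normalizer(xs):
    """Compute mean and standard deviation from the training inputs only."""
    mean = sum(xs) / len(xs)
    std = (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5
    return mean, std or 1.0  # guard against a zero standard deviation


def normalize(xs, mean, std):
    return [(x - mean) / std for x in xs]


def train_linear(xs, ys):
    """Training phase: closed-form least squares for y = w * x + b."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    w = sxy / sxx
    return w, my - w * mx


def predict(model, xs):
    """Inference phase: apply the trained model to new inputs."""
    w, b = model
    return [w * x + b for x in xs]


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Synthetic data (an assumption for this sketch): y = 3x + 2 plus Gaussian noise.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [3 * x + 2 + rng.gauss(0, 1) for x in xs]

# 1. Split, then normalize both sets using the training statistics.
x_tr, y_tr, x_va, y_va = train_val_split(xs, ys)
mean, std = fit_normalizer(x_tr)
x_tr_n, x_va_n = normalize(x_tr, mean, std), normalize(x_va, mean, std)

# 2. Train, then 3. evaluate on both sets.
model = train_linear(x_tr_n, y_tr)
train_mse = mse(y_tr, predict(model, x_tr_n))
val_mse = mse(y_va, predict(model, x_va_n))
print(f"train MSE: {train_mse:.3f}  validation MSE: {val_mse:.3f}")
```

Comparing the two printed errors is the diagnostic step: if both are high, the model is probably underfitting, while a training error far below the validation error suggests overfitting. Note that the normalization statistics are computed from the training set alone, so no information from the validation set influences training.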
It is usually best to choose the simplest model that...