Summary
In this chapter, we discussed some fundamental concepts shared by almost any machine learning model.
In the first part, we introduced the data generating process as a generalization of a finite dataset, and we discussed the structure and properties of a good dataset. We covered some common preprocessing strategies and their properties, such as scaling, normalizing, and whitening. We then explained the most common strategies for splitting a finite dataset into a training set and a validation set, and we introduced cross-validation, together with its most important variants, as one of the best approaches for avoiding the limitations of a static split.
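As a quick refresher, the following sketch (not taken from the chapter; it assumes scikit-learn and a synthetic dataset) shows how a static train/validation split, standard scaling, PCA-based whitening, and k-fold cross-validation can be combined in practice:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic dataset standing in for samples drawn from a data generating process
X, y = make_classification(n_samples=500, n_features=20, random_state=1000)

# Static split: one training set and one validation set
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=1000)

# Preprocessing: standard scaling (zero mean, unit variance per feature)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_val_scaled = scaler.transform(X_val)  # reuse the training statistics

# Preprocessing: whitening via PCA (decorrelated components with unit variance)
whitener = PCA(whiten=True, random_state=1000)
X_train_white = whitener.fit_transform(X_train)

# k-fold cross-validation avoids the limitations of a single static split
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
cv = KFold(n_splits=10, shuffle=True, random_state=1000)
scores = cross_val_score(model, X, y, cv=cv)
print(f"10-fold CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Note that the scaler and whitener are fit on the training data only, so the validation folds remain a fair proxy for unseen samples from the data generating process.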
In the second part, we discussed the main features of a machine learning model and the concept of learnability. We examined the main properties of an estimator: capacity, bias, and variance. We also introduced the Vapnik-Chervonenkis theory, a mathematical formalization of the concept of representational capacity, and we analyzed the effects of high bias and high variance. In particular, we discussed the phenomena known as underfitting and overfitting, relating them to high bias and high variance respectively.
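A minimal sketch (again an assumption on our part, using scikit-learn and a synthetic quadratic dataset) can make this relationship concrete: too little capacity yields underfitting (poor training and validation scores, high bias), while too much capacity yields overfitting (excellent training score, degraded cross-validation score, high variance):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Noisy samples from a quadratic data generating process
rng = np.random.RandomState(1000)
X = np.linspace(-3.0, 3.0, 60).reshape(-1, 1)
y = 0.5 * X.ravel() ** 2 - X.ravel() + rng.normal(0.0, 0.5, size=60)

cv = KFold(n_splits=5, shuffle=True, random_state=1000)

# Increasing the polynomial degree increases the capacity of the estimator
for degree in (1, 2, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    train_score = model.fit(X, y).score(X, y)
    cv_score = cross_val_score(model, X, y, cv=cv).mean()
    print(f"degree={degree:2d}  train R^2={train_score:.3f}  CV R^2={cv_score:.3f}")
```

Degree 1 underfits (both scores are low), degree 2 matches the underlying process, and degree 15 tends to show a large gap between the training and cross-validation scores, the typical signature of overfitting.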
In the next chapter, Chapter 2, Loss Functions and Regularization, we're going to introduce loss and cost functions, which provide a simple and effective tool for fitting machine learning models by minimizing an error measure or maximizing a specific objective.