In order to better explain the curse of dimensionality and the problem of overfitting, we are going to go through an example in which we have a set of images. Each image has a cat or a dog in it. So, we would like to build a model that can distinguish between the images with cats and the ones with dogs. Like the fish recognition system in Chapter 1, Data science - Bird's-eye view, we need to find an explanatory feature that the learning algorithm can use to distinguish between the two classes (cats and dogs). In this example, we can argue that color is a good descriptor to be used to differentiate between cats and dogs. So the average red, average blue, and average green colors can be used as explanatory features to distinguish between the two classes.
The algorithm will then combine these three features in some way to form a decision boundary...