Underfitting, overfitting, and cross-validation
What is cross-validation and why is it needed? To talk about cross-validation, we must formally introduce two other important concepts first: underfitting and overfitting.
To obtain a good model for either a regression problem or a classification problem, we must fit the model to the data. This fitting process is usually referred to as training. During training, the model captures characteristics of the data and encodes them as numerical rules, such as the parameters of a formula or expression.
Note
The training process establishes a mapping between the input data and the output we want (a class label in classification, a numerical value in regression). For example, when a baby learns how to distinguish an apple from a lemon, they may learn to associate the color of each fruit with its taste. They will then make the right decision to grab a sweet red apple rather than a sour yellow lemon.
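To make the idea of training as "establishing a mapping" concrete, here is a minimal sketch of fitting a straight line to a handful of points with ordinary least squares. The data values and the helper names fit_line and predict are hypothetical, chosen only for illustration. Simple linear regression is just one possible model, not the only way to train.

```python
# Hypothetical toy data (an assumption for illustration): inputs and outputs
# that roughly follow a straight line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]


def fit_line(xs, ys):
    """Train a line y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the shared 1/n factor cancels).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Training": the data determines the two numbers that define the rule.
slope, intercept = fit_line(xs, ys)


def predict(x):
    """Apply the learned mapping to a new input."""
    return slope * x + intercept


print(f"learned rule: y = {slope:.2f} * x + {intercept:.2f}")
print(f"prediction for x = 6.0: {predict(6.0):.2f}")
```

The two learned numbers, slope and intercept, are the "numerical rules" that training extracts from the data. Once they are fixed, predict maps any new input to an output without consulting the training data again.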
Everything we have discussed so far is about the training technique...