In k-fold cross-validation, we essentially repeat the holdout method k times. We partition the dataset into k equal-sized subsamples. Of these k subsamples, a single subsample is retained as the validation data for testing the model, and the remaining k−1 subsamples are used as training data. This cross-validation process is then repeated k times, with each of the k subsamples used exactly once as the validation data. The k results can then be averaged to produce a single estimate.
The following screenshot shows a visual example of 5-fold cross-validation (k=5):
Here, we see that our dataset gets divided into five parts. In the first iteration, we use the first part for validation (testing) and the remaining four parts for training.
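To make this concrete, here is a minimal sketch of 5-fold cross-validation using scikit-learn's KFold. The dataset, model choice (LogisticRegression), and accuracy metric are illustrative assumptions, not taken from the original example:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold

# Illustrative dataset and model (assumptions for this sketch)
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# Partition the dataset into k=5 equal-sized subsamples
kf = KFold(n_splits=5, shuffle=True, random_state=42)
scores = []

for train_idx, val_idx in kf.split(X):
    # k-1 folds for training, the remaining fold for validation
    model.fit(X[train_idx], y[train_idx])
    preds = model.predict(X[val_idx])
    scores.append(accuracy_score(y[val_idx], preds))

# Average the k results to produce a single estimate
print(f"Per-fold accuracy: {np.round(scores, 3)}")
print(f"Mean accuracy: {np.mean(scores):.3f}")
```

Each fold serves as the validation set exactly once, and the mean of the five scores gives the single estimate described above.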
The following are the steps we follow in the 5-fold cross-validation method:
- We get the first estimate of our evaluation metrics...