Cross-validation is a method of evaluating the generalization performance of a model that is generally more stable and thorough than splitting the dataset into training and test sets.
The most commonly used version of cross-validation is k-fold cross-validation, where k is a number specified by the user (usually five or ten). Here, the dataset is partitioned into k parts of roughly equal size, called folds. For a dataset that contains N data points, each fold should thus have approximately N / k samples. Then a series of models is trained on the data, using k - 1 folds for training and the one remaining fold for testing. The procedure is repeated for k iterations, each time choosing a different fold for testing, until every fold has served as the test set exactly once.
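The procedure described above can be sketched with scikit-learn (an assumption here; the text does not name a specific library). The `cross_val_score` helper performs the splitting, training, and evaluation in one call; the dataset and model below are chosen only for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

iris = load_iris()
model = LogisticRegression(max_iter=1000)

# cv=5 partitions the data into five folds; each model is trained
# on four folds and evaluated on the held-out fold, once per fold.
scores = cross_val_score(model, iris.data, iris.target, cv=5)
print(scores)          # one accuracy score per fold
print(scores.mean())   # averaged estimate of generalization performance
```

Reporting the mean of the per-fold scores, as in the last line, is the usual way to summarize the k results into a single estimate.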
An example of two-fold cross-validation is shown in the following figure: