Performing forward-chaining cross-validation
Forward-chaining cross-validation, also called rolling-origin cross-validation, is similar to k-fold cross-validation but is better suited to sequential data such as time series. Unlike k-fold, the data is never randomly shuffled, although a test set may still be set aside. That test set must be the final portion of the data, so if each fold is 10% of your data (as it would be in 10-fold cross-validation), then your test set will be the final 10% of your date range.
With the remaining data, you choose an initial number of folds to train on (five in this example), evaluate on the sixth fold, and save that performance metric. You then retrain on the first six folds and evaluate on the seventh. Because each evaluation fold always comes after the data used for training, the model is never scored on data that precedes what it has already seen. You repeat this until all folds are exhausted and again take the average of your performance metric. The folds using this technique would look like this:
Figure 12.4 – Forward-chaining...
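The procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a definitive implementation: the function names `forward_chaining_splits` and `mean_forecast_mae`, the toy `series`, and the choice of mean absolute error are all assumptions made for the example. The "model" simply predicts the training mean, standing in for whatever forecasting model you are actually evaluating.

```python
from statistics import mean


def forward_chaining_splits(n_samples, n_folds=10, initial_train_folds=5,
                            holdout_folds=1):
    """Yield (train_indices, validation_indices) pairs in time order.

    The final `holdout_folds` folds are reserved as the test set and are
    never yielded here. All names and defaults are illustrative.
    """
    fold_size = n_samples // n_folds
    cv_folds = n_folds - holdout_folds  # folds available for validation
    for k in range(initial_train_folds, cv_folds):
        train_end = k * fold_size        # train on folds 0 .. k-1
        val_end = (k + 1) * fold_size    # validate on fold k
        yield range(0, train_end), range(train_end, val_end)


def mean_forecast_mae(train, validation):
    """Hypothetical stand-in model: predict the training mean everywhere."""
    prediction = mean(train)
    return mean(abs(y - prediction) for y in validation)


# Toy, time-ordered data (an assumption for the example); no shuffling.
series = list(range(100))

scores = []
for train_idx, val_idx in forward_chaining_splits(len(series)):
    train = [series[i] for i in train_idx]
    validation = [series[i] for i in val_idx]
    scores.append(mean_forecast_mae(train, validation))

# Average the per-fold metric, as in k-fold cross-validation.
cv_score = mean(scores)
```

With ten folds, five initial training folds, and one held-out test fold, this yields four train/validation splits: the training window grows by one fold each time, and the final 10% of the series is left untouched for the test set.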