Predictive accuracy measures
In the previous example, it is fairly easy to see that the order 0 model is very simple and the order 5 model is too complex, but what about the other two? How can we distinguish between those options? We need a more principled way of balancing accuracy on one side against simplicity on the other. Two methods to estimate the out-of-sample predictive accuracy using only the within-sample data are:
Cross-validation: This is an empirical strategy based on dividing the available data into subsets that are used alternately for fitting and for evaluation.
Information criteria: This is an umbrella term for several relatively simple expressions that can be regarded as approximations to the results we would obtain by performing cross-validation.
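To make the cross-validation idea concrete, the following is a minimal sketch rather than a definitive implementation. It assumes synthetic data with a quadratic trend, ordinary least-squares polynomial fits via NumPy, and mean squared error as the accuracy measure. None of these choices come from the text; the function name kfold_mse and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def kfold_mse(x, y, order, k=5, seed=0):
    """Estimate the out-of-sample mean squared error of a polynomial
    fit of the given order using k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))       # shuffle once so folds are random
    folds = np.array_split(idx, k)      # k disjoint, roughly equal subsets
    errors = []
    for i in range(k):
        test = folds[i]                 # held-out fold used for evaluation
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coefs = np.polyfit(x[train], y[train], order)  # fit on the other k-1 folds
        pred = np.polyval(coefs, x[test])              # predict the held-out points
        errors.append(np.mean((y[test] - pred) ** 2))
    return float(np.mean(errors))       # average error over the k folds

# Synthetic data (an assumption for illustration): quadratic trend plus noise
rng = np.random.default_rng(42)
x = np.linspace(-1, 1, 40)
y = 1.0 + 0.5 * x - 2.0 * x**2 + rng.normal(0, 0.3, size=x.size)

# Compare polynomial models of increasing order by their estimated
# out-of-sample error; lower is better.
for order in (0, 1, 2, 5):
    print(order, kfold_mse(x, y, order))
```

Each data point is used for evaluation exactly once and for fitting k - 1 times, so the averaged error estimates how each model would perform on data it has not seen, without requiring a separate dataset.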
Cross-validation
On average, a model's within-sample accuracy will be higher than its out-of-sample accuracy. As we need data to fit the model and data to test it, one simple...