Setting up a test harness
Before we start forecasting and setting up baselines, we need to set up a test harness. In software testing, a test harness is a collection of code and the inputs that have been configured to test a program under various situations. In terms of machine learning, a test harness is a set of code and data that can be used to evaluate algorithms. It is important to set up a test harness so that we can evaluate all future algorithms in a standard and quick way.
The first thing we need is holdout (test) and validation datasets.
Creating holdout (test) and validation datasets
As a standard practice, in machine learning, we set aside two parts of the dataset, name them validation data and test data, and don’t use them at all to train the model. The validation data is used in the modeling process to assess the quality of the model. To select between different model classes, tune the hyperparameters, perform feature selection, and so on, we need a dataset...