A note on backtesting
The peculiarities of choosing training and testing sets are especially important in both systematic investing and algorithmic trading. The main way to test trading algorithms is a process called backtesting.
Backtesting means we train the algorithm on data from a certain time period and then test its performance on older data. For example, we could train on data from 2015 to 2018 and then test on data from 1990 to 2014, so that the two periods do not overlap. This tests more than the model's accuracy: the backtested algorithm also executes virtual trades, so its profitability can be evaluated. Backtesting is popular because plenty of past data is available.
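The train-then-test procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the prices are a synthetic random walk rather than real market data, the "model" is a simple momentum rule with a single tunable lookback, and the names make_prices, run_virtual_trades, and fit_lookback are hypothetical helpers invented for this example. Transaction costs and slippage are ignored.

```python
import random


def make_prices(n_days, seed=0):
    """Synthetic random-walk prices; a stand-in for real market data."""
    rng = random.Random(seed)
    prices = [100.0]
    for _ in range(n_days - 1):
        prices.append(prices[-1] * (1.0 + rng.gauss(0.0003, 0.01)))
    return prices


def run_virtual_trades(prices, lookback):
    """Simulate a long-only momentum rule and return the final equity multiple.

    On day t the rule only looks at prices[t - lookback] and prices[t].
    The position chosen on day t earns the return from day t to day t + 1.
    """
    equity = 1.0
    for t in range(lookback, len(prices) - 1):
        in_market = prices[t] > prices[t - lookback]
        if in_market:
            equity *= prices[t + 1] / prices[t]
    return equity


def fit_lookback(train_prices, candidates=(5, 10, 20, 50)):
    """'Training' step: keep the lookback that did best on the training window."""
    return max(candidates, key=lambda lb: run_virtual_trades(train_prices, lb))


prices = make_prices(2000)

# Assumed split mirroring the text: fit on the most recent window,
# then evaluate on the older window that precedes it.
test_prices, train_prices = prices[:1500], prices[1500:]

best_lookback = fit_lookback(train_prices)
backtest_equity = run_virtual_trades(test_prices, best_lookback)
print(f"lookback={best_lookback}, backtest equity multiple={backtest_equity:.3f}")
```

The equity multiple returned by run_virtual_trades is the profitability measure here: a value above 1.0 means the virtual trades made money over the test window. With real data, the synthetic series would be replaced by historical prices and the split would be made on dates rather than on list indices.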
That said, backtesting suffers from several biases. Let's take a look at four of the most important ones that we need to be aware of:
Look-ahead bias: This is introduced if future data is accidentally included at a point in the simulation where that data would not have been available yet. This can be caused...