Preparing time series data for supervised learning
In supervised ML, you must specify the independent variables (predictor variables) and the dependent variable (target variable). For example, in scikit-learn, you use the fit(X, y) method to fit a model, where X refers to the independent variables and y to the target variable.
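A time series is usually a single column of values, so X and y have to be derived from it before calling fit(X, y). The following is a minimal sketch of one common approach, using lagged copies of the series as predictors. The toy values, the choice of two lags, the column names (lag_1, lag_2), and the use of LinearRegression are all assumptions made for illustration and are not prescribed by this recipe.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical toy series: 10 made-up daily observations
idx = pd.date_range("2024-01-01", periods=10, freq="D")
series = pd.Series(
    [10.0, 12.0, 11.0, 13.0, 15.0, 14.0, 16.0, 18.0, 17.0, 19.0],
    index=idx,
    name="y",
)

# Assumption: use the two previous observations as predictors
n_lags = 2
frame = pd.DataFrame({"y": series})
for lag in range(1, n_lags + 1):
    # shift(lag) moves values forward in time, so each row sees past values only
    frame[f"lag_{lag}"] = series.shift(lag)

# The first n_lags rows have no complete history, so drop them
frame = frame.dropna()

X = frame[[f"lag_{lag}" for lag in range(1, n_lags + 1)]]  # independent variables
y = frame["y"]                                             # target variable

model = LinearRegression()
model.fit(X, y)
```

Each row of X holds the observations that precede the corresponding value in y, so the model only ever sees past values when predicting the current one.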
Generally, preparing time series data is similar to what you have done in previous chapters. However, some additional steps are specific to supervised ML, and those are the focus of this recipe. The following highlights the overall steps:
- Inspect your time series data to ensure there are no significant gaps, such as missing observations. If there are gaps, evaluate their impact and consider the imputation and interpolation techniques discussed in Chapter 7, Handling Missing Data.
- Understand any stationarity assumptions in the algorithm before fitting the model. If stationarity is an assumption before training, then transform...