Exploring time series forecasting techniques
Within the data science domain, time series forecasting first and foremost means extending a KPI (or any measure of interest) into the future in the most accurate and least biased way possible. While this remains the primary goal of forecasting, the activity often does not boil down to just that: it is sometimes necessary to include an assessment of the uncertainty of the forecasted values and comparisons with previous forecasting benchmarks. There are essentially two approaches to time series forecasting, listed as follows:
- Qualitative forecasting methods are adopted when historical data is not available (for example, when estimating the revenues of a new company that clearly has no data yet). They are highly subjective methods; among the most important qualitative forecasting techniques is the Delphi method.
- Quantitative forecasting techniques are based on historical quantitative data: starting from this data, the analyst/data scientist tries to understand the underlying structure of the phenomenon of interest and then uses the same data for forecasting purposes. The analyst's task, therefore, is to identify, isolate, and measure the temporal dynamics behind a time series of past data in order to make optimal predictions and ultimately support decisions, planning, and business control. The quantitative approach to forecasting is certainly the most widely used, as it generates results that are typically more robust and more easily deployed into business processes. Therefore, from now on (including in the next chapters), we will focus exclusively on it.
In the following section, we will explore the details of quantitative forecasting, focusing on the basic requirements for carrying it out properly and the main quantitative techniques used in recent years.
Quantitative forecasting properties and techniques
First and foremost, the development of a quantitative forecasting model depends on the available data, both in terms of the amount of data and the quality of historical information. In general, we can say that there are two basic requirements for effectively creating a reliable quantitative forecasting model:
- Obtain an adequate number of observations, that is, a sufficient depth of historical data, in order to correctly understand the phenomenon under analysis, estimate the models, and then produce the predictions. Probably one of the most common questions asked by those facing the development of a forecasting model for the first time is how long the time series needs to be to obtain a reliable model, which, in simple terms, means how much past do I need? The answer is not simple. It would be incorrect to say that at least 50 observations are needed or that the depth should be at least 5 years. In fact, the number of data points to consider depends on the following:
- The complexity of the model to be developed and the number of parameters to be estimated.
- The amount of randomness in the data.
- The granularity of the data (such as monthly, daily, or hourly) and its characteristics. (Is it intermittent? Are there strong discontinuities to consider?)
- The presence of one or more seasonal components that need to be estimated in relation to the granularity of the data. For example, to include a weekly seasonal pattern in a model of hourly data, the seasonal period spans 7 × 24 = 168 observations, so covering even a few full cycles requires at least several hundred observations.
- Collect information about the "time dimension" of the time series in order to determine the starting/ending points of the data and the possible length of the seasonal components (if present).
Given a sufficient amount of historical data, the basis of a quantitative forecasting model is the assumption that the factors that influenced the dynamics of the series in the past will continue to produce similar effects in the future.
There are several criteria used to classify quantitative forecasting techniques. It is possible to consider the historical evolution of the methods (from the most classical to the most modern), how the methods use the information within the model, or even the domain of method development (purely statistical versus ML). Here, we present one possible classification of the techniques used for quantitative forecasting, which takes into account multiple relevant elements that characterize the different methods. We can consider these three main groups of methods as follows:
- Classical univariate forecasting methods: In these statistical techniques, forecasts are based solely on the past of the very time series to be forecast, through the identification of structural components, such as trend and seasonality, and the study of serial correlation. Some popular methods in this group are listed as follows:
- Classical decomposition: This considers the observed series as the overlap of three elementary components (trend-cycle, seasonality, and residual), connected with different patterns that are typically present in many economic time series; classical decomposition (like other types of decomposition) is a common way to explore and interpret the characteristics of a time series, but it can certainly also be used to produce forecasts (a minimal sketch appears right after this list). In Chapter 5, Time Series Components and Statistical Properties, we will delve deeper into this method.
- Exponential smoothing: Forecasts produced by exponential smoothing methods are based on weighted averages of past observations, with the weights decaying exponentially as the observations get older; this weighting scheme can also account for the overlap of components such as trend and seasonality (see the Holt-Winters sketch after this list).
- AutoRegressive Integrated Moving Average (ARIMA): Essentially, this is a regression-like approach that aims to model, as effectively as possible, the serial correlation among the observations of a time series. Several parameters in the model can also handle trend and seasonality, although less directly than decomposition or exponential smoothing (see the ARIMA/ARIMAX sketch after this list).
- Explanatory models: These techniques work in a multivariate fashion, so forecasts are based on both past observations of the reference time series and external predictors, which helps not only to achieve better accuracy but also to obtain a more extensive interpretation of the model. The most popular example in this group is the ARIMAX model (or regression with ARIMA errors), which is also covered in the sketch after this list.
- ML methods: These techniques can be either univariate or multivariate. However, their most distinctive feature is that they originated outside the statistical domain and were not specifically designed to analyze time series data; typically, they are artificial neural networks (such as multilayer perceptrons, long short-term memory (LSTM) networks, and dilated convolutional neural networks) or tree-based algorithms (such as random forests or gradient boosted trees) originally designed for cross-sectional data that can be adapted for time series forecasting (see the lag-features sketch after this list).
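To ground these descriptions, here is a minimal sketch of classical decomposition in Python. The use of statsmodels and the simulated monthly series are illustrative assumptions, not prescriptions from this chapter:

```python
# Minimal classical decomposition sketch (assumes statsmodels is installed).
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Simulated monthly series: linear trend + yearly seasonality + noise.
rng = np.random.default_rng(42)
idx = pd.date_range("2015-01-01", periods=72, freq="MS")  # 6 years
y = pd.Series(
    np.linspace(100, 160, 72)
    + 10 * np.sin(2 * np.pi * idx.month / 12)
    + rng.normal(0, 3, 72),
    index=idx,
)

# Additive decomposition: y_t = trend_t + seasonal_t + residual_t.
result = seasonal_decompose(y, model="additive", period=12)
print(result.seasonal.head(12))  # the estimated monthly seasonal pattern
```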
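Along the same lines, a minimal Holt-Winters exponential smoothing sketch, again assuming statsmodels and the same kind of simulated monthly data:

```python
# Minimal Holt-Winters sketch: additive trend and yearly seasonality.
import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

rng = np.random.default_rng(0)
idx = pd.date_range("2015-01-01", periods=72, freq="MS")
y = pd.Series(
    np.linspace(100, 160, 72)
    + 10 * np.sin(2 * np.pi * idx.month / 12)
    + rng.normal(0, 3, 72),
    index=idx,
)

fit = ExponentialSmoothing(
    y, trend="add", seasonal="add", seasonal_periods=12
).fit()
print(fit.forecast(12))  # point forecasts for the next 12 months
```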
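The next sketch covers both ARIMA and ARIMAX through statsmodels' SARIMAX class, which accepts an optional exog argument for external predictors; the order (1, 1, 1) and the promo regressor are illustrative assumptions only:

```python
# Minimal ARIMA/ARIMAX sketch via SARIMAX (assumes statsmodels).
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(1)
idx = pd.date_range("2018-01-01", periods=120, freq="MS")
promo = rng.integers(0, 2, 120)  # hypothetical external predictor
y = pd.Series(np.cumsum(rng.normal(0.5, 1.0, 120)) + 5 * promo, index=idx)
X = pd.DataFrame({"promo": promo}, index=idx)

# Plain ARIMA(1, 1, 1): models the serial correlation of y alone.
arima_fit = SARIMAX(y, order=(1, 1, 1)).fit(disp=False)

# ARIMAX (regression with ARIMA errors): same model plus a regressor.
arimax_fit = SARIMAX(y, exog=X, order=(1, 1, 1)).fit(disp=False)

# Forecasting an exogenous model requires future values of X.
future_X = pd.DataFrame(
    {"promo": [1, 0, 1]},
    index=pd.date_range("2028-01-01", periods=3, freq="MS"),
)
print(arimax_fit.forecast(3, exog=future_X))
```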
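Finally, a minimal sketch of the ML route: reframing the series as a supervised problem with lagged values as features so that a cross-sectional algorithm (here, scikit-learn's random forest, an illustrative choice) can be trained on it:

```python
# Minimal lag-features sketch: forecasting with a cross-sectional model.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
idx = pd.date_range("2016-01-03", periods=200, freq="W")
y = pd.Series(
    10 * np.sin(np.arange(200) / 8) + rng.normal(0, 1, 200), index=idx
)

# Predict y_t from its last four values y_{t-1}, ..., y_{t-4}.
df = pd.DataFrame({f"lag_{k}": y.shift(k) for k in range(1, 5)})
df["target"] = y
df = df.dropna()

# Time-ordered split: never shuffle when validating a forecaster.
train, test = df.iloc[:-20], df.iloc[-20:]
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(train.drop(columns="target"), train["target"])
print(model.predict(test.drop(columns="target"))[:5])
```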
A very common question asked by students and practitioners who are new to time series analysis (TSA) is whether there is one forecasting method that is better than the others. The answer (for now) is no. All of the models have their own pros and cons. In general, exponential smoothing, ARIMA, and the other classical methodologies have been around the longest. They are quite easy to implement and typically very reliable, but they require the verification of some assumptions, and sometimes, they are not as flexible as you would like them to be. In contrast, ML algorithms are really flexible (they impose few assumptions to check), but you commonly need a large amount of data to train them properly. Moreover, they can be more complicated (a lot of hyperparameters to tune), and to be effective, they need some additional time-based features to capture the time-related patterns within your data (a small feature-engineering sketch follows this paragraph).
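As a hint of what such time-based features can look like, here is a small sketch that derives calendar predictors from a DatetimeIndex (pandas is an assumed choice; the feature set is purely illustrative):

```python
# Minimal calendar feature-engineering sketch for ML forecasting.
import pandas as pd

idx = pd.date_range("2024-01-01", periods=7, freq="D")
features = pd.DataFrame(index=idx)
features["month"] = idx.month
features["day_of_week"] = idx.dayofweek  # 0 = Monday
features["is_weekend"] = (idx.dayofweek >= 5).astype(int)
features["week_of_year"] = idx.isocalendar().week.values
print(features)
```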
But what does the best forecasting model mean? Consider that it is never just a matter of the pure predictive performance of the model, as you need to consider other important factors in the model selection procedure. For instance, consider the following list:
- Forecast horizon in relation to TSA objectives: Are you going to predict the short term or the long term? For the same time series, you could have a model that is the best one for short-term forecasts, but you need to use another one for long-term forecasts.
- The type/amount of available data: In general, for small datasets, a classical forecasting method could be better than an ML approach.
- The required readability of the results: A classical model is typically more interpretable than an ML model.
- The number of series to forecast: Using classical methods with thousands of time series can be inefficient, so in this case, an ML approach could be better.
- Deployment-related issues: Also, consider the frequency of the delivery of the forecasts, the software environment, and the usage of the forecasts.
In summary, when facing the modeling part of your time series forecasting application, don’t just go with one algorithm. Try different approaches, considering your goals and the type/amount of data that you have.