So far, we have used an MLP to develop a time series forecasting model. To predict the series at time t, we fed the MLP an input vector [xt-1, ... , xt-p] of past observations, which the MLP treats as uncorrelated independent variables. One problem with this kind of model is that it ignores the sequential nature of time series data, in which observations are correlated with one another. This correlation can also be interpreted as the memory that the series carries over itself. In this section, we will discuss recurrent neural networks (RNNs), which are architecturally different from MLPs and better suited to fitting sequential data.
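To make the architectural difference concrete, the following is a minimal sketch of a single (Elman-style) recurrent cell producing a one-step-ahead forecast. The weights here are random and untrained, purely for illustration; the point is that the hidden state is updated one time step at a time, so the forecast depends on the order of the past observations, whereas an MLP treats the same p lagged values as an unordered set of inputs.

```python
import numpy as np

def rnn_forecast(x, W_xh, W_hh, W_hy, b_h):
    """One-step-ahead forecast from a simple (Elman) RNN cell.

    The hidden state h carries the "memory" of the series: it is
    updated at every time step, so the forecast depends on the ORDER
    of the past observations, unlike an MLP.
    """
    h = np.zeros(W_hh.shape[0])
    for obs in x:                      # feed the lagged values one at a time
        h = np.tanh(W_xh * obs + W_hh @ h + b_h)
    return float(W_hy @ h)             # linear read-out: the forecast of xt

# Hypothetical (untrained) weights, for illustration only.
rng = np.random.default_rng(0)
hidden, p = 8, 5
W_xh = rng.standard_normal(hidden) * 0.5            # input -> hidden
W_hh = rng.standard_normal((hidden, hidden)) * 0.5  # hidden -> hidden (recurrence)
W_hy = rng.standard_normal(hidden) * 0.5            # hidden -> output
b_h = np.zeros(hidden)

x = rng.standard_normal(p)             # past window [xt-p, ... , xt-1]
forecast = rnn_forecast(x, W_xh, W_hh, W_hy, b_h)
reversed_forecast = rnn_forecast(x[::-1], W_xh, W_hh, W_hy, b_h)
print(forecast, reversed_forecast)     # generally differ: order matters
```

Note that feeding the same window in reverse order produces a different forecast, which is exactly the sequential sensitivity an MLP with lagged inputs lacks.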
RNNs emerged as a good choice for developing language models, which model the probability of a word occurring given the words that precede it. So far, RNNs have...