Chapter 4: Deep Recurrent Model Architectures
Neural networks are powerful machine learning tools that are used to help us learn complex patterns between the inputs (X) and outputs (y) of a dataset. In the previous chapter, we discussed convolutional neural networks, which learn a one-to-one mapping between X and y; that is, each input, X, is independent of the other inputs and each output, y, is independent of the other outputs of the dataset.
In this chapter, we will discuss a class of neural networks that can model sequences where X (or y) is not just a single independent data point, but a temporal sequence of data points [X1, X2, .. Xt] (or [y1, y2, .. yt]). Note that X2 (which is the data point at time step 2) is dependent on X1, X3 is dependent on X2 and X1, and so on.
Such networks are classified as recurrent neural networks (RNNs). These networks are capable of modeling the temporal aspect of data by including additional weights in the model that create cycles in the...