Deep Recurrent Model Architectures
Neural networks are powerful machine learning tools that are used to help us learn complex patterns between the inputs (X) and outputs (y) of a dataset. In Chapter 2, Deep CNN Architectures, we discussed convolutional neural networks, which learn a one-to-one mapping between X and y; that is, each input, X, is independent of the other inputs, and each output, y, is independent of the other outputs of the dataset.
In the previous chapter we combined a CNN model with a recurrent model (LSTM) to build an image caption generator. In this chapter, we will expand on the recurrent model. We will discuss a class of neural networks that can model sequences where X (or y) is not just a single independent data point, but a temporal sequence of data points [X1, X2, .. Xt] (or [y1, y2, .. yt]). Note that X2 (which is the data point at time step 2) is dependent on X1, X3 is dependent on X2 and X1, and so on.
Such networks are classified as Recurrent Neural...