Introduction
Of all the machine-learning algorithms we have considered thus far, none has treated data as a sequence. To take sequence data into account, we extend neural networks so that they store outputs from prior iterations. This type of neural network is called a recurrent neural network (RNN).
Consider the fully connected network formulation:
$$y = \sigma(Ax)$$
Here, the weight matrix A multiplies the input layer, x, and the result is run through an activation function, $\sigma$, which gives the output layer, y.
If we have a sequence of input data, $x_1, x_2, x_3, \ldots$, we can adapt the fully connected layer to take prior inputs into account, as follows:
$$S_t = \sigma(B S_{t-1} + A x_t)$$
Here, $S_t$ is the value carried forward from step t to the next step, and B is a second weight matrix applied to the previous value, $S_{t-1}$.
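The step from a fully connected layer to a recurrent update can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: `tanh` stands in for the activation function, the recurrent weight matrix is called `B`, the carried-forward state is called `s`, and the layer sizes and random weights are all assumptions chosen only to make the example run.

```python
import numpy as np

def dense_layer(A, x):
    """Fully connected layer: activation(A x), with tanh as the assumed activation."""
    return np.tanh(A @ x)

def recurrent_step(A, B, s_prev, x_t):
    """One recurrent update: activation(B s_prev + A x_t)."""
    return np.tanh(B @ s_prev + A @ x_t)

rng = np.random.default_rng(0)
input_size, state_size = 3, 4               # illustrative sizes only
A = rng.normal(size=(state_size, input_size))   # input weights
B = rng.normal(size=(state_size, state_size))   # recurrent weights (assumed name)

x_t = rng.normal(size=input_size)           # one element of the input sequence
s_prev = np.zeros(state_size)               # assumed initial state: all zeros

s_t = recurrent_step(A, B, s_prev, x_t)
```

With an all-zero previous state, the recurrent term vanishes and the update reduces to the plain fully connected layer, which is one way to see that the recurrence is an extension of it rather than a different model.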
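On top of this recurrent iteration, which produces the value fed into the next step, we want to get a probability distribution as output, as follows:
$$y_t = \mathrm{softmax}(C S_t)$$
Here, $S_t$ is the result of the recurrent iteration at step t, and C is a further weight matrix that maps it to one score per output category.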
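Turning the result of a recurrent step into a probability distribution is usually done with a softmax. The sketch below assumes an output weight matrix named `C` and a state vector named `s_t`; both names, the sizes, and the random values are illustrative assumptions rather than anything fixed by the text.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D vector of scores."""
    shifted = z - np.max(z)        # subtracting the max avoids overflow in exp
    exp = np.exp(shifted)
    return exp / exp.sum()

def output_distribution(C, s_t):
    """Map a state vector to class probabilities: softmax(C s_t)."""
    return softmax(C @ s_t)

rng = np.random.default_rng(1)
state_size, num_classes = 4, 3                  # illustrative sizes only
C = rng.normal(size=(num_classes, state_size))  # output weights (assumed name)
s_t = np.tanh(rng.normal(size=state_size))      # stand-in for a recurrent state

p_t = output_distribution(C, s_t)               # non-negative entries that sum to 1
```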
Once we have the full sequence of outputs, we can predict a single number or category by considering only the last output.
See the following figure for how a general architecture might work:
Figure 1: To predict a single number, or a category, we take a sequence of inputs...
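Predicting one category from a whole sequence by using only the last output can be sketched as follows. This is a hedged illustration under stated assumptions: the function name `rnn_predict`, the weight names `A`, `B`, and `C`, the `tanh` activation, the zero initial state, and the random untrained weights are all hypothetical choices made for the example, and the input sequence is assumed to be non-empty.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D vector of scores."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def rnn_predict(xs, A, B, C):
    """Run the recurrence over a non-empty sequence.

    Returns every intermediate state and the class probabilities
    computed from the final state only.
    """
    s = np.zeros(B.shape[0])            # assumed initial state: all zeros
    states = []
    for x_t in xs:                      # process the sequence in order
        s = np.tanh(B @ s + A @ x_t)    # state depends on input and prior state
        states.append(s)
    probs = softmax(C @ states[-1])     # only the last state feeds the output
    return np.stack(states), probs

rng = np.random.default_rng(2)
seq_len, input_size, state_size, num_classes = 5, 3, 4, 2   # illustrative sizes
A = rng.normal(size=(state_size, input_size))
B = rng.normal(size=(state_size, state_size))
C = rng.normal(size=(num_classes, state_size))
xs = rng.normal(size=(seq_len, input_size))   # a random stand-in sequence

states, probs = rnn_predict(xs, A, B, C)
predicted_class = int(np.argmax(probs))       # category with the highest probability
```

Because the weights here are random, the predicted category is meaningless; in practice `A`, `B`, and `C` would be learned from labeled sequences.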