Of all the machine learning algorithms we have considered thus far, none have treated the data as a sequence. To take sequence data into account, we extend neural networks so that they store outputs from prior iterations. This type of neural network is called a recurrent neural network (RNN). Consider the fully connected network formulation:
![](https://static.packt-cdn.com/products/9781789131680/graphics/assets/ab01cacf-e47e-4b82-90eb-09d12f96d06c.png)
Here, the weight matrix, A, is multiplied by the input layer, x, and the result is run through an activation function, σ, which gives the output layer, y.
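This fully connected step can be sketched in a few lines of NumPy. The sizes here (four input features, three output units) and the choice of a sigmoid activation are illustrative assumptions, not values from the text:

```python
import numpy as np

def sigmoid(z):
    """Element-wise sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 4))   # weight matrix (3 outputs, 4 inputs) - assumed sizes
x = rng.normal(size=(4,))     # input layer
y = sigmoid(A @ x)            # output layer: activation of A times x
```

The output, `y`, is a vector of three values, each squashed into (0, 1) by the sigmoid.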
If we have a sequence of input data, x1, x2, ..., xt, we can adapt the fully connected layer to take prior inputs into account, as follows:
![](https://static.packt-cdn.com/products/9781789131680/graphics/assets/fe9f65c6-64fa-4fcc-854b-c1fae403ead8.png)
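The recurrent update above can be sketched as a loop over the sequence, where the hidden state carries information from prior steps. This is a minimal sketch assuming the standard Elman-style update s_t = σ(W·s_{t-1} + U·x_t); the names W and U and all sizes are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
n_hidden, n_input, seq_len = 5, 3, 4          # assumed sizes for illustration
W = rng.normal(size=(n_hidden, n_hidden))     # recurrent weights (assumed name)
U = rng.normal(size=(n_hidden, n_input))      # input weights (assumed name)
xs = rng.normal(size=(seq_len, n_input))      # a sequence of inputs x1..xt

s = np.zeros(n_hidden)                        # initial hidden state
states = []
for x_t in xs:
    # The new state mixes the current input with the previous state,
    # so earlier inputs influence later outputs.
    s = sigmoid(W @ s + U @ x_t)
    states.append(s)
```

After the loop, `states[-1]` summarizes the entire sequence seen so far.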
On top of this recurrent iteration that produces the next state, we want to get a probability distribution as the output, as follows:
![](https://static.packt-cdn.com/products/9781789131680/graphics/assets/e5a295eb-90c3-410e-877a-dc830cac4504.png)
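Putting both pieces together, each step's state can be projected to a probability distribution over classes with a softmax, and the final prediction can be read off the last output. The weight names (W, U, V) and sizes are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Numerically stable softmax: a probability distribution over classes."""
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(2)
n_hidden, n_input, n_classes, seq_len = 5, 3, 4, 6   # assumed sizes
W = rng.normal(size=(n_hidden, n_hidden))            # recurrent weights (assumed)
U = rng.normal(size=(n_hidden, n_input))             # input weights (assumed)
V = rng.normal(size=(n_classes, n_hidden))           # output projection (assumed)
xs = rng.normal(size=(seq_len, n_input))

s = np.zeros(n_hidden)
outputs = []
for x_t in xs:
    s = sigmoid(W @ s + U @ x_t)       # recurrent state update
    outputs.append(softmax(V @ s))     # probability distribution at each step

# For a single number or category target, use only the last output.
prediction = int(np.argmax(outputs[-1]))
```

Each entry of `outputs` sums to 1, and `prediction` is the most probable class at the final step.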
Once we have the full sequence output, y1, y2, ..., yt, we can treat the target as a number or category by considering only the last output. See the following diagram for how a general...