The word recurrent in the name of this neural network comes from the fact that it has cyclic connections and the same computation is performed on each element of the sequence. This allows it to learn (or memorize) parts of the data to make predictions about the future. An RNN's advantage is that it can scale to much longer sequences than non-sequence based models are able to.
Understanding RNNs
Vanilla RNNs
Without further ado, let's take a look at the most basic version of an RNN, referred to as a vanilla RNN. It looks as follows:
This looks somewhat familiar, doesn't it? It should. If we were to remove the loop, this would be the same as a traditional neural network, but with one hidden layer, which we&apos...