RNN cell variants
In this section, we'll look at some variants of the basic RNN cell. We'll begin with a variant of the SimpleRNN cell: the long short-term memory (LSTM) RNN.
Long short-term memory (LSTM)
The LSTM is a variant of the SimpleRNN cell that is capable of learning long-term dependencies. LSTMs were first proposed by Hochreiter and Schmidhuber [14] and refined by many other researchers. They work well on a large variety of problems and are the most widely used RNN variant.
We have seen how the SimpleRNN combines the hidden state from the previous time step and the current input through a tanh layer to implement recurrence. The LSTM also implements recurrence, but instead of a single tanh layer, there are four layers interacting in a very specific way. The following diagram illustrates the transformations that are applied to the hidden state at time step t.
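Before walking through the diagram, it may help to see the same computation sketched in code. What follows is a minimal NumPy sketch, not a definitive implementation, of one SimpleRNN step next to one step of the standard LSTM update; the four interacting layers are the input, forget, and output gates plus the candidate cell state. The names, shapes, and random initialization are illustrative assumptions.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

input_dim, hidden_dim = 8, 16
rng = np.random.default_rng(0)

def init_weights(shape):
    # Small random initialization, just to make the sketch runnable.
    return rng.normal(scale=0.1, size=shape)

def simple_rnn_step(x_t, h_prev, W, U, b):
    # SimpleRNN: a single tanh layer combining the current input
    # with the hidden state from the previous time step.
    return np.tanh(x_t @ W + h_prev @ U + b)

def lstm_step(x_t, h_prev, c_prev, params):
    # LSTM: four layers (input, forget, and output gates plus the
    # candidate cell state) interact to update the cell state c
    # and the hidden state h.
    Wi, Ui, bi, Wf, Uf, bf, Wo, Uo, bo, Wg, Ug, bg = params
    i = sigmoid(x_t @ Wi + h_prev @ Ui + bi)   # input gate
    f = sigmoid(x_t @ Wf + h_prev @ Uf + bf)   # forget gate
    o = sigmoid(x_t @ Wo + h_prev @ Uo + bo)   # output gate
    g = np.tanh(x_t @ Wg + h_prev @ Ug + bg)   # candidate cell state
    c = f * c_prev + i * g                     # new cell state
    h = o * np.tanh(c)                         # new hidden state
    return h, c

# One (W, U, b) triple for each of the four layers.
params = []
for _ in range(4):
    params += [init_weights((input_dim, hidden_dim)),
               init_weights((hidden_dim, hidden_dim)),
               np.zeros(hidden_dim)]

x_t = rng.normal(size=input_dim)
h_prev = np.zeros(hidden_dim)
c_prev = np.zeros(hidden_dim)

h_rnn = simple_rnn_step(x_t, h_prev,
                        init_weights((input_dim, hidden_dim)),
                        init_weights((hidden_dim, hidden_dim)),
                        np.zeros(hidden_dim))
h_lstm, c_lstm = lstm_step(x_t, h_prev, c_prev, params)
print(h_rnn.shape, h_lstm.shape, c_lstm.shape)  # (16,) (16,) (16,)

Note how the SimpleRNN overwrites its hidden state at every step, whereas the LSTM keeps a separate cell state that is only modified through the gates; this is what allows it to carry information across many time steps.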
The diagram looks complicated, but let us walk through it component by component. The line across...