LSTM variants and convolutions for text
RNNs are well suited to handling sequential data. In the previous section, we saw how a simple model effectively learned to generate text from patterns in its training dataset.
Over the years, there have been a number of enhancements in the way we model and use RNNs. In this section, we will discuss two widely used variants of the single-layer LSTM network we discussed in the previous section: stacked and bidirectional LSTMs.
Stacked LSTMs
In computer vision tasks, we have seen how the depth of a neural network helps it learn complex and abstract concepts. Along the same lines, a stacked LSTM architecture, which has multiple LSTM layers stacked one on top of the other, has been shown to give considerable improvements. Stacked LSTMs were first presented by Graves et al. in their work Speech Recognition with Deep Recurrent Neural Networks.6 They highlight the fact that depth...
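To make the idea concrete, the following is a minimal sketch of a stacked LSTM for character-level text generation using the Keras API. The vocabulary size, embedding width, and layer sizes are illustrative placeholders, not values from the text; the key point is that every LSTM layer except the last sets `return_sequences=True`, so each layer passes a hidden state for every timestep to the layer stacked above it.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

vocab_size = 100  # hypothetical character-vocabulary size
embed_dim = 64    # hypothetical embedding width

model = models.Sequential([
    layers.Embedding(vocab_size, embed_dim),
    # First LSTM layer returns the full sequence of hidden states,
    # one per timestep, so the stacked layer above can consume them.
    layers.LSTM(128, return_sequences=True),
    # Second (topmost) LSTM layer returns only its final hidden state.
    layers.LSTM(128),
    # Predict a distribution over the next character.
    layers.Dense(vocab_size, activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
```

Adding a third or fourth LSTM layer follows the same pattern: every layer below the top one keeps `return_sequences=True`.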