6. LSTMs, GRUs, and Advanced RNNs
Overview
In this chapter, we will study and implement advanced variants of the plain Recurrent Neural Network (RNN) that overcome some of its practical drawbacks and are among the best-performing deep learning models at the moment. We will start by examining the drawbacks of plain RNNs and see how Long Short-Term Memory (LSTM) overcomes them. We will then implement a model based on the Gated Recurrent Unit (GRU). We will also work with bidirectional and stacked RNNs and explore attention-based models. By the end of this chapter, you will have built these models and assessed their performance on a sentiment classification task, observing for yourself the trade-offs involved in choosing among them.