Summary
A GRU extends the simple RNN, combating the vanishing gradient problem by allowing the model to learn long-term dependencies in text. A variety of use cases can benefit from this architectural unit. We discussed a sentiment classification problem and saw how GRUs outperform simple RNNs on it. We then saw how text can be generated using GRUs.
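As a quick recap of the gating mechanism that makes this possible, the sketch below implements a single GRU time step with plain NumPy. The weight shapes and toy dimensions are illustrative assumptions, not code from the chapter; the update gate `z` interpolates between the previous hidden state and a candidate state, which is what lets gradients flow over long spans.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, params):
    """One GRU time step (illustrative sketch, not an optimized implementation)."""
    Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh = params
    z = sigmoid(Wz @ x + Uz @ h_prev + bz)              # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev + br)              # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev) + bh)  # candidate state
    return (1.0 - z) * h_prev + z * h_tilde             # mix old and new state

# Hypothetical toy dimensions: 4-dim input, 3-dim hidden state
rng = np.random.default_rng(0)
d_in, d_h = 4, 3
params = [rng.standard_normal(s)
          for s in [(d_h, d_in), (d_h, d_h), (d_h,)] * 3]
h = np.zeros(d_h)
for x in rng.standard_normal((5, d_in)):  # unroll over 5 time steps
    h = gru_step(x, h, params)
print(h.shape)  # (3,)
```

When `z` is near 0 the state is carried over almost unchanged, which is the behavior a simple RNN lacks.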
In the next chapter, we discuss another advancement over the simple RNN, the Long Short-Term Memory (LSTM) network, and explore the advantages its architecture brings.