RNNs are the primary means by which we reason over textual inputs, and they come in a variety of forms. In this chapter, we learned about the recurrent structure of RNNs and about special variants that use memory cells. RNNs can be applied to many kinds of sequence prediction tasks, including generating music, text, image captions, and more.
RNNs differ from feedforward networks in that they have recurrence: each step depends on the hidden state carried over from the previous time step, along with the network's own weights and biases. As a result, vanilla RNNs struggle with long-term dependencies; because the gradients flowing back through the recurrence shrink (or explode) over many time steps, they find it difficult to remember information beyond a certain number of steps back. GRU and LSTM cells use gating mechanisms to control what they remember and what they forget, and thereby overcome the long-term dependency problem that vanilla RNNs run into. RNN/CNN hybrids with...
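To make the recurrence concrete, here is a minimal sketch of a vanilla RNN forward pass in NumPy. It is illustrative only: the names (rnn_forward, W_xh, W_hh, b_h, hidden_size) are placeholders chosen for this example rather than part of any particular library, and the gated variants discussed above (GRU, LSTM) add further weights and gates on top of this same loop.

```python
import numpy as np

def rnn_forward(inputs, hidden_size, seed=0):
    """Run a single-layer vanilla RNN over a sequence of input vectors."""
    rng = np.random.default_rng(seed)
    input_size = inputs.shape[1]

    # The same weights and bias are reused at every time step.
    W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
    W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
    b_h = np.zeros(hidden_size)

    h = np.zeros(hidden_size)          # the network's "memory"
    states = []
    for x_t in inputs:                 # one step per element of the sequence
        # h_t depends on the current input and on the previous hidden state:
        # h_t = tanh(W_xh @ x_t + W_hh @ h_{t-1} + b_h)
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)

# Example: a sequence of 5 time steps, each a 3-dimensional vector.
sequence = np.random.default_rng(1).normal(size=(5, 3))
hidden_states = rnn_forward(sequence, hidden_size=4)
print(hidden_states.shape)  # (5, 4): one hidden state per time step
```

Because every hidden state is produced by repeatedly multiplying through W_hh and squashing with tanh, the influence of early inputs fades as the sequence grows, which is exactly the long-term dependency problem the gated cells are designed to address.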