We expanded our knowledge of neural networks by describing the general principles of RNNs. After covering the inner workings of the basic RNN, we extended backpropagation through time (BPTT) to recurrent networks. As presented in this chapter, BPTT suffers from vanishing gradients when applied to RNNs. This can be worked around by truncating the backward pass, or by using a different type of architecture: LSTM networks.
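The vanishing-gradient problem, and why truncation sidesteps it, can be illustrated with a minimal scalar sketch (not taken from the chapter's code; the function names are hypothetical). In a simple RNN, the gradient flowing back through T steps is repeatedly scaled by the recurrent weight, so a weight with magnitude below 1 drives it toward zero; truncated BPTT simply stops the backward pass after k steps:

```python
def backprop_through_time(w, grad_at_last_step, num_steps):
    """Propagate a gradient back through `num_steps` recurrent steps.

    Scalar toy model: each step multiplies the gradient by the same
    recurrent weight `w` (activation derivatives are omitted here).
    """
    grad = grad_at_last_step
    for _ in range(num_steps):
        grad *= w  # |w| < 1 => the gradient shrinks exponentially
    return grad


def truncated_bptt(w, grad_at_last_step, num_steps, k):
    """Truncated BPTT: stop the backward pass after at most `k` steps."""
    return backprop_through_time(w, grad_at_last_step, min(num_steps, k))
```

With `w = 0.9`, the gradient surviving 100 steps is 0.9^100, roughly 2.7e-5, which is why distant time steps contribute almost nothing to learning; truncating at, say, k = 10 keeps the gradients usable at the cost of ignoring longer-range dependencies, which is precisely the gap LSTM cells were designed to close.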
We then applied these theoretical principles to a practical problem: action recognition in videos. By combining CNNs and LSTMs, we successfully trained a network to classify videos into 101 categories, introducing video-specific techniques such as frame sampling and padding along the way.
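The two video-specific techniques can be sketched together in a few lines of NumPy (a minimal illustration with a hypothetical helper name; the chapter's own implementation may differ): long clips are reduced to a fixed number of evenly spaced frames, while clips that are too short are zero-padded so every sample in a batch has the same temporal length:

```python
import numpy as np


def sample_frames(video, num_frames):
    """Return exactly `num_frames` frames from a (T, H, W, C) video array.

    Longer videos are subsampled at evenly spaced indices; shorter ones
    are padded with all-zero frames at the end.
    """
    t = video.shape[0]
    if t >= num_frames:
        # Evenly spaced frame indices over the whole clip.
        indices = np.linspace(0, t - 1, num_frames).astype(int)
        return video[indices]
    # Zero-pad at the end so every clip has the same length.
    padding = np.zeros((num_frames - t,) + video.shape[1:], dtype=video.dtype)
    return np.concatenate([video, padding], axis=0)
```

Fixing the sequence length this way is what lets batches of videos be stacked into a single tensor and fed to the LSTM, with the padded frames carrying no signal.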
In the next chapter, we will broaden our knowledge of neural network applications by covering new platforms—mobile devices and web browsers.