In this chapter, we learned about attention mechanisms, a rapidly developing area of deep learning. Attention allows a network to focus on the most relevant parts of its input, which helps it overcome the problem of long-term dependencies. We also saw how attention mechanisms can replace sequential models such as RNNs entirely, producing state-of-the-art results on tasks such as machine translation and sentence generation. Attention is not limited to text: it can also be used to focus on relevant regions of an image, which makes it useful for tasks such as visual question answering, where we want the network to answer questions about what is happening in a given scene.
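As a quick reminder of the core idea, here is a minimal NumPy sketch of scaled dot-product attention, the variant popularized by Transformer-style models. The function name and array shapes are illustrative assumptions, not code from this chapter:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Illustrative sketch: weight each value by how well its key matches the query."""
    d_k = Q.shape[-1]
    # Similarity between queries and keys, scaled to keep softmax inputs moderate
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax turns scores into a distribution over input positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output is a weighted average of the values
    return weights @ V, weights

# Toy example (hypothetical shapes): 3 queries attending over 4 input positions
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
output, weights = scaled_dot_product_attention(Q, K, V)
print(weights.round(2))  # each row sums to 1: how much attention each input receives
```

The printed weight matrix is the "focus" described above: large entries mark the input positions the network attends to for each query.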
In the next chapter, we will learn about generative models.