Summary
We saw that using parallel learners to update a shared model significantly speeds up the learning process. We learned why asynchronous methods are used in deep reinforcement learning and explored their different variants, including asynchronous one-step Q-learning, asynchronous one-step SARSA, asynchronous n-step Q-learning, and asynchronous advantage actor-critic (A3C). We also learned to implement the A3C algorithm, training agents to play the games Breakout and Doom.
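The core idea of parallel learners updating one shared model can be illustrated with a minimal sketch, assuming a toy quadratic loss in place of an actual RL objective: several threads each compute a gradient locally and apply it to the shared parameters without locking (Hogwild!-style). All names and the loss here are illustrative, not the book's actual implementation.

```python
import threading

shared_params = [0.0, 0.0]      # shared model, updated by every worker
TARGET = [3.0, -2.0]            # optimum of the toy loss below
LR = 0.05                       # learning rate (illustrative choice)

def grad(params):
    """Gradient of the toy loss 0.5 * sum((p - t)^2)."""
    return [p - t for p, t in zip(params, TARGET)]

def worker(steps):
    """A single asynchronous learner: read shared model, update in place."""
    for _ in range(steps):
        g = grad(shared_params)             # compute gradient locally
        for i, gi in enumerate(g):          # apply the update directly to
            shared_params[i] -= LR * gi     # the shared model, no lock

# Four parallel workers update the same parameter vector asynchronously
threads = [threading.Thread(target=worker, args=(200,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared_params)  # close to TARGET after the asynchronous updates
```

Although the workers race on the shared parameters, the updates still converge, which is the same property that lets asynchronous RL methods dispense with a central parameter-server lock.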
In the coming chapters, we will turn to different application domains and see how deep reinforcement learning is being applied, and can be applied, in each of them.