Summary
In this chapter, we have come a long way from tabular Q-learning to implementing a modern, distributed deep Q-learning algorithm. Along the way, we covered the details of neural fitted Q-iteration, online Q-learning, DQN with its Rainbow improvements, Gorila, and Ape-X DQN. We also introduced you to Ray and RLlib, which are powerful frameworks for distributed computing and deep reinforcement learning, respectively.
In the next chapter, we will go into another class of deep reinforcement learning algorithms: policy-based methods. Those methods will allow us to directly learn stochastic policies and to handle continuous action spaces.