Throughout this book, we have learned how the various threads of Reinforcement Learning (RL) combined to form modern RL and then advanced to Deep Reinforcement Learning (DRL) with the addition of Deep Learning (DL). Like most fields born from such a convergence, RL is now diverging again into specialized methods for specific classes of environments. We started to see this in the chapters covering Policy Gradient (PG) methods, which specialize in continuous control environments. The flip side of this is the more typical game environment, which is episodic with some form of discrete control mechanism. These environments are typically better handled by DQN, but DQN brings its own set of problems. Well, in this chapter, we will look at how researchers solved those problems by combining several improvements into Rainbow DQN.
In this chapter, we will introduce...