Ways to Speed Up RL
In Chapter 8, you saw several practical tricks for making the deep Q-network (DQN) method more stable and faster to converge. They modified the basic DQN method itself (for example, by injecting noise into the network or unrolling the Bellman equation) to obtain a better policy with less time spent on training. In this chapter, we will explore another route to the same goal: tweaking the implementation details of the method to speed up training. This is a pure engineering approach, but it is an important one, because it pays off in practice.
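As a brief reminder of what "unrolling the Bellman equation" refers to, the n-step variant of the DQN target accumulates several rewards before bootstrapping from the value estimate. A sketch of the target, where the notation (unroll length n, discount factor γ, and Q as the target network estimate) is ours rather than taken from this chapter:

$$
y_t = \sum_{k=0}^{n-1} \gamma^{k} r_{t+k} + \gamma^{n} \max_{a} Q(s_{t+n}, a)
$$

With n = 1, this reduces to the standard one-step DQN target.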
In this chapter, we will:
- Take the Pong environment from the previous chapter and try to get it solved as fast as possible
- In a step-by-step manner, get Pong solved almost two times faster using exactly the same commodity hardware