Going hardcore: CuLE
While I was writing this chapter, NVIDIA researchers published a paper, with accompanying code, on their latest experiments in porting the Atari emulator to the GPU: Steven Dalton, Iuri Frosio, GPU-Accelerated Atari Emulation for Reinforcement Learning, 2019, arXiv:1907.08467. Their Atari port is called CuLE (CUDA Learning Environment), and the code is available on GitHub: https://github.com/NVlabs/cule.
According to the paper, by keeping both the Atari emulator and the NN on the GPU, they were able to solve Pong within one to two minutes and reach a speed of 50k FPS using the advantage actor-critic (A2C) method, which will be the subject of the next part of the book.
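To put these numbers in perspective, here is a rough back-of-the-envelope calculation of how many frames such a run would consume. It is only a sketch: it assumes that the quoted 50k FPS is sustained for the whole run, which the figures above don't state, so treat the result as an upper estimate. The function name frames_processed is mine and is not part of CuLE.

```python
# Rough frame budget implied by the figures quoted from the paper.
# Assumption: the 50k FPS rate holds for the entire training run.
FPS = 50_000        # throughput quoted above
MINUTES = (1, 2)    # reported time range to solve Pong


def frames_processed(fps: int, minutes: float) -> int:
    """Number of emulator frames consumed at a constant frame rate."""
    return int(fps * minutes * 60)


low, high = (frames_processed(FPS, m) for m in MINUTES)
print(f"{low:,} to {high:,} frames")
```

Under that assumption, the run would take somewhere between three and six million frames, so the gain comes from processing frames much faster rather than from needing fewer of them.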
Unfortunately, at the time of writing, the code wasn't stable enough. I failed to make it work on my hardware, but I hope that when you read this, the situation will have already changed. In any case, this project shows a somewhat extreme, but very efficient, way to increase RL methods' performance...