Why RL libraries?
Our implementation of basic DQN in Chapter 6 wasn't very long or complicated—about 200 lines of training code plus 50 lines in environment wrappers. When you are becoming familiar with RL methods, it is very useful to implement everything yourself to understand how things actually work. However, the more involved you become in the field, the more often you will realize that you are writing the same code over and over again.
This repetition comes from the generality of RL methods. As we discussed in Chapter 1, RL is quite flexible, and many real-life problems fall into the environment-agent interaction scheme. RL methods don’t make many assumptions about the specifics of observations and actions, so code implemented for the CartPole environment will be applicable to Atari games (maybe with some minor tweaks).
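To make this concrete, here is a minimal sketch (assuming Gymnasium is installed, with the Atari extras for the second environment) of a loop written purely against the generic Env interface; because it touches only action_space and the standard reset/step methods, the same function runs unchanged on CartPole and on an Atari game:

    import gymnasium as gym

    def run_episode(env: gym.Env) -> float:
        """Run one episode with random actions, relying only on the generic Env API."""
        total_reward = 0.0
        obs, info = env.reset()
        done = False
        while not done:
            action = env.action_space.sample()  # works for any action space
            obs, reward, terminated, truncated, info = env.step(action)
            total_reward += float(reward)
            done = terminated or truncated
        return total_reward

    # The same loop handles very different observation and action spaces:
    print(run_episode(gym.make("CartPole-v1")))
    print(run_episode(gym.make("ALE/Pong-v5")))  # needs gymnasium[atari] installed

A real agent would replace the random action with a policy, but the structure of the loop stays the same, which is exactly the repetition that libraries factor out.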
Writing the same code over and over again is not very efficient, as bugs might be introduced every time, which then costs you time for debugging and understanding the code.