As we have progressed through this book, we have spent time making sure we can see how our agents are progressing in their respective environments. In this section, we aim to add rendering to the agent's environment during training, using our last DQN example. This lets us see how the agent is actually performing, and perhaps try out a couple of new environments along the way.
Adding the ability to watch the agent play in the environment is not that difficult, and we can implement this as we have done with other examples. Open the Chapter_6_DQN_wplay.py code example, and follow the next exercise:
- The code is almost identical to the earlier DQN sample, so we won't review the whole listing. However, we do want to introduce two new variables as hyperparameters; these will allow us to better control the network training and observe performance:
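As a sketch of the idea (the names and values here are hypothetical, not taken from the book's actual listing), such hyperparameters might gate how often the environment is rendered so that drawing frames doesn't slow down every training episode:

```python
# Hypothetical hyperparameters controlling rendering/reporting cadence.
RENDER_INTERVAL = 50   # render one full episode every 50 training episodes
REPORT_INTERVAL = 10   # print average reward every 10 episodes

def should_render(episode: int, interval: int = RENDER_INTERVAL) -> bool:
    """Return True only on episodes where we want to watch the agent play.

    Rendering every step of every episode slows training considerably,
    so we render only periodically.
    """
    return interval > 0 and episode % interval == 0

# Inside the training loop, one would then do something like:
#   for episode in range(num_episodes):
#       render_this_episode = should_render(episode)
#       ...
#       if render_this_episode:
#           env.render()  # standard Gym call to draw the current frame
```

The same pattern extends to reporting: checking `episode % REPORT_INTERVAL == 0` keeps console output readable while still tracking progress.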