At the end of the last chapter, we took a peek at what is possible with neural networks using an advanced RL algorithm called PPO. What we didn't cover were the details of how that code works and what it is capable of. While teaching you every detail of this model would take a book by itself, we will try to cover its basic features in this chapter. Also, keep in mind that while we will be talking about the Unity-specific training implementation, many of the concepts carry over to other deep learning models.
In this chapter, we will look at several concepts internal to the PPO-based learn.py training script, exploring them through the Unity ML-Agents examples. Here is what we will be covering in this chapter:
- Agent training problems
- Convolutional neural networks
- Experience replay
- Partial observability, memory, and recurrent...