Use the following exercises to improve your understanding of RL and the PPO trainer.
- Convert one of the Unity examples to use only visual observations. Hint: use the GridWorld example as a guide, and remember that the agent may need its own camera. (The first sketch after this list shows a quick way to confirm the conversion worked.)
- Alter the CNN configuration of an agent that uses visual observations in three different ways: add layers, remove layers, or change the kernel (filter) sizes, as in the second sketch below. Run a training session for each variant and compare the differences in TensorBoard.
- Convert the GridWorld sample to use vector observations and a recurrent network with memory. Hint: you can borrow several pieces of code from the Hallway example; the third sketch below shows the trainer settings involved.
- Revisit the Ball3D example and set it up for training with multiple asynchronous agents. (The last sketch after this list shows how to verify the batched setup.)
- Set up the Crawler example and run it with multiple asynchronous agents as well.
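
For the visual-observations exercise, it helps to confirm that the environment really is emitting camera images before you start a long training run. The following is a minimal sketch assuming the 0.4-era `mlagents.envs` Python API (earlier releases used `from unityagents import UnityEnvironment` instead); `"GridWorld"` is a placeholder path to your built environment binary, and attribute names may differ in your toolkit version.

```python
from mlagents.envs import UnityEnvironment

# "GridWorld" is a placeholder -- point this at your built environment.
env = UnityEnvironment(file_name="GridWorld")
brain_name = env.brain_names[0]
info = env.reset(train_mode=True)[brain_name]

# Visual observations arrive as a list of arrays, one entry per camera,
# each shaped (n_agents, height, width, channels).
print("cameras:", len(info.visual_observations))
print("image batch:", info.visual_observations[0].shape)

# If the agent is purely visual, the vector observation width should be 0.
print("vector obs:", info.vector_observations.shape)
env.close()
```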
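For the CNN exercise, it can help to sketch your three variants side by side before editing the trainer. The snippet below is an illustrative TensorFlow 1.x sketch, not the toolkit's actual code: the real encoder lives in the trainer's models.py, and its exact layer counts, kernel sizes, and activations vary by version, so treat these numbers as example values.

```python
import tensorflow as tf

def visual_encoder(image_in, variant="baseline"):
    """Three illustrative CNN variants to compare in TensorBoard."""
    if variant == "baseline":
        # Two-layer encoder, similar in spirit to the toolkit default.
        h = tf.layers.conv2d(image_in, 16, kernel_size=[8, 8],
                             strides=[4, 4], activation=tf.nn.elu)
        h = tf.layers.conv2d(h, 32, kernel_size=[4, 4],
                             strides=[2, 2], activation=tf.nn.elu)
    elif variant == "deeper":
        # Variant 1: add a third convolutional layer.
        h = tf.layers.conv2d(image_in, 16, kernel_size=[8, 8],
                             strides=[4, 4], activation=tf.nn.elu)
        h = tf.layers.conv2d(h, 32, kernel_size=[4, 4],
                             strides=[2, 2], activation=tf.nn.elu)
        h = tf.layers.conv2d(h, 64, kernel_size=[3, 3],
                             strides=[1, 1], activation=tf.nn.elu)
    elif variant == "wide_kernel":
        # Variant 2: a single layer with a larger kernel/filter.
        h = tf.layers.conv2d(image_in, 32, kernel_size=[12, 12],
                             strides=[6, 6], activation=tf.nn.elu)
    return tf.layers.flatten(h)
```

Run one full training session per variant and compare the cumulative-reward curves in TensorBoard to see how each change affects learning.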
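For the recurrent exercise, the relevant switches are the trainer's recurrent hyperparameters. The sketch below edits them programmatically, assuming the standard trainer_config.yaml layout with `use_recurrent`, `memory_size`, and `sequence_length` keys; `GridWorldBrain` is a placeholder for whatever your brain is actually named, and you can just as easily make these edits by hand.

```python
import yaml

# Load the trainer configuration that ships with the toolkit.
with open("trainer_config.yaml") as f:
    config = yaml.safe_load(f)

# "GridWorldBrain" is a placeholder -- use your brain's actual name.
config.setdefault("GridWorldBrain", {}).update({
    "use_recurrent": True,   # enable the LSTM memory path
    "memory_size": 256,      # size of the recurrent state
    "sequence_length": 64,   # length of sequences fed to the LSTM
})

with open("trainer_config.yaml", "w") as f:
    yaml.safe_dump(config, f, default_flow_style=False)
```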
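For the last two exercises, training multiple asynchronous agents means duplicating the agent (or the whole training area) in the scene so that many instances feed the same brain. A quick way to confirm the setup, again assuming the 0.4-era Python API and with `"3DBall"` as a placeholder binary name, is to check that observations now arrive batched with one row per agent:

```python
from mlagents.envs import UnityEnvironment

env = UnityEnvironment(file_name="3DBall")  # placeholder binary name
brain_name = env.brain_names[0]
info = env.reset(train_mode=True)[brain_name]

# With several agents attached to one brain, observations and rewards
# arrive batched: one row per agent instance.
print("agents:", len(info.agents))
print("vector obs batch:", info.vector_observations.shape)  # (n_agents, obs_size)
env.close()
```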
If you encounter problems running through these samples, be sure...