The visual state an agent uses in the ML-Agents toolkit is produced by capturing a screenshot of the agent's camera view at a fixed resolution and feeding it through a convolutional neural network, which learns an embedded state representation. In the following exercise, we will open up the ML-Agents training code and enhance the convolution code to build a better input state:
- Use a file browser to open the ML-Agents trainers folder located at ml-agents.6\ml-agents\mlagents\trainers. Inside this folder, you will find several Python files that are used to train the agents. The file we are interested in is called models.py.
- Open the models.py file in your Python editor of choice. Visual Studio with the Python data extensions is an excellent platform, and it also provides the ability to debug code interactively.
- Scroll down in the file to locate the create_visual_observation_encoder function (a sketch of what this kind of encoder looks like follows these steps)...
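Before modifying anything, it helps to know roughly what this encoder does. The sketch below is a minimal, hypothetical stand-in for that function, written in the TensorFlow 1.x style this era of the toolkit uses: convolutional layers compress the screenshot into a feature map, which is then flattened and passed through fully connected layers to produce the embedded state. The function name, filter counts, and kernel sizes here are illustrative assumptions; compare them against your own copy of models.py.

```python
import tensorflow as tf  # TensorFlow 1.x, as used by this version of ML-Agents


def visual_encoder_sketch(image_input, h_size, num_layers, scope, reuse=False):
    """Hypothetical stand-in for create_visual_observation_encoder.

    image_input: float tensor of shape [batch, height, width, channels],
                 i.e. the agent's screenshot observation.
    h_size:      width of the fully connected layers after the convolutions.
    """
    with tf.variable_scope(scope):
        # Two convolutional layers reduce the screenshot to a compact feature map.
        conv1 = tf.layers.conv2d(image_input, filters=16, kernel_size=[8, 8],
                                 strides=[4, 4], activation=tf.nn.elu,
                                 reuse=reuse, name="conv_1")
        conv2 = tf.layers.conv2d(conv1, filters=32, kernel_size=[4, 4],
                                 strides=[2, 2], activation=tf.nn.elu,
                                 reuse=reuse, name="conv_2")
        # Flatten the feature map and embed it into the agent's hidden state.
        hidden = tf.layers.flatten(conv2)
        for i in range(num_layers):
            hidden = tf.layers.dense(hidden, h_size, activation=tf.nn.elu,
                                     reuse=reuse, name="hidden_%d" % i)
    return hidden
```

When we enhance the convolution code in this exercise, the changes happen inside a structure like this one, for example by adding layers or adjusting the kernel and stride sizes that determine how much visual detail survives the encoding.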