Deep Q-network for tigers
Here, we will apply the DQN model to the tiger group of agents to see whether they can learn to hunt more effectively. All of the tiger agents share a single network, so their behavior will be identical. To keep things simple for now, the deer group will keep acting randomly in this example; we’ll train them later in the chapter.
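To make the shared-network idea concrete, here is a minimal sketch of how one Q-network can drive a whole group of agents while the other group acts randomly. The network architecture, observation size, and action count below are illustrative assumptions, not the actual MAgent forest configuration or the code from forest_tigers_dqn.py.

```python
import torch
import torch.nn as nn

# Illustrative sizes -- the real MAgent observations are spatial
# and larger; these values are assumptions for the sketch.
OBS_SIZE = 15
N_ACTIONS = 9


class TigerDQN(nn.Module):
    """A single Q-network shared by every tiger agent."""

    def __init__(self, obs_size: int = OBS_SIZE, n_actions: int = N_ACTIONS):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_size, 64),
            nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


def select_actions(net: TigerDQN, tiger_obs: torch.Tensor,
                   n_deer: int, epsilon: float = 0.0):
    """All tigers query the same network in one batch; deer act randomly."""
    with torch.no_grad():
        q_vals = net(tiger_obs)              # shape: (n_tigers, n_actions)
    tiger_acts = torch.argmax(q_vals, dim=1)
    # Epsilon-greedy exploration, applied per tiger
    mask = torch.rand(tiger_obs.size(0)) < epsilon
    tiger_acts[mask] = torch.randint(0, N_ACTIONS, (int(mask.sum()),))
    deer_acts = torch.randint(0, N_ACTIONS, (n_deer,))
    return tiger_acts, deer_acts


net = TigerDQN()
t_acts, d_acts = select_actions(net, torch.randn(4, OBS_SIZE), n_deer=6)
```

Because a single set of weights produces actions for every tiger, the group improves together as training progresses, and experience from all tigers can be pooled into one replay buffer.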
The training code can be found in Chapter22/forest_tigers_dqn.py; it doesn’t differ much from the DQN versions in the previous chapters.
Understanding the code
To make the MAgent environment work with our classes, a specialized version of ExperienceSourceFirstLast was implemented to handle the specifics of the environment. This class is called MAgentExperienceSourceFirstLast and can be found in lib/data.py. Let’s check it out to understand how it fits into the rest of the code.
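Before looking at the class itself, it may help to recall what a "first-last" experience source produces: instead of storing every intermediate step, each item keeps the first state and action, the discounted sum of the rewards collected over the next few steps, and the state reached at the end. The sketch below shows this compression for a single agent's completed episode; it is a simplified stand-in, not the actual MAgentExperienceSourceFirstLast code, which additionally deals with MAgent's per-agent observation and reward handling.

```python
import collections

# Simplified item type; the real class yields comparable
# (state, action, discounted reward, last_state) records.
ExperienceFirstLast = collections.namedtuple(
    "ExperienceFirstLast", ["state", "action", "reward", "last_state"])


def compress_steps(states, actions, rewards, gamma, steps_count):
    """Build n-step first-last items for one agent's finished episode.

    For every step i, the reward field holds the gamma-discounted sum
    of up to steps_count rewards, and last_state is the state reached
    steps_count steps later (None if the episode ended before that).
    """
    items = []
    n = len(states)
    for i in range(n):
        total = 0.0
        for k, r in enumerate(rewards[i:i + steps_count]):
            total += (gamma ** k) * r
        last_idx = i + steps_count
        last_state = states[last_idx] if last_idx < n else None
        items.append(ExperienceFirstLast(states[i], actions[i],
                                         total, last_state))
    return items


# A tiny 3-step episode with unit rewards and 2-step compression
items = compress_steps(states=[0, 1, 2], actions=[0, 0, 0],
                       rewards=[1.0, 1.0, 1.0],
                       gamma=0.9, steps_count=2)
```

A `last_state` of None marks items whose lookahead ran past the end of the episode, which is exactly how the training loop knows not to bootstrap from a terminal state.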
The first class we define represents the items produced by our ExperienceSource. As we discussed in Chapter&...