Let's first build a simple neural network agent to play the game Pac-Man. We will create an agent with a set of random weights and biases. These agents will then try to play the game; we select the agent that is able to play for the longest average duration, assuming theirs to be the best policy.
Implementing neural network agent to play Pac-Man
Getting ready
The agents in this recipe are not learning any policy; they make their decision based on their initial set of weights (fixed policy). The agent picks the actions based on the probability given by the neural network. The decision that each agent makes is based only on the present observation of the environment.
We implement this by a fully connected neural network...