A supervised learning approach to games
The challenge in reinforcement learning is working out a good target for our network. We saw one approach to this in the last chapter, policy gradients. If we can ever turn a reinforcement learning task into a supervised task problem, it becomes a lot easier. So, if our aim is to build an AI agent that plays computer games, one thing we might try is to look at how humans play and get our agent to learn from them. We can make a recording of an expert human player playing a game, keeping track of both the screen image and the buttons the player is pressing.
As we saw in the chapter on computer vision, deep neural networks can identify patterns from images, so we can train a network that has the screen as input and the buttons the human pressed in each frame as the targets. This is similar to how AlphaGo was pretrained in the last chapter. This was tried on a range of complex 3D games, such as Super Smash Bros and Mario Tennis. Convolutional networks were...