In this section, we'll talk about the main paradigms of RL. We first mentioned some of them in Chapter 1, Machine Learning: an Introduction, but we'll revisit them here as a refresher and for the sake of completeness. To illustrate them, we'll use a maze game as an example. The maze is represented by a rectangular grid, where cells with a value of 0 are walls and cells with a value of 1 are paths. Some locations contain intermediate rewards. An agent in the maze can move between locations along the paths. Its objective is to navigate to the other end of the maze while collecting the largest possible reward along the way. The following is a diagram describing the basic principles of how RL works:
Here are some elements of an RL system:
- Agent: The entity for which we are...