The anatomy of the agent
As we saw in the previous chapter, there are several entities in RL's view of the world:
- Agent: A person or a thing that takes an active role. In practice, it is a piece of code that implements some policy. This policy decides which action to take at every time step, given the observations received so far.
- Environment: A model of the world that is external to the agent. It is responsible for providing the agent with observations and rewards, and it changes its state based on the agent's actions.
Let's show how both of them can be implemented in Python for a deliberately simple situation. We will define an environment that gives the agent random rewards for a limited number of steps, regardless of the agent's actions. This scenario is not useful in practice, but it lets us focus on the specific methods of the environment and agent classes. Let's start with the environment:
class Environment:
    def __init__(self):
        self.steps_left...
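The listing above is cut off, so the following is only a minimal sketch of how such an environment and agent could fit together, not the original code. Apart from `steps_left`, every name here is an assumption made for illustration: the methods `get_observation`, `get_actions`, `is_done`, and `action`, the `max_steps` parameter, and the `run_episode` helper. The environment hands out a random reward for a fixed number of steps and ignores the chosen action, as described in the text.

```python
import random
from typing import List, Optional


class Environment:
    """Toy environment: random rewards for a limited number of steps."""

    def __init__(self, max_steps: int = 10):
        # max_steps is a hypothetical parameter; the step limit is an assumption.
        self.steps_left = max_steps

    def get_observation(self) -> List[float]:
        # The observation carries no information in this toy setup.
        return [0.0, 0.0, 0.0]

    def get_actions(self) -> List[int]:
        # Two dummy actions; the environment ignores which one is picked.
        return [0, 1]

    def is_done(self) -> bool:
        return self.steps_left == 0

    def action(self, action: int) -> float:
        # Consume one step and return a random reward in [0, 1).
        if self.is_done():
            raise RuntimeError("Episode is over")
        self.steps_left -= 1
        return random.random()


class Agent:
    """Agent with a trivial policy: pick a random available action."""

    def __init__(self):
        self.total_reward = 0.0

    def step(self, env: Environment) -> None:
        env.get_observation()  # observed, but unused by this random policy
        chosen = random.choice(env.get_actions())
        self.total_reward += env.action(chosen)


def run_episode(max_steps: int = 10, seed: Optional[int] = None) -> float:
    """Run one episode to completion and return the accumulated reward."""
    if seed is not None:
        random.seed(seed)
    env = Environment(max_steps)
    agent = Agent()
    while not env.is_done():
        agent.step(env)
    return agent.total_reward


if __name__ == "__main__":
    print("Total reward: %.4f" % run_episode())
```

The split mirrors the two entities listed earlier: the environment owns the state (`steps_left`) and produces observations and rewards, while the agent owns the policy and the running total of reward. Because each reward lies in [0, 1), the total for an episode of `max_steps` steps is always below `max_steps`.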