In reinforcement learning, an agent takes actions that change its state, with the aim of maximizing the reward it accumulates over time. There are four distinct concepts here: agent, state, action, and reward. Let's take a look at these in more detail:
- Agent: This is the program we train. It chooses actions over time from its action space within the environment for a specified task.
- State: This is the observation that's received by the agent from its environment and represents the agent's current situation.
- Action: This is a choice that's made by an agent from its action space. The action changes the state of the agent.
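- Reward: This is the feedback, typically a single number, that the agent receives from the environment after taking an action. It signals how good or bad that action was and so guides how the agent ought to behave.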
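To make these four concepts concrete, here is a minimal sketch of how they interact in code. It is an illustration only: the `CorridorEnv` environment, the `RandomAgent` class, the `run_episode` helper, and the reward values are all hypothetical names and numbers chosen for this example, and the agent simply picks actions at random rather than learning.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: cells 0..length-1, goal is the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        # Start every episode in the leftmost cell and return the initial state.
        self.position = 0
        return self.position

    def step(self, action):
        # The action (-1 = left, +1 = right) changes the state; the position
        # is clamped so the agent cannot leave the corridor.
        self.position = max(0, min(self.length - 1, self.position + action))
        done = self.position == self.length - 1
        # Assumed reward scheme: small penalty per move, bonus at the goal.
        reward = 1.0 if done else -0.1
        return self.position, reward, done


class RandomAgent:
    """Hypothetical agent that ignores the state and acts at random."""

    action_space = (-1, 1)

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def act(self, state):
        return self.rng.choice(self.action_space)


def run_episode(env, agent, max_steps=100):
    """Run one agent-environment loop and return the total reward collected."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = agent.act(state)                # agent chooses an action
        state, reward, done = env.step(action)   # environment returns new state and reward
        total_reward += reward
        if done:
            break
    return total_reward


if __name__ == "__main__":
    print(run_episode(CorridorEnv(), RandomAgent()))
```

A learning agent would replace the random choice in `act` with a rule that uses the observed states and rewards to prefer actions that lead to higher total reward.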
Each of these concepts is illustrated in the following diagram:

As shown in the preceding diagram, reinforcement learning involves an agent...