Reinforcement learning (RL) is an area of machine learning concerned with how software agents should choose actions in a given state of an environment so as to maximize cumulative reward.
To understand how RL helps, let's consider a simple scenario. Imagine that you are playing chess against a computer (in our case, the computer is an agent that has learned, or is still learning, how to play chess). The setup (rules) of the game constitutes the environment. Each time a player makes a move (takes an action), the state of the board (the location of the various pieces on the chessboard) changes. At the end of the game, the agent receives a reward that depends on the result. The objective of the agent is to maximize that reward.
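The interaction described above (the agent takes an action, the state changes, and a reward arrives at the end of the game) can be sketched in a few lines of Python. This is only an illustrative sketch, not a chess engine: `TinyGame`, `play_episode`, `random_policy`, and `always_right` are hypothetical names invented for this example, and the game is a toy stand-in in which the agent wins by reaching position 3 within 6 moves.

```python
import random


class TinyGame:
    """Hypothetical toy environment: reach position 3 within 6 moves."""

    def __init__(self):
        self.position = 0  # the state of the environment
        self.moves = 0

    def step(self, action):
        # An action is -1 (step left) or +1 (step right).
        self.position += action
        self.moves += 1
        won = self.position == 3
        done = won or self.moves >= 6
        # As in the chess example, the reward depends on the final result:
        # 1 for a win, 0 for every other move.
        reward = 1 if won else 0
        return self.position, reward, done


def play_episode(policy, rng):
    """Play one game and return the cumulative reward the agent collected."""
    env = TinyGame()
    state, total_reward, done = env.position, 0, False
    while not done:
        action = policy(state, rng)             # the agent chooses an action
        state, reward, done = env.step(action)  # the environment responds
        total_reward += reward
    return total_reward


def random_policy(state, rng):
    # An agent that has not learned anything yet: it moves at random.
    return rng.choice([-1, 1])


def always_right(state, rng):
    # An agent that already "knows" the winning strategy for this toy game.
    return 1


rng = random.Random(0)
print(play_episode(always_right, rng))  # prints 1
```

Here the policy is fixed by hand. In reinforcement learning, the agent would instead adjust its policy from the rewards it observes over many such episodes.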
If the machine (agent1) is playing against a human, the number of games that it can play is limited by the number of games the human can play. This can become a bottleneck that keeps the agent from learning well. However, what...