Elements of a reinforcement learning system
RL problems feature several elements that set them apart from the ML settings we have covered so far. The following two sections outline the key features required for defining and solving an RL problem by learning a policy that automates decisions. We'll use the notation of, and generally follow, Reinforcement Learning: An Introduction (Sutton and Barto 2018) and David Silver's UCL course on RL (https://www.davidsilver.uk/teaching/), both of which are recommended for further study beyond the brief summary that the scope of this chapter permits.
RL problems aim to solve for actions that optimize the agent's objective, given some observations about the environment. The environment presents information about its state to the agent, assigns rewards for actions, and transitions the agent to new states, subject to probability distributions that the agent may or may not know. The environment may be fully or partially observable, and it may also contain...
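To make the interaction between agent and environment concrete, the following minimal sketch shows an environment that presents its state, assigns a reward, and transitions stochastically in response to an action. Everything in it is an illustrative assumption rather than part of the referenced texts: the class TwoStateEnv, the functions run_episode and always_one, the 0.8 transition probability, and the reward of 1.0 for arriving in state 1 are all hypothetical choices.

```python
import random


class TwoStateEnv:
    """Hypothetical toy environment with states {0, 1} and actions {0, 1}."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)  # private RNG so runs are reproducible
        self.state = 0

    def reset(self):
        """Return the environment to its initial state and present it."""
        self.state = 0
        return self.state

    def step(self, action):
        """Apply an action; return the next state and the reward."""
        # Assumed transition rule: the action names the intended target
        # state, which is reached with probability 0.8; otherwise the
        # environment moves to the other state.
        if self.rng.random() < 0.8:
            next_state = action
        else:
            next_state = 1 - action
        # Assumed reward rule: 1.0 for arriving in state 1, else 0.0.
        reward = 1.0 if next_state == 1 else 0.0
        self.state = next_state
        return next_state, reward


def run_episode(env, policy, n_steps=10):
    """Let a policy (a function from state to action) act for n_steps."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(n_steps):
        action = policy(state)            # agent chooses based on the observation
        state, reward = env.step(action)  # environment responds
        total_reward += reward
    return total_reward


def always_one(state):
    """A fixed policy that ignores the state and always picks action 1."""
    return 1


env = TwoStateEnv(seed=42)
total = run_episode(env, always_one, n_steps=10)
```

Here the agent does not need to know the 0.8 transition probability to act. It only observes the states and rewards that the environment returns, which mirrors the case where the underlying distributions are unknown to the agent.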