RL is a goal-oriented approach to decision-making. It differs from other paradigms due to its direct interaction with the environment and for its delayed reward mechanism. The combination of RL and deep learning is very useful in problems with high-dimensional state spaces and in problems with perceptual inputs. The concepts of policy and value functions are key as they give an indication about the action to take and the quality of the states of the environment. In RL, the model of the environment is not required, but it can give additional information and, therefore, improve the quality of the policy.
Now that all the key concepts have been introduced, in the following chapters, the focus will be on actual RL algorithms. But first, in the next chapter, you will be given the grounding to develop RL algorithms using OpenAI and TensorFlow.