In this chapter, we discussed how an agent interacts with an environment: the agent takes an action based on the observation it receives from the environment, and the environment responds to that action with an (optional) reward and the next observation.
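This interaction loop can be sketched in a few lines of plain Python. The toy environment and the trivial always-move-right agent below are hypothetical, made up purely to illustrate the observation → action → reward cycle, not code from this book:

```python
class ToyEnv:
    """A hypothetical 1-D environment: the agent starts at position 0
    and receives a reward of 1.0 when it reaches position 3."""

    def reset(self):
        self.position = 0
        return self.position  # the initial observation

    def step(self, action):
        # The environment responds to the agent's action with a
        # reward and the next observation.
        self.position += action
        done = self.position >= 3
        reward = 1.0 if done else 0.0
        return self.position, reward, done


class FixedAgent:
    """A stand-in agent with a trivial fixed policy: always move right.
    A real agent would choose its action based on the observation."""

    def act(self, observation):
        return +1


env, agent = ToyEnv(), FixedAgent()
obs = env.reset()
done = False
while not done:
    action = agent.act(obs)               # agent acts on its observation
    obs, reward, done = env.step(action)  # environment responds
```

After three steps the agent reaches position 3 and the episode ends with a reward of 1.0; learning algorithms in later chapters replace the fixed policy with one improved from experience.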
With a solid understanding of the foundations of reinforcement learning, we went deeper into what deep reinforcement learning is, and saw that we can use deep neural networks to represent value functions and policies. Although this chapter was a little heavy on notation and definitions, it hopefully laid a strong foundation for us to develop some cool agents in the upcoming chapters. In the next chapter, we will consolidate what we learned in the first two chapters and put it to use by laying the groundwork to train an agent to solve some interesting problems.
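To make the idea of a neural network representing a value function concrete, here is a minimal NumPy sketch (not the book's implementation; the layer sizes and initialization are illustrative assumptions). A small network maps an observation vector to one estimated value per action, and a greedy policy can be derived from it by picking the highest-valued action:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 4-dimensional observations, 2 discrete actions.
obs_dim, n_actions, hidden = 4, 2, 16
W1 = rng.normal(scale=0.1, size=(obs_dim, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(scale=0.1, size=(hidden, n_actions))
b2 = np.zeros(n_actions)

def q_values(observation):
    """Forward pass: observation -> one estimated value per action."""
    h = np.tanh(observation @ W1 + b1)  # hidden layer
    return h @ W2 + b2

def greedy_policy(observation):
    """A policy derived from the value network: act greedily."""
    return int(np.argmax(q_values(observation)))

obs = np.array([0.1, -0.2, 0.3, 0.0])
action = greedy_policy(obs)
```

In practice, the weights would be trained from experience rather than left at random initialization; the point here is only that the function from observations to values (or actions) is represented by the network's parameters.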