In this chapter, we introduced reinforcement learning (RL). We started with its basic paradigms and then discussed how to formalize an RL problem as a Markov Decision Process (MDP). We covered the core solution approaches – dynamic programming (DP), Monte Carlo methods, and temporal difference (TD) learning. Then, we learned about Sarsa, Q-learning, and value function approximation using neural networks. Finally, we used OpenAI Gym to teach a simple agent to play the classic cart-pole game.
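As a quick refresher, the following is a minimal sketch of the environment interaction loop that all of these methods build on, assuming the classic gym API in which env.step() returns a (state, reward, done, info) tuple (newer gymnasium releases differ slightly); the action here is sampled at random as a placeholder for a learned policy:

import gym

env = gym.make('CartPole-v0')
state = env.reset()
total_reward = 0.0
done = False
while not done:
    # Placeholder policy: sample a random action from the action space;
    # a trained agent would choose the action from its learned Q-values
    action = env.action_space.sample()
    state, reward, done, info = env.step(action)
    total_reward += reward
env.close()
print('Episode reward:', total_reward)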
In the next chapter, we'll tackle more advanced RL problems, such as Go and Atari games, with the help of state-of-the-art RL algorithms like Monte Carlo Tree Search and deep Q-learning.