RL is a sub-field of machine learning where the learning is carried out by a trial-and-error approach. This differs from other machine learning strategies, such as the following:
- Supervised learning: Where the goal is to learn to fit a model distribution that captures a given labeled data distribution
- Unsupervised learning: Where the goal is to find inherent patterns in a given dataset, such as clustering
RL is a powerful learning approach, since you do not require labeled data, provided, of course, that you can master the learning-by-exploration approach used in RL.
While RL has been around for over three decades, the field has gained a new resurgence in recent years with the successful demonstration of the use of deep learning in RL to solve real-world tasks, wherein deep neural networks are used to make decisions. The coupling of RL with deep learning is typically referred to as deep RL, and is the main topic of discussion of this book.
Deep RL has been successfully applied by researchers to play video games, to drive cars autonomously, for industrial robots to pick up objects, for traders to make portfolio bets, by healthcare practitioners, and copious other examples. Recently, Google DeepMind built AlphaGo, a RL-based system that was able to play the game Go, and beat the champions of the game easily. OpenAI built another system to beat humans in the Dota video game. These examples demonstrate the real-world applications of RL. It is widely believed that this field has a very promising future, since you can train neural networks to make predictions without providing labeled data.
Now, let's delve into the formulation of the RL problem. We will compare how RL is similar in spirit to a child learning to walk.