Summary
Many researchers believe that RL is the best shot we have at creating artificial general intelligence. It is an exciting field with many unsolved challenges and huge potential. Although it can appear daunting at first, getting started in RL is not so difficult. In this chapter, we have described some basic principles of RL.
The main topic we have discussed is the Q-learning algorithm. Its distinctive feature is its ability to trade off immediate rewards against delayed rewards, which it does by learning the expected discounted return of each state-action pair. In its simplest form, Q-learning stores these values in a table, with one entry per state-action pair. This approach very quickly becomes infeasible as the state/action space of the system being monitored or controlled grows.
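As a reminder of how the tabular version works, here is a minimal sketch of the Q-learning update rule. The environment sizes, learning rate, and discount factor below are illustrative values, not taken from the chapter:

    import numpy as np

    # Illustrative sizes for a small, discrete environment
    n_states, n_actions = 16, 4
    alpha, gamma = 0.1, 0.99  # learning rate and discount factor

    Q = np.zeros((n_states, n_actions))  # one Q-value per state-action pair

    def q_update(state, action, reward, next_state):
        # Bellman update: move Q(s, a) toward the immediate reward plus
        # the discounted value of the best action in the next state
        target = reward + gamma * np.max(Q[next_state])
        Q[state, action] += alpha * (target - Q[state, action])

The gamma term is what lets the algorithm weigh delayed rewards: with gamma close to 1, future rewards count almost as much as immediate ones.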
We can overcome this problem by using a neural network as a function approximator that takes the state and action as input and outputs the corresponding Q-value.
Following this idea, we implemented a Q-learning neural network using the TensorFlow framework and the OpenAI Gym toolkit to win at the FrozenLake game.
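The following is a minimal sketch of that idea, not the chapter's exact implementation: it uses the common variant in which the network takes only the state (one-hot encoded) and outputs one Q-value per action. It assumes the Keras interface of TensorFlow 2 and the classic Gym API (four-value step return; the environment name may be 'FrozenLake-v1' in newer Gym releases), and it omits exploration for brevity:

    import numpy as np
    import gym
    import tensorflow as tf

    env = gym.make('FrozenLake-v0')
    n_states = env.observation_space.n
    n_actions = env.action_space.n

    # A single dense layer mapping a one-hot state encoding to one Q-value per action
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(n_actions, input_shape=(n_states,), use_bias=False)
    ])
    model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.1), loss='mse')

    def one_hot(state):
        return np.identity(n_states)[state:state + 1]

    gamma = 0.99
    state = env.reset()
    for _ in range(100):  # a few interaction steps, not a full training run
        q_values = model.predict(one_hot(state), verbose=0)
        action = int(np.argmax(q_values))
        next_state, reward, done, info = env.step(action)
        # Q-learning target: observed reward plus the discounted best
        # Q-value of the next state, for the action actually taken
        target = q_values.copy()
        target[0, action] = reward + gamma * np.max(
            model.predict(one_hot(next_state), verbose=0))
        model.fit(one_hot(state), target, epochs=1, verbose=0)
        state = env.reset() if done else next_state

Note that the network replaces the table entirely: instead of looking up Q[state, action], we run a forward pass, and instead of assigning to a table cell, we take a gradient step toward the Bellman target.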
In the...