From tabular Q-learning to deep Q-learning
When we covered the tabular Q-learning method in Chapter 5, Solving the Reinforcement Learning Problem, it should have been obvious that we cannot really extend those methods to most real-life scenarios. Think about an RL problem that uses images as input. An image with three 8-bit color channels allows 256^3 values per pixel, so an H×W image can take 256^(3×H×W) distinct configurations, a number your calculator won't even be able to display. For this very reason, we need to use function approximators to represent the value function. Given their success in supervised and unsupervised learning, deep neural networks emerge as the clear choice here. On the other hand, as we mentioned in the introduction, the convergence guarantees of tabular Q-learning fall apart once function approximators come in. This section introduces two deep Q-learning algorithms, Neural Fitted Q-iteration and online Q-learning, and then discusses what does not go so well with them. With that, we set the...
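To get a feel for the size of this state space, the sketch below counts the decimal digits of the number of distinct images via logarithms (the number itself is far too large to materialize). The 84×84 resolution is only an illustrative assumption, a size commonly used when preprocessing Atari frames, not a figure taken from the text:

```python
import math

# Hypothetical example size: an 84x84 RGB image, 8 bits per channel.
height, width, channels = 84, 84, 3
bits_per_pixel = 8 * channels  # 24 bits -> 256^3 values per pixel

# Number of distinct images is 2^(24 * H * W). Count its decimal
# digits with log10, since the number itself is astronomically large.
total_bits = bits_per_pixel * height * width
num_digits = math.floor(total_bits * math.log10(2)) + 1
print(num_digits)  # a number with tens of thousands of digits
```

A Q-table would need one row per such image, which is why tabular methods are hopeless here.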
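What "using a function approximator to represent the value function" means in practice is replacing the Q-table lookup with a parameterized mapping from observation to one Q-value per action. The following is a minimal sketch of that idea with a randomly initialized one-hidden-layer network in NumPy; the dimensions and the network shape are illustrative assumptions, not the architecture used later in the chapter:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: a flattened 84x84 RGB observation, 4 actions.
obs_dim = 84 * 84 * 3
n_actions = 4
hidden = 64

# Randomly initialized weights stand in for learned parameters.
W1 = rng.normal(0.0, 0.01, (obs_dim, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(0.0, 0.01, (hidden, n_actions))
b2 = np.zeros(n_actions)

def q_values(obs):
    """Map a flattened observation to a vector of Q-values, one per action."""
    h = np.maximum(obs @ W1 + b1, 0.0)  # ReLU hidden layer
    return h @ W2 + b2

obs = rng.random(obs_dim)          # stand-in for a normalized image
q = q_values(obs)                  # shape: (n_actions,)
greedy_action = int(np.argmax(q))  # greedy policy: argmax_a Q(s, a)
```

Instead of storing one entry per state, the (W1, b1, W2, b2) parameters are trained so that similar images map to similar Q-values, which is exactly where the tabular convergence guarantees no longer apply.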