Challenges in DQN
Everything explained in the preceding sections looks good; however, DQNs come with a few challenges of their own. Two of the main ones are as follows:
- The correlation between consecutive steps, which causes a convergence issue during training
- The non-stationary target
These challenges and their corresponding solutions are explained in the following sections.
Correlation between Steps and the Convergence Issue
As we saw in the previous exercise, during Q learning we treat the RL problem as a supervised machine learning problem: we have predictions and target values, and we use gradient descent optimization to reduce the loss and find the optimal Q function.
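To make this concrete, here is a minimal sketch of what that supervised-style objective can look like in code. This is an illustrative example, not the book's implementation: it assumes PyTorch, a `q_network` that maps a batch of states to one Q value per action, and batched tensors for the transition data.

```python
import torch
import torch.nn.functional as F

def dqn_loss(q_network, states, actions, rewards, next_states, dones, gamma=0.99):
    """Compute the squared TD error for a batch of transitions.

    states/next_states: float tensors of shape (batch, state_dim)
    actions: long tensor of shape (batch,)
    rewards/dones: float tensors of shape (batch,)
    """
    # Prediction: Q(s, a) for the actions that were actually taken
    q_values = q_network(states)                                    # (batch, n_actions)
    predictions = q_values.gather(1, actions.unsqueeze(1)).squeeze(1)

    # Target: r + gamma * max_a' Q(s', a'), with no bootstrapping on terminal states.
    # Note that the same network produces both prediction and target here, which is
    # exactly the non-stationary target issue listed above.
    with torch.no_grad():
        next_q_values = q_network(next_states).max(dim=1).values
        targets = rewards + gamma * next_q_values * (1.0 - dones)

    # Treat it as a regression problem: minimize the loss between prediction and target
    return F.mse_loss(predictions, targets)
```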
The gradient descent algorithm assumes that the training data points are independent and identically distributed (that is, i.i.d.), which is generally true in the case of traditional machine learning data. However, in the case of RL, each step the agent takes is closely correlated with the steps that came before it, so the transitions gathered along a trajectory violate this assumption and can prevent training from converging.
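To see why this matters in practice, consider that transitions collected one after another come from the same trajectory and therefore look very similar. The standard DQN remedy is experience replay: store transitions and train on randomly sampled minibatches instead of the most recent, ordered ones. The sketch below is purely illustrative (the `ReplayBuffer` class and its method names are our own, not from the text):

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores past transitions so training batches can be drawn at random
    rather than in the highly correlated order in which they were collected."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        # Each entry is one (s, a, r, s', done) transition
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling breaks the temporal correlation between steps,
        # bringing each batch closer to the i.i.d. assumption of gradient descent
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)
```

Because each minibatch is drawn uniformly from a large pool of past experience, consecutive gradient updates no longer see near-duplicate, ordered data.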