Following the success of DQN, many researchers have studied it and proposed extensions and modifications to improve its stability, efficiency, and performance. In this section, we will present three of these improved algorithms, explain the idea and solution behind each, and provide their implementations. The first is Double DQN, or DDQN, which addresses the overestimation problem we mentioned when discussing the DQN algorithm. The second is Dueling DQN, which decouples the Q-value function into a state value function and a state-action advantage function. The third is n-step DQN, an old idea taken from TD algorithms, which interpolates between one-step learning and MC learning by bootstrapping after n steps instead of one.
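Before going into the details, the core idea of each variation can be previewed in a few lines of NumPy. This is only a minimal sketch under stated assumptions, not the implementation developed later in the section: the function names (`double_dqn_target`, `dueling_q_values`, `n_step_return`) and their arguments are hypothetical, the Q-values are plain arrays rather than network outputs, and the mean-subtracted aggregation used for Dueling DQN is the commonly used form, assumed here for illustration.

```python
import numpy as np


def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN target (sketch): the online network selects the next action,
    the target network evaluates it. Arrays have shape (batch,) for reward/done
    and (batch, n_actions) for the Q-values. All names are hypothetical."""
    best_action = np.argmax(q_online_next, axis=1)
    evaluated = q_target_next[np.arange(len(best_action)), best_action]
    # (1 - done) removes the bootstrap term on terminal transitions.
    return reward + gamma * (1.0 - done) * evaluated


def dueling_q_values(state_value, advantages):
    """Dueling aggregation (sketch): Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
    state_value has shape (batch, 1), advantages has shape (batch, n_actions)."""
    return state_value + advantages - advantages.mean(axis=1, keepdims=True)


def n_step_return(rewards, bootstrap_value, gamma=0.99):
    """n-step return (sketch): discounted sum of n rewards plus the discounted
    value estimate of the state reached after n steps."""
    ret = bootstrap_value
    for r in reversed(rewards):
        ret = r + gamma * ret
    return ret
```

With `n = 1`, `n_step_return` reduces to the usual one-step TD target; as `n` grows toward the episode length, the bootstrap term matters less and the return approaches the MC return. In `double_dqn_target`, note that the action is chosen with one set of Q-values and scored with the other, which is what separates it from the standard DQN target that takes the maximum over the target network's own estimates.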
DQN variations
Double DQN
The over...