Deep Q-networks
DQN is a seminal work by Mnih et al. (2015) that made deep RL a viable approach to complex sequential control problems. The authors demonstrated that a single DQN architecture can achieve superhuman performance in many Atari games without any feature engineering, which generated great excitement about the progress of AI. Let's look into what makes DQN so effective compared to the algorithms we mentioned earlier.
Key concepts in deep Q-networks
DQN modifies online Q-learning with two important concepts that greatly stabilize learning: experience replay and the use of a target network. We describe these concepts next.
Experience replay
As mentioned earlier, simply using experience sampled sequentially from the environment leads to highly correlated gradient steps. DQN, on the other hand, stores those experience tuples, (s_t, a_t, r_{t+1}, s_{t+1}), in a replay buffer (memory), an idea that was introduced back in 1993 (Lin, 1993). During learning, the samples are drawn from this buffer uniformly at random rather than used in the order in which they were collected, which breaks the correlation between the transitions used in consecutive gradient updates.
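To make the idea concrete, here is a minimal sketch of such a buffer in Python. The class and method names, as well as the default capacity and batch size, are illustrative choices, not taken from any particular DQN implementation:

import random
from collections import deque

class ReplayBuffer:
    """A minimal experience replay buffer (illustrative sketch)."""

    def __init__(self, capacity=100_000):
        # A bounded deque evicts the oldest transitions once full.
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state, done):
        # Each experience tuple (s, a, r, s', done) is stored as-is.
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        # Uniform random sampling breaks the temporal correlation
        # between consecutive transitions collected from the environment.
        return random.sample(self.buffer, batch_size)

During training, the agent would call store after every environment step and periodically draw a minibatch with sample to compute the gradient update, so each batch mixes transitions from many different points in the agent's history.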