The question list is as follows:
- What is DQN?
- What is the need for experience replay?
- Why do we keep a separate target network?
- Why does DQN overestimate Q values?
- How does double DQN avoid overestimating the Q value?
- How are experiences prioritized in prioritized experience replay?
- What is the need for the dueling architecture?