The question list is as follows:
- How does TD learning differ from the Monte Carlo method?
- What exactly is a TD error?
- What is the difference between TD prediction and TD control?
- How do you build an intelligent agent using Q learning?
- What is the difference between Q learning and SARSA?