Further reading
For more information, refer to the following papers:
- A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning by Stephane Ross, Geoffrey J. Gordon, J. Andrew Bagnell, https://arxiv.org/pdf/1011.0686.pdf
- Deep Q-learning from Demonstrations by Todd Hester et al., https://arxiv.org/pdf/1704.03732.pdf
- Maximum Entropy Inverse Reinforcement Learning by Brian D. Ziebart, Andrew Maas, J. Andrew Bagnell, Anind K. Dey, https://www.aaai.org/Papers/AAAI/2008/AAAI08-227.pdf
- Generative Adversarial Imitation Learning by Jonathan Ho, Stefano Ermon, https://arxiv.org/pdf/1606.03476.pdf