Training Chatbots with RL
In this chapter, we will look at another practical application of deep reinforcement learning (RL) that has become popular over the past several years: training natural language models with RL methods. This line of work started with the paper Recurrent Models of Visual Attention (https://arxiv.org/abs/1406.6247), published in 2014, and the approach has since been applied successfully to a wide variety of problems in the natural language processing (NLP) domain.
In this chapter, we will:
- Begin with a brief introduction to NLP basics, including recurrent neural networks (RNNs), word embeddings, and the sequence-to-sequence (seq2seq) model
- Discuss similarities between NLP and RL problems
- Take a look at original ideas on how to improve NLP seq2seq training using RL methods
The core of the chapter is a dialogue system trained on a dataset of movie dialogues: the Cornell Movie-Dialogs Corpus.