Summary
In this chapter, we explored the topic of neural machine translation and trained a network to produce English-to-German translations.
We started with an introduction to machine translation, covering its history from rule-based systems to neural machine translation. Next, we introduced encoder-decoder RNN architectures, which underlie neural machine translation. More generally, encoder-decoder architectures can be applied to any sequence-to-sequence prediction task, including question-answering systems.
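The encoder-decoder idea can be sketched in a few lines: the encoder folds a variable-length input sequence into one fixed-size state vector, and the decoder generates the output sequence one token at a time from that state. The sketch below is a hypothetical toy (a plain RNN cell with random, untrained weights instead of the chapter's LSTM; the vocabulary and hidden sizes are assumptions), meant only to show the data flow, not a working translator.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, HIDDEN = 10, 8   # toy sizes, chosen for illustration only

# Random, untrained weights: this demonstrates data flow, not a trained model.
Wxh = rng.normal(scale=0.1, size=(HIDDEN, VOCAB))
Whh = rng.normal(scale=0.1, size=(HIDDEN, HIDDEN))
Why = rng.normal(scale=0.1, size=(VOCAB, HIDDEN))

def one_hot(i):
    v = np.zeros(VOCAB)
    v[i] = 1.0
    return v

def rnn_step(h, x):
    # Simple RNN update; an LSTM adds gates but plays the same role.
    return np.tanh(Wxh @ x + Whh @ h)

def encode(src_ids):
    # The encoder folds the whole source sequence into one fixed-size state.
    h = np.zeros(HIDDEN)
    for i in src_ids:
        h = rnn_step(h, one_hot(i))
    return h

def decode(h, sos_id=0, eos_id=1, max_len=5):
    # The decoder starts from the encoder state and emits one token per step,
    # feeding its own previous prediction back in (greedy decoding).
    out, prev = [], sos_id
    for _ in range(max_len):
        h = rnn_step(h, one_hot(prev))
        prev = int(np.argmax(Why @ h))
        if prev == eos_id:
            break
        out.append(prev)
    return out

state = encode([3, 4, 5])
translation = decode(state)   # untrained weights, so the output is arbitrary
```

The same two-block structure carries over to any sequence-to-sequence task: only the tokenization and the training data change.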
After that, we covered all the steps needed to train and apply a neural machine translation model at the character level, using a simple network with a single LSTM layer in both the encoder and the decoder. The joint network, formed by combining the encoder and the decoder, was trained with the teacher forcing paradigm.
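Teacher forcing boils down to how the training pairs are prepared: at each step the decoder is fed the ground-truth previous character rather than its own prediction, which amounts to shifting the target sequence right by one position. A minimal sketch, assuming tab and newline as the start- and end-of-sequence markers (a common convention, not necessarily the chapter's exact choice):

```python
def make_teacher_forcing_pair(target, sos="\t", eos="\n"):
    """Build (decoder_input, decoder_target) from one target string.

    The decoder input is the target shifted right by one step, so during
    training each step sees the ground-truth previous character, while the
    decoder target is the same string ending in the end-of-sequence marker.
    """
    decoder_input = sos + target    # starts with the start-of-sequence token
    decoder_target = target + eos   # ends with the end-of-sequence token
    return decoder_input, decoder_target

pair = make_teacher_forcing_pair("Hallo")
# pair[0] is "\tHallo" and pair[1] is "Hallo\n": position i of the input
# is the ground-truth character preceding position i of the target.
```

At inference time no ground truth exists, which is why the decoder must instead be rewired to consume its own previous prediction.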
At the end of the training phase and before deployment, a lambda layer was inserted in the decoder part to...