Training an NMT jointly with word embeddings
Here we will discuss how to train an NMT model jointly with word embeddings. We will cover two concepts in this section:
Training an NMT jointly with a word embedding layer (a brief sketch follows this list)
Using pretrained embeddings instead of randomly initializing the embedding layer
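To make the first concept concrete, here is a minimal sketch of an encoder whose embedding layer is trained jointly with the rest of the network, assuming a TensorFlow/Keras setup; the vocabulary size, embedding dimensionality, and LSTM width are illustrative values, not those used in the notebook.

import tensorflow as tf

vocab_size = 50000    # illustrative source vocabulary size
embedding_dim = 128   # illustrative embedding dimensionality

# The Embedding layer's weight matrix is an ordinary trainable variable,
# so backpropagation updates it together with the encoder weights.
encoder_inputs = tf.keras.Input(shape=(None,), dtype="int32")
embedded = tf.keras.layers.Embedding(vocab_size, embedding_dim)(encoder_inputs)
encoder_outputs, state_h, state_c = tf.keras.layers.LSTM(
    256, return_state=True)(embedded)

Because the embedding layer is trainable by default, no extra configuration is needed for joint training; its weights simply receive gradients along with everything else.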
There are several multilingual word embedding repositories available:
Facebook's fastText: https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md
CMU multilingual embeddings: http://www.cs.cmu.edu/~afm/projects/multilingual_embeddings.html
From these, we will use the CMU embeddings (~200 MB), as they are much smaller than the fastText embeddings (~5 GB). We first need to download the German (multilingual_embeddings.de) and English (multilingual_embeddings.en) embedding files. This is available as an exercise in nmt_with_pretrained_wordvecs.ipynb in the ch10 folder.
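As an illustration, the following sketch reads one of the downloaded files into a dictionary mapping each token to its vector. It assumes the common plain-text format of one token per line followed by its space-separated vector components (worth verifying against the downloaded files), and load_embeddings is a helper name introduced here, not taken from the notebook.

import numpy as np

def load_embeddings(path):
    # Assumed format: one line per token, token followed by its
    # space-separated vector components.
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split()
            if len(parts) < 2:
                continue  # skip malformed or empty lines
            embeddings[parts[0]] = np.asarray(parts[1:], dtype="float32")
    return embeddings

de_embeddings = load_embeddings("multilingual_embeddings.de")
en_embeddings = load_embeddings("multilingual_embeddings.en")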
Maximizing matches between the dataset vocabulary and the pretrained embeddings
We will first have to get a subset of the...
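One way to realize this matching step is sketched below, under stated assumptions: build_embedding_matrix is a hypothetical helper, and the fallbacks (a lowercase lookup, small random vectors for unmatched words) illustrate the general technique rather than the notebook's exact logic. It reuses the load_embeddings helper from the earlier sketch.

import numpy as np

def build_embedding_matrix(vocabulary, pretrained, dim):
    # Start from small random vectors so that out-of-vocabulary words
    # still get a usable (trainable) initialization.
    matrix = np.random.uniform(
        -0.05, 0.05, size=(len(vocabulary), dim)).astype("float32")
    matched = 0
    for index, word in enumerate(vocabulary):
        vector = pretrained.get(word)
        if vector is None:
            vector = pretrained.get(word.lower())  # case-insensitive fallback
        if vector is not None:
            matrix[index] = vector  # pretrained vectors must have length dim
            matched += 1
    print(f"Matched {matched}/{len(vocabulary)} words")
    return matrix

The resulting matrix can then seed the embedding layer while remaining trainable, for example via tf.keras.layers.Embedding(len(vocabulary), dim, embeddings_initializer=tf.keras.initializers.Constant(matrix)), so that the pretrained vectors are fine-tuned during NMT training rather than frozen.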