Preparing data for the NMT system
In this section, we describe how data is prepared both for training the NMT system and for predicting with it. First, we discuss how to prepare the training data (that is, the pairs of source and target sentences) used to train the NMT system. Then we discuss how a given source sentence is fed to the system to produce its translation.
At training time
The training data consists of pairs of source sentences and their corresponding translations in the target language. An example might look like this:
(Ich ging nach Hause, I went home)
(Sie hat in der Schule gewartet, She was waiting at school)
We have N such pairs in our dataset. To build a reasonably good translator, N needs to be on the order of millions. More training data also means longer training times.
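As a minimal sketch of what such a dataset might look like in memory, the two example pairs can be stored as a list of (source, target) tuples. The variable names here (pairs, source_sentences, target_sentences) are illustrative assumptions, not part of any particular NMT library, and a real corpus would be read from files rather than written inline:

```python
# Hypothetical in-memory representation of the parallel corpus:
# each element is a (source sentence, target sentence) pair.
pairs = [
    ("Ich ging nach Hause", "I went home"),
    ("Sie hat in der Schule gewartet", "She was waiting at school"),
]

# N is the number of sentence pairs in the dataset.
N = len(pairs)

# Split the pairs into two parallel lists, so that
# target_sentences[i] is the translation of source_sentences[i].
source_sentences, target_sentences = (list(column) for column in zip(*pairs))

print(N)
print(source_sentences[0], "->", target_sentences[0])
```

Keeping the two lists index-aligned is one simple way to make sure each source sentence stays matched with its translation throughout later processing steps.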
Next, we will introduce two special tokens: <s> and </s>. The <s> token represents the start of a sentence, whereas </s> represents...