We are going to build a neural machine translation system that learns to translate short English sentences into French. To do this, we will use the English-to-French text corpus (fra-eng/fra.txt) available at http://www.manythings.org/anki/.
Implementing a sequence-to-sequence neural translation machine
Processing the input data
Text data cannot be fed directly into a neural network, since neural networks operate only on numbers. We will therefore represent each word as a one-hot encoded vector whose length equals the number of distinct words (the vocabulary size) in the corresponding corpus. If the English corpus contains 1,000 distinct words, each one-hot encoded vector $v_e$ has dimension 1,000, that is, $v_e \in \mathbb{R}^{1000 \times 1}$.
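To make the one-hot representation concrete, here is a minimal sketch in plain Python. The helper names `build_vocab` and `one_hot` and the three-sentence toy corpus are assumptions made for illustration only; they are not taken from the corpus or from any library. Lowercasing and splitting on whitespace is likewise a simplifying choice.

```python
def build_vocab(sentences):
    """Map each distinct word to an integer index (words sorted alphabetically)."""
    words = sorted({word for sentence in sentences for word in sentence.lower().split()})
    return {word: index for index, word in enumerate(words)}


def one_hot(word, vocab):
    """Return a list of len(vocab) zeros with a single 1 at the word's index."""
    vector = [0] * len(vocab)
    vector[vocab[word]] = 1
    return vector


# Hypothetical toy corpus standing in for the English side of fra-eng/fra.txt.
english_sentences = ["I am cold", "I am happy", "You are happy"]

vocab = build_vocab(english_sentences)  # 6 distinct words
vector = one_hot("happy", vocab)        # length-6 vector with exactly one 1
```

With a vocabulary of 1,000 distinct words, the same function would return vectors of length 1,000, matching the dimension described above.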
We will read through...