Inference with NMT
Inference differs slightly from training for NMT (Figure 9.17). Because we do not have a target sentence at inference time, we need a way to trigger the decoder at the end of the encoding phase. This is not difficult, as we have already laid the groundwork in our data: we simply kick off the decoder by using <s> as its first input. Then we call the decoder repeatedly, using the word predicted at each timestep as the input for the next timestep. We continue this way until the model:
- Outputs </s> as the predicted token, or
- Reaches a pre-defined sentence length
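The decoding loop described above can be sketched in plain Python. This is a minimal illustration, not the chapter's actual model code: `greedy_decode`, `decoder_step`, and `toy_decoder_step` are hypothetical names, and the toy decoder simply replays a fixed word list in place of a trained network. In a real setup, `decoder_step` would wrap a single-timestep call to the decoder, and `init_state` would be the encoder's final state.

```python
from typing import Any, Callable, List, Tuple

SOS, EOS = "<s>", "</s>"

# Hypothetical single-step decoder signature:
# (previous token, state) -> (predicted token, new state)
DecoderStep = Callable[[str, Any], Tuple[str, Any]]


def greedy_decode(decoder_step: DecoderStep, init_state: Any, max_len: int = 20) -> List[str]:
    """Feed each predicted token back into the decoder until </s> or max_len."""
    tokens: List[str] = []
    prev, state = SOS, init_state  # kick off the decoder with <s>
    for _ in range(max_len):  # stop condition 2: pre-defined sentence length
        prev, state = decoder_step(prev, state)
        if prev == EOS:  # stop condition 1: model predicts </s>
            break
        tokens.append(prev)
    return tokens


def toy_decoder_step(prev_token: str, state: int) -> Tuple[str, int]:
    """Stand-in for a trained decoder (assumption): replays a fixed sequence.

    Here the state is just a position counter; a real decoder would carry
    its recurrent state instead.
    """
    words = ["hello", "world", EOS]
    token = words[state] if state < len(words) else EOS
    return token, state + 1


if __name__ == "__main__":
    print(greedy_decode(toy_decoder_step, init_state=0))
    print(greedy_decode(toy_decoder_step, init_state=0, max_len=1))
```

The `max_len` cap matters in practice because a poorly trained model may never emit </s>, and without the cap the loop would not terminate.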
To decode this way, we have to define a new model that reuses the existing weights of the trained model. This is necessary because the trained model is designed to consume the whole sequence of decoder inputs at once, whereas inference requires calling the decoder one timestep at a time and feeding each prediction back in as the next input. Here’s how we can define the inference model:
- Define an encoder model that outputs...