Optimizing inference when translating text from English to German
In the initial recipe, we saw how to leverage MXNet and Gluon to optimize the inference of our models by applying different techniques: improving runtime performance with hybridization; strongly reducing inference times by using half-precision (float16) in combination with Automatic Mixed Precision (AMP); and taking advantage of further optimizations with reduced-precision data types, such as Int8 quantization.
Now, we can revisit a problem we have worked on throughout the book: translating text from English to German. We have tackled translation tasks in recipes from previous chapters. In Recipe 4, Translating text from Vietnamese to English, from Chapter 6, Understanding Text with Natural Language Processing, we introduced the task of translating text and learned how to use pre-trained models from the GluonNLP Model Zoo.
Furthermore, in Recipe 4, Improving performance for translating English to German, from Chapter 7, Optimizing Models...