Optimizing training for translating text from English to German
In the first recipe of this chapter, we saw how to leverage MXNet and Gluon to optimize the training of our models by applying different techniques. We learned how to combine lazy evaluation and automatic parallelization for parallel processing, and we improved the performance of our DataLoaders by combining preprocessing on the CPU and GPU. We saw how using half-precision (Float16) in combination with AMP can halve our training times, and explored how to take advantage of multiple GPUs to reduce training times further.
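As a brief reminder of why Float16 is paired with AMP rather than used on its own, the following minimal sketch uses only the Python standard library (not MXNet). It shows that a half-precision value takes half the storage of a single-precision one, and that a very small gradient underflows to zero in Float16 unless it is scaled first, which is the idea behind the loss scaling that AMP applies. The gradient value and the fixed scale factor are illustrative assumptions; AMP implementations typically adjust the scale dynamically.

```python
import struct


def to_float16(x: float) -> float:
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Storage per value: half precision uses 2 bytes, single precision uses 4.
BYTES_FP16 = struct.calcsize("<e")
BYTES_FP32 = struct.calcsize("<f")

# Hypothetical tiny gradient, chosen to fall below the Float16 range.
tiny_grad = 1e-8
# Illustrative fixed scale factor (an assumption for this sketch).
LOSS_SCALE = 1024.0

# Stored directly in Float16, the gradient underflows to 0.0.
unscaled = to_float16(tiny_grad)

# Scaled before the Float16 round trip, the gradient survives.
scaled = to_float16(tiny_grad * LOSS_SCALE)

# Unscaling happens in higher precision, recovering roughly the original value.
recovered = scaled / LOSS_SCALE

print("bytes per value (fp16, fp32):", BYTES_FP16, BYTES_FP32)
print("unscaled in fp16:", unscaled)
print("recovered after scaling:", recovered)
```

The recovered value is only approximately equal to the original gradient, because Float16 keeps far fewer significant bits than single precision.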
Now, we can revisit a problem we have been working with throughout the book: translating text from English to German. We have worked on translation tasks in recipes from previous chapters. In the Translating text from Vietnamese to English recipe from Chapter 6, we introduced the task of translating text, while also learning how to use pre-trained models from the GluonNLP Model Zoo. Furthermore...