Improving Inference Performance with MXNet
In previous chapters, we leveraged MXNet's capabilities to solve computer vision and natural language processing tasks. In those chapters, the focus was on obtaining the maximum performance out of pre-trained models, leveraging the Model Zoo APIs from GluonCV and GluonNLP. We also trained models using different approaches, including training from scratch, transfer learning, and fine-tuning. In the previous chapter, we explored how some advanced techniques can be leveraged to optimize the training process. Finally, in this chapter, we will focus on improving the performance of the inference process itself, accelerating how we obtain results from our models, and touching on several topics related to edge AI computing.
To achieve the objective of optimizing the performance of our inference pipeline, MXNet provides several features. We have already briefly discussed some of them, such as the concept of Automatic Mixed Precision (AMP), which...