Optimizing Models with Transfer Learning and Fine-Tuning
As models grow in size (in depth and in the number of processing modules per layer), training them becomes increasingly expensive: each epoch takes longer, and typically more epochs are required to reach optimal performance.
For this reason, MXNet provides state-of-the-art pre-trained models via the GluonCV and GluonNLP libraries. As we have seen in previous chapters, these models can help us solve a variety of problems when our final dataset is similar to the one the selected model was pre-trained on.
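For instance, retrieving one of these pre-trained models takes a single call to the GluonCV model zoo. Here is a minimal sketch (the choice of resnet50_v1 is an illustrative assumption, not a recommendation from this chapter):

```python
from gluoncv import model_zoo

# Download (if needed) and load a ResNet-50 pre-trained on ImageNet
net = model_zoo.get_model('resnet50_v1', pretrained=True)
print(net)  # Inspect the layers of the pre-trained network
```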
However, sometimes this is not enough: our final dataset may have nuances that the pre-trained model does not capture. In these cases, it is ideal to combine the stored knowledge of the pre-trained model with our final dataset. This is called transfer learning: the knowledge of the pre-trained model is transferred to a new task (our final dataset).
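As a minimal sketch of this idea, the following code loads a pre-trained GluonCV classifier, replaces its output layer to match a new task, and sets up a trainer for fine-tuning. The 10-class output and the learning rate are assumptions made for the example, not values prescribed by this chapter:

```python
import mxnet as mx
from mxnet import gluon
from gluoncv import model_zoo

ctx = mx.cpu()  # use mx.gpu() if one is available

# Load a network pre-trained on ImageNet; its feature extractor
# holds the knowledge we want to transfer to the new task.
net = model_zoo.get_model('resnet50_v1', pretrained=True, ctx=ctx)

# Replace the classifier head so it matches our final dataset
# (10 classes here is an assumption for the example).
with net.name_scope():
    net.output = gluon.nn.Dense(10)
net.output.initialize(mx.init.Xavier(), ctx=ctx)

# Fine-tune: all parameters are updated, but a small learning rate
# keeps the transferred features largely intact.
trainer = gluon.Trainer(net.collect_params(), 'sgd',
                        {'learning_rate': 0.001})
```

If instead we want to keep the transferred features fixed and train only the new output layer, we can freeze the feature extractor by setting grad_req to 'null' on its parameters before training.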
In this chapter, we will learn how to use GluonCV and GluonNLP pre-trained models, adapting them to new tasks with transfer learning and fine-tuning.