Summary
The development process of machine learning models can be a complicated task because of the inherent mixed background of the discipline and the fact that it is commonly detached from the common software development lifecycle. Moreover, we will encounter issues when transitioning the models from development to production if we are not able to export the used preprocessing pipeline that was used to extract features of the data.
As we have seen in this chapter, we can tackle issues using MLflow to manage the model lifecycle and apply staging and version control to the models used, and effectively serialize the preprocessing pipeline to be used to preprocess data to be inferred.
In the next chapter, we will explore the concept of distributed learning, a technique in which we can distribute the training process of deep learning models to many workers effectively in Azure Databricks.