Understanding MLflow Components on Databricks
In the previous chapter, we learned about Feature Store: the problem it solves and how Databricks provides it as a built-in part of the Databricks machine learning (ML) workspace, where we can register our feature tables.
In this chapter, we will look at how to manage model training, tracking, and experimentation. In software engineering, code development and productionization follow established best practices; however, these practices are not yet widely adopted in ML engineering and data science. While working with many Databricks customers, I observed that each data science team has its own way of managing its projects. This is where MLflow comes in.
MLflow is an umbrella project developed by Databricks engineers to bring standardized ML life cycle management to the Databricks platform. It is now an open source project that averages more than 500,000 downloads per day...