What is MLflow?
Implementing a product based on ML can be a laborious task. There is a general need to reduce the friction between different steps of the ML development life cycle, and between teams of data scientists and engineers that are involved in the process.
ML practitioners, such as data scientists and ML engineers, operate with different systems, standards, and tools. Data scientists spend most of their time developing models in tools such as Jupyter Notebooks, whereas in production a model is deployed as part of a software application, in an environment that is more demanding in terms of scale and reliability.
A common occurrence in ML projects is for an engineering team to reimplement the models, creating a custom-made system to serve each specific model. Teams that follow such bespoke approaches to model development tend to face a common set of challenges:
- ML projects that run over budget because bespoke software infrastructure has to be built to develop and serve models
- Translation errors when reimplementing the models produced by data scientists
- Scalability issues when serving predictions
- Friction when data scientists try to reproduce each other's training processes, due to a lack of standard environments
Companies leveraging ML tend to create their own internal systems, which are often extremely laborious to build, in order to ensure a smooth and structured process of ML development. Widely documented ML platforms include systems such as Michelangelo and FBLearner, from Uber and Facebook, respectively.
It is in this context of increasing ML adoption that MLflow was created at Databricks and then open sourced as a platform to aid in the implementation of ML systems.
MLflow enables an everyday practitioner to manage the ML life cycle in one platform, from iteration on model development up to deployment in a reliable and scalable environment that is compatible with modern software system requirements.