Chapter 3: Learning about Machine Learning and Graph Processing in Databricks
Databricks is well suited to taking data science projects into production. It provides specialized runtimes for machine learning (ML) and integrates with MLflow, an open source project that helps manage the end-to-end (E2E) ML life cycle. Databricks includes a managed version of MLflow as part of its platform.
Databricks also supports graph processing through GraphFrames, a Spark package that exposes graph analysis through DataFrames. This chapter looks at examples of both ML and graph processing.
The following topics are covered in this chapter:
- Learning about ML components in Databricks
- Practicing ML in Databricks
- Learning about MLflow
- Learning about graph analysis in Databricks