Summary
In this chapter, we learned about ML and graph analysis in Databricks. We started by differentiating the workspace personas for data engineering and ML. We then went through an end-to-end ML example, starting with exploratory data analysis (EDA) and ending with making predictions with the ML model. Next, we learned about MLflow through a worked example. In the later part of the chapter, we looked briefly at the basic concepts of graph analysis and worked through another hands-on tutorial.
Both ML and graph analysis help organizations build better products by solving exciting problems. The major roadblock is putting them into practice with big data, and this is where Databricks makes a real difference.
In the next chapter, we will learn how to manage Spark clusters effectively. We will look in more detail at when to use a particular kind of cluster. We will also learn about Databricks pools, spot instances, and some important components of the Spark user interface (UI).