What this book covers
Chapter 1, The ML Process and Its Challenges, provides an overview of the various data science use cases across different domains. It outlines the different stages and roles involved in an ML project, from data engineering to analysis, feature engineering, and ML model training and deployment.
Chapter 2, Overview of ML on Databricks, guides you through the process of registering for a Databricks trial account and explores the workspace features designed specifically for ML practitioners.
Chapter 3, Utilizing the Feature Store, introduces you to the concept of a feature store. We will guide you through creating feature tables using Databricks’ offline feature store and demonstrate how to use them effectively. We’ll also discuss the advantages of using a feature store in your machine learning workflows.
Chapter 4, Understanding MLflow Components on Databricks, helps you understand what MLflow is, its components, and the benefits of using them. We will also walk through how to register a model with the MLflow Model Registry.
Chapter 5, Create a Baseline Model Using Databricks AutoML, covers what AutoML is, why it is important, and Databricks’ approach to AutoML. We will also create a baseline model with AutoML.
Chapter 6, Model Versioning and Webhooks, teaches you how to use the MLflow Model Registry to manage model versions, transition models from other stages to production, and use webhooks to set up alerts and monitoring.
Chapter 7, Model Deployment Approaches, covers the different options for deploying an ML model utilizing the Databricks platform.
Chapter 8, Automating ML Workflows Using Databricks Jobs, explains what Databricks jobs are and how they can be used as powerful tools to automate ML workflows. We will go over how to set up an ML training workflow using the Jobs API.
Chapter 9, Model Drift Detection and Retraining, teaches you how to detect and protect against model drift in production environments.
Chapter 10, Using CI/CD to Automate Model Retraining and Redeployment, demonstrates how to set up your Databricks ML development and deployment as a CI/CD pipeline. We will draw on all the concepts covered earlier in this book.