You're reading from Practical Machine Learning on Databricks: Seamlessly transition ML models and MLOps on Databricks

Product type Paperback

Published in Nov 2023

Publisher Packt

ISBN-13 9781801812030

Length 244 pages

Edition 1st Edition

Languages Python

Tools MLOps

Concepts Data Science

Author Debu Sinha

Discovering the roles associated with machine learning projects in organizations

Typically, three types of personas are involved in developing an ML solution in an organization:

Data engineers: Data engineers build data pipelines that take structured, semi-structured, and unstructured data from source systems and ingest it into a data lake. Once the raw data lands in the data lake, they are also responsible for storing it securely and for keeping it reliable, clean, and easy for users across the organization to discover and use.
Data scientists: Data scientists collaborate with subject matter experts (SMEs) to understand and address business problems, ensuring a solid business justification for projects. They take clean data from the data lake and perform feature engineering, selecting and transforming the relevant features. They then train multiple ML models with different sets of hyperparameters and evaluate them on test sets to identify the best-performing model. Throughout this process, they work with SMEs to validate the models against business requirements, objectives, and key performance indicators (KPIs). This iterative approach helps data scientists select a model that solves the problem and meets the specified KPIs.
Machine learning engineers: ML engineering teams deploy the models created by data scientists into production environments. Procedures, governance, and access control should be established early on, including which environments and data each data scientist can access. ML engineers also implement monitoring systems to track model performance and data drift, enforce governance practices, track model lineage, and ensure access control for data security and compliance throughout the ML life cycle.
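
The model-selection step described for data scientists (training several models with different hyperparameters and comparing them on held-out data) can be sketched roughly as follows. This is a minimal illustration and not the book's code: the synthetic dataset, the k-nearest-neighbors model, and the names make_data, knn_predict, accuracy, and select_best_k are all assumptions chosen to keep the example self-contained.

```python
import random


def make_data(n, seed):
    """Hypothetical synthetic data: label is 1 when x + y > 1, with 10% label noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x, y = rng.random(), rng.random()
        label = int(x + y > 1.0)
        if rng.random() < 0.1:  # flip some labels to simulate noisy data
            label = 1 - label
        rows.append(((x, y), label))
    return rows


def knn_predict(train, point, k):
    """Predict a label by majority vote among the k nearest training points."""
    def squared_distance(row):
        (x, y), _ = row
        return (x - point[0]) ** 2 + (y - point[1]) ** 2

    nearest = sorted(train, key=squared_distance)[:k]
    votes_for_one = sum(label for _, label in nearest)
    return int(votes_for_one * 2 > k)


def accuracy(train, held_out, k):
    """Fraction of held-out points classified correctly for a given k."""
    correct = sum(knn_predict(train, point, k) == label for point, label in held_out)
    return correct / len(held_out)


def select_best_k(train, held_out, candidates):
    """Score every candidate hyperparameter value and return the best one."""
    scores = {k: accuracy(train, held_out, k) for k in candidates}
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Assumed setup: separate seeds give independent training and held-out sets.
train = make_data(200, seed=0)
held_out = make_data(100, seed=1)

# Odd values of k avoid tied votes in a two-class problem.
best_k, scores = select_best_k(train, held_out, candidates=[1, 5, 15])
print("accuracy per k:", scores)
print("selected k:", best_k)
```

In a real project, the chosen model would still be checked against the business KPIs with the SMEs, since the best held-out score alone does not establish that the model meets the requirements.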

A typical ML project life cycle moves from data engineering to data science and, lastly, to production deployment by the ML engineering team. The process is iterative.

Now, let’s take a look at the various challenges involved in productionizing ML models.
