Machine Learning Engineering with Python: Manage the lifecycle of machine learning models using MLOps with practical examples

Product type Paperback

Published in Aug 2023

Publisher Packt

ISBN-13 9781837631964

Length 462 pages

Edition 2nd Edition

Languages

Python

Tools

GitHub

Concepts

Machine Learning

Author (1):

Andrew P. McMahon

Table of Contents (12) Chapters

Preface

1. Introduction to ML Engineering

2. The Machine Learning Development Process

3. From Model to Model Factory

4. Packaging Up

5. Deployment Patterns and Tools

6. Scaling Up

7. Deep Learning, Generative AI, and LLMOps

8. Building an Example ML Microservice

9. Building an Extract, Transform, Machine Learning Use Case

10. Other Books You May Enjoy

11. Index

The Machine Learning Development Process

In this chapter, we will define how the work for any successful machine learning (ML) software engineering project can be divided up. In other words, we will answer the question of how to organize the work of an ML project so that it succeeds. We will discuss the process and workflow, set up the tools you will need for each stage, and highlight some important best practices with real ML code examples.

This edition adds more detail on an important data science and ML project management methodology: the Cross-Industry Standard Process for Data Mining (CRISP-DM). This includes a discussion of how the methodology compares to traditional Agile and Waterfall methodologies, along with some tips and tricks for applying it to your ML projects. There are also far more detailed examples to help you get up and running with continuous integration/continuous deployment (CI/CD) using GitHub Actions, including how to run ML-focused processes such as automated model validation. The advice on getting up and running in an Integrated Development Environment (IDE) has also been made more tool-agnostic, so that it applies whichever appropriate IDE you use.

As before, the chapter will focus heavily on a “four-step” methodology I propose that encompasses a discover, play, develop, deploy workflow for your ML projects. This project workflow will be compared with the CRISP-DM methodology, which is very popular in data science circles. We will also discuss the appropriate development tooling and its configuration and integration for a successful project, cover version control strategies and their basic implementation, and set up CI/CD for your ML project. Then, we will introduce some potential execution environments as the target destinations for your ML solutions. By the end of this chapter, you will be set up for success in your Python ML engineering project. This is the foundation on which we will build everything in subsequent chapters.

As usual, we will conclude the chapter by summarizing the main points and highlighting what this means as we work through the rest of the book.

Finally, although we will frame the discussion here in terms of ML challenges, most of what you will learn in this chapter also applies to other Python software engineering projects. My hope is that the investment in building out these foundational concepts in detail will be something you can leverage again and again in all of your work.

We will explore all of this in the following sections and subsections: