Machine Learning Engineering  with Python

Machine Learning Engineering with Python: Manage the lifecycle of machine learning models using MLOps with practical examples, Second Edition

Andrew P. McMahon
Mex$737.99 Mex$820.99
4.6 (38 Ratings)
eBook Aug 2023 462 pages 2nd Edition
eBook
Mex$737.99 Mex$820.99
Paperback
Mex$1025.99
Subscription
Free Trial

What do you get with eBook?

  • Instant access to your Digital eBook purchase
  • Download this book in EPUB and PDF formats
  • Access this title in our online reader with advanced features
  • DRM FREE - Read whenever, wherever and however you want
  • AI Assistant (beta) to help accelerate your learning

Machine Learning Engineering with Python

The Machine Learning Development Process

In this chapter, we will define how the work for any successful machine learning (ML) software engineering project can be divided up. In short, we will answer the question of how you actually organize a successful ML project. We will not only discuss the process and workflow, but we will also set up the tools you will need for each stage of the process and highlight some important best practices with real ML code examples.

In this edition, there will be more details on an important data science and ML project management methodology: Cross-Industry Standard Process for Data Mining (CRISP-DM). This will include a discussion of how this methodology compares to traditional Agile and Waterfall methodologies and will provide some tips and tricks for applying it to your ML projects. There are also far more detailed examples to help you get up and running with continuous integration/continuous deployment (CI/CD) using GitHub Actions, including how to run ML-focused processes such as automated model validation. The advice on getting up and running in an Integrated Development Environment (IDE) has also been made more tool-agnostic, to allow for those using any appropriate IDE.

As before, the chapter will focus heavily on a “four-step” methodology I propose that encompasses a discover, play, develop, deploy workflow for your ML projects. This project workflow will be compared with the CRISP-DM methodology, which is very popular in data science circles. We will also discuss the appropriate development tooling, along with its configuration and integration, for a successful project. We will cover version control strategies and their basic implementation, as well as setting up CI/CD for your ML project. Then, we will introduce some potential execution environments as the target destinations for your ML solutions. By the end of this chapter, you will be set up for success in your Python ML engineering project. This is the foundation on which we will build everything in subsequent chapters.

As usual, we will conclude the chapter by summarizing the main points and highlighting what this means as we work through the rest of the book.

Finally, it is also important to note that although we will frame the discussion here in terms of ML challenges, most of what you will learn in this chapter can also be applied to other Python software engineering projects. My hope is that the investment in building out these foundational concepts in detail will be something you can leverage again and again in all of your work.

We will explore all of this in the following sections and subsections:

  • Setting up our tools
  • Concept to solution in four steps:
    • Discover
    • Play
    • Develop
    • Deploy

There is plenty of exciting stuff to get through and lots to learn – so let’s get started!

Technical requirements

As in Chapter 1, Introduction to ML Engineering, if you want to run the examples provided here, you can create a Conda environment using the environment YAML file provided in the Chapter02 folder of the book’s GitHub repository:

conda env create -f mlewp-chapter02.yml

On top of this, many of the examples in this chapter will require the use of the following software and packages. These will also stand you in good stead for following the examples in the rest of the book:

  • Anaconda
  • PyCharm Community Edition, VS Code, or another Python-compatible IDE
  • Git
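Before working through the examples, it can help to confirm that the command-line tools listed above are actually reachable from your shell. The following is a minimal sketch rather than part of the book's code: the function name `find_missing_tools` is hypothetical, and it assumes that Anaconda and Git expose executables named `conda` and `git` on your PATH, which may differ on some setups.

```python
import shutil
from typing import Callable, Iterable, Optional


def find_missing_tools(
    tools: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the names in `tools` that cannot be found on the PATH."""
    # `which` returns the full path of an executable, or None if it is absent.
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust them if your installation differs.
    missing = find_missing_tools(["conda", "git"])
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("conda and git are both available.")
```

The lookup function is passed in as a parameter so that the check can be exercised without depending on what happens to be installed on a given machine.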

You will also need the following:

  • An Atlassian Jira account. We will discuss this more later in the chapter, but you can sign up for one for free at https://www.atlassian.com/software/jira/free.
  • An AWS account. This will also be covered in the chapter, but you can sign up for an account at https://aws.amazon.com/. You will need to add payment details to sign up for AWS, but everything we do in this book will only require the free tier solutions.

The technical steps in this chapter were all tested on both a Linux machine running Ubuntu 22.04 LTS with a user profile that had admin rights and on a MacBook Pro M2 with the setup described in Chapter 1, Introduction to ML Engineering. If you are running the steps on a different system, then you may have to consult the documentation for that specific tool if the steps do not work as planned. Even if this is the case, most of the steps will be the same, or very similar, for most systems. You can also check out all of the code for this chapter in the book’s repository at https://github.com/PacktPublishing/Machine-Learning-Engineering-with-Python-Second-Edition/tree/main/Chapter02. The repo will also contain further resources for getting the code examples up and running.

Setting up our tools

To prepare for the work in the rest of this chapter, and indeed the rest of the book, it will be helpful to set up some tools. At a high level, we need the following:

  • Somewhere to code
  • Something to track our code changes
  • Something to help manage our tasks
  • Somewhere to provision infrastructure and deploy our solution

Let’s look at how to approach each of these in turn:

  • Somewhere to code: First, although the weapon of choice for coding by data scientists is of course Jupyter Notebook, once you begin to make the move toward ML engineering, it will be important to have an IDE to hand. An IDE is basically an application that comes with a series of built-in tools and capabilities to help you to develop the best software that you can. PyCharm is an excellent example for Python developers and comes with a wide variety of plugins, add-ons, and integrations useful to ML engineers. You can download the Community Edition from JetBrains at https://www.jetbrains.com/pycharm/. Another popular development tool is the lightweight but powerful source code editor VS Code. Once you have successfully installed PyCharm, you can create a new project or open an existing one from the Welcome to PyCharm window, as shown in Figure 2.1:
    Figure 2.1: Opening or creating your PyCharm project.

  • Something to track code changes: Next on the list is a code version control system. In this book, we will use GitHub but there are a variety of solutions, all freely available, that are based on the same underlying open-source Git technology. Later sections will discuss how to use these as part of your development workflow, but first, if you do not have a version control system set up, you can navigate to github.com and create a free account. Follow the instructions on the site to create your first repository, and you will be shown a screen that looks something like Figure 2.2. To make your life easier later, you should select Add a README file and Add .gitignore (then select Python). The README file provides an initial Markdown file for you to get started with and somewhere to describe your project. The .gitignore file tells your Git distribution to ignore certain types of files that in general are not important for version control. It is up to you whether you want the repository to be public or private and what license you wish to use. The repository for this book uses the MIT license:
    Figure 2.2: Setting up your GitHub repository.

    Once you have set up your IDE and version control system, you need to make them talk to each other by using the Git plugins provided with PyCharm. This is as simple as navigating to VCS | Enable Version Control Integration and selecting Git. You can edit the version control settings by navigating to File | Settings | Version Control; see Figure 2.3:

    Figure 2.3: Configuring version control with PyCharm.

  • Something to help manage our tasks: You are now ready to write Python and track your code changes, but are you ready to manage or participate in a complex project with other team members? For this, it is often useful to have a solution where you can track tasks, issues, bugs, user stories, and other documentation and items of work. It also helps if this has good integration points with the other tools you will use. In this book, we will use Jira as an example of this. If you navigate to https://www.atlassian.com/software/jira, you can create a free cloud Jira account and then follow the interactive tutorial within the solution to set up your first board and create some tasks. Figure 2.4 shows the task board for this book project, called Machine Learning Engineering in Python (MEIP):

    Figure 2.4: The task board for this book in Jira.

  • Somewhere to provision infrastructure and deploy our solution: Everything that you have just installed and set up is tooling that will really help take your workflow and software development practices to the next level. The last piece of the puzzle is having the tools, technologies, and infrastructure available for deploying the end solution. The management of computing infrastructure for applications was (and often still is) the province of dedicated infrastructure teams, but with the advent of public clouds, there has been real democratization of this capability for people working across the spectrum of software roles. In particular, modern ML engineering is very dependent on the successful implementation of cloud technologies, usually through the main public cloud providers such as Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform (GCP). This book will utilize tools found in the AWS ecosystem, but all of the tools and techniques you will find here have equivalents in the other clouds.
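Returning briefly to the .gitignore file mentioned in the version control step above, the sketch below shows roughly what such a file contains for a Python project. It is an illustration only: the helper `write_gitignore` and the constant `PYTHON_IGNORE_PATTERNS` are hypothetical names, and the pattern list is a small hand-picked subset, not the full Python template that GitHub generates when you select Add .gitignore.

```python
from pathlib import Path

# A small, illustrative subset of patterns often ignored in Python projects.
# The Python template offered by GitHub is considerably longer.
PYTHON_IGNORE_PATTERNS = [
    "__pycache__/",         # bytecode cache directories
    "*.py[cod]",            # compiled Python files
    ".ipynb_checkpoints/",  # Jupyter autosave folders
    ".venv/",               # a local virtual environment
    ".env",                 # local secrets and settings
]


def write_gitignore(repo_dir: Path, patterns: list[str]) -> Path:
    """Write one pattern per line to `.gitignore` in repo_dir and return its path."""
    target = repo_dir / ".gitignore"
    target.write_text("\n".join(patterns) + "\n", encoding="utf-8")
    return target
```

Calling `write_gitignore(Path("."), PYTHON_IGNORE_PATTERNS)` from the root of a repository would overwrite any existing .gitignore there, so in practice you would more likely edit the generated file by hand.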

The flip side of the democratization of capabilities that the cloud brings is that teams who own the deployment of their solutions have to gain new skills and understanding. I am a strong believer in the principle that “you build it, you own it, you run it” as far as possible, but this means that as an ML engineer, you will have to be comfortable with a host of potential new tools and principles, as well as owning the performance of your deployed solution. With great power comes great responsibility and all that. In Chapter 5, Deployment Patterns and Tools, we will dive into this topic in detail.

Let’s talk through setting this up.

Setting up an AWS account

As previously stated, you don’t have to use AWS, but that’s what we’re going to use throughout this book. Once it’s set up here, you can use it for everything we’ll do:

  1. To set up an AWS account, navigate to aws.amazon.com and select Create Account. You will have to add some payment details but everything we mention in this book can be explored through the free tier of AWS, where you do not incur a cost below a certain threshold of consumption.
  2. Once you have created your account, you can navigate to the AWS Management Console, where you can see all the services that are available to you (see Figure 2.5):
Figure 2.5: The AWS Management Console.

With our AWS account ready to go, let’s look at the four steps that cover the whole process.


Key benefits

  • This second edition delves deeper into key machine learning topics, CI/CD, and system design
  • Explore core MLOps practices, such as model management and performance monitoring
  • Build end-to-end examples of deployable ML microservices and pipelines using AWS and open-source tools

Description

The Second Edition of Machine Learning Engineering with Python is the practical guide that MLOps and ML engineers need to build solutions to real-world problems. It will provide you with the skills you need to stay ahead in this rapidly evolving field. The book takes an examples-based approach to help you develop your skills and covers the technical concepts, implementation patterns, and development methodologies you need. You'll explore the key steps of the ML development lifecycle and create your own standardized "model factory" for training and retraining of models. You'll learn to employ concepts like CI/CD and how to detect different types of drift. Get hands-on with the latest in deployment architectures and discover methods for scaling up your solutions. This edition goes deeper in all aspects of ML engineering and MLOps, with emphasis on the latest open-source and cloud-based technologies. This includes a completely revamped approach to advanced pipelining and orchestration techniques. With a new chapter on deep learning, generative AI, and LLMOps, you will learn to use tools like LangChain, PyTorch, and Hugging Face to leverage LLMs for supercharged analysis. You will explore AI assistants like GitHub Copilot to become more productive, then dive deep into the engineering considerations of working with deep learning.

Who is this book for?

This book is designed for MLOps and ML engineers, data scientists, and software developers who want to build robust solutions that use machine learning to solve real-world problems. If you’re not a developer but want to manage or understand the product lifecycle of these systems, you’ll also find this book useful. It assumes a basic knowledge of machine learning concepts and intermediate programming experience in Python. With its focus on practical skills and real-world examples, this book is an essential resource for anyone looking to advance their machine learning engineering career.

What you will learn

  • Plan and manage end-to-end ML development projects
  • Explore deep learning, LLMs, and LLMOps to leverage generative AI
  • Use Python to package your ML tools and scale up your solutions
  • Get to grips with Apache Spark, Kubernetes, and Ray
  • Build and run ML pipelines with Apache Airflow, ZenML, and Kubeflow
  • Detect drift and build retraining mechanisms into your solutions
  • Improve error handling with control flows and vulnerability scanning
  • Host and build ML microservices and batch processes running on AWS

Product Details

Publication date: Aug 31, 2023
Length: 462 pages
Edition: 2nd
Language: English
ISBN-13: 9781837634354
Vendor: Apache


Packt Subscriptions

See our plans and pricing

$19.99 billed monthly
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Simple pricing, no contract

$199.99 billed annually
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just Mex$85 each
  • Exclusive print discounts

$279.99 billed in 18 months
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just Mex$85 each
  • Exclusive print discounts

Frequently bought together


  • 50 Algorithms Every Programmer Should Know: Mex$1025.99
  • Machine Learning Engineering with Python: Mex$1025.99
  • Machine Learning with PyTorch and Scikit-Learn: Mex$1128.99
Total: Mex$ 3,180.97

Table of Contents

11 Chapters
  • Introduction to ML Engineering
  • The Machine Learning Development Process
  • From Model to Model Factory
  • Packaging Up
  • Deployment Patterns and Tools
  • Scaling Up
  • Deep Learning, Generative AI, and LLMOps
  • Building an Example ML Microservice
  • Building an Extract, Transform, Machine Learning Use Case
  • Other Books You May Enjoy
  • Index

Customer reviews

Rating distribution
4.6 (38 Ratings)
5 star 86.8%
4 star 2.6%
3 star 0%
2 star 5.3%
1 star 5.3%

Top Reviews



hawkinflight Sep 11, 2023
5 stars
I have experience as a statistician, data scientist, software engineer, programmer, and I would say a little bit as an ML engineer. In Chapter 1, the author talks about the different roles, so I look forward to reading that to compare against my experience! haha. I don't have any experience using tools to build pipelines, so I am looking forward to reading about that. I like the content and structure of the book, and with only 9 chapters it's not overwhelming. I feel like I could get up to speed really quickly. I have familiarity with many parts, but not everything. I am interested in reading the section about "Choosing a style" - OOP or FP. I am also interested in exploring the "standard ML patterns" - data lakes, microservices, event-based designs and batching. I am interested in learning more about using AWS, so it's great that that's covered. The chapter on scaling is definitely interesting, as is the chapter on LLMs. I have watched interviews with the OpenAI and MSFT folks on the GPT models and I have interacted with ChatGPT. The LLMs look fun to try and the python code in the book looks very easy to read. I like this book a lot. It concisely covers all the points in moving from concept to solution, including what tools can be used. I think it will be a great starting point for me. I can't wait to try it out!
Amazon Verified review
Ishan Dutta Oct 30, 2023
5 stars
The width of topics covered along with the code provided makes this a great book! I liked how it started with basics of ML pipelines and went all the way to different LLMOps and so on. The explanation along with the provided diagrams make it easy to understand and retain. I highly recommend this book.
Amazon Verified review
zeroKelvin Sep 09, 2023
5 stars
There are a lot of books out there that walk you through the steps of putting together a complex ML model using ideal data in a closed setting. This is not one of those books. ML engineering with Python is instead a comprehensive guide to the way machine learning works in practice at most companies. The book does a great job of explaining the MLops tools that almost all businesses today rely on to train, deploy, serve, and iterate on models. In my opinion, the concepts in this book are far more valuable than understanding how to use specific ML frameworks to solve problems. Simply understanding that these tools exist, and knowing how they are used will give engineers a leg up, and lead to more revenue generating impact than any gold medal kaggle model could produce on its own.
Amazon Verified review
Richard Apr 21, 2024
5 stars
I recently had the pleasure of reviewing "Machine Learning Engineering with Python - Second Edition" by Andrew McMahon. As a NASA data analyst deeply engaged with the operational side of machine learning, I found this book to be a valuable resource for professionals dedicated to mastering MLOps and managing the lifecycle of ML models. Andrew effectively uses practical examples and a thorough examination of contemporary tools and methodologies to advance this field. One of the standout features of this book is McMahon's approach to integrating Python code to clarify the mechanics behind ML algorithms. While I chose not to run the scripts verbatim, I found them incredibly useful as references, enhancing both my existing projects and new initiatives. This method greatly assisted me in understanding the intricacies of ML pipelines and applying these insights across various applications. A suggestion for future readers would be to approach the first chapter last. The book begins with advanced topics that are more comprehensible after navigating through the foundational material presented in subsequent chapters. This adjustment could help flatten the learning curve and not become discouraged at the advanced material. That said, there are areas where the book could improve. The chapter dedicated to generative AI and large language models, for instance, would benefit from additional case studies that demonstrate their practical applications within industry. Moreover, a deeper focus on the ethical considerations of deploying AI systems at scale is necessary, given the increasing importance of ethics in our field. In conclusion, Andrew McMahon’s second edition is a comprehensive guide that I highly recommend to MLOps practitioners, ML engineers, and data scientists. Its depth of content, combined with practical, real-world applications, positions it as a critical read for professionals aiming to stay at the forefront of technology.
If you're in the field, this book is undoubtedly a valuable addition to your professional toolkit.
Amazon Verified review
Rajesh Sathya Kumar Apr 04, 2024
5 stars
I have been reading this book by Andy McMahon and just completed it. The book provided excellent coverage of ML Ops concepts, encompassing a wide range of ideas for building ML-powered apps. The Second Edition of this book also covers concepts from LLM and LLMOps. It also includes deeper content in every chapter. The amount of AI developments from 2021 (First edition) to 2023 (Second edition) is very evident from this book and makes it more exciting about the future. It also covers practical examples and applications built using scikit-learn, Spark, Airflow, Kubernetes, Keras, AWS, etc., and lists the key points discussed in each chapter.
Amazon Verified review

FAQs

How do I buy and download an eBook?

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe Reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing: When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the eBook to be usable for you, the reader, with our need to protect our rights as publishers and those of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else
How can I make a purchase on your website?

If you want to purchase a video course, eBook, or Bundle (Print+eBook), please follow the steps below:

  1. Register on our website using your email address and a password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title. 
  5. Proceed with the checkout process (payment to be made using Credit Card, Debit Card, or PayPal)
Where can I access support around an eBook?
  • If you experience a problem with using or installing Adobe Reader, then contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us
What eBook formats does Packt support?

Our eBooks are currently available in a variety of formats such as PDF and ePub. In the future, this may well change with trends and development in technology, but please note that our PDFs are not Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks?
  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are lower priced than print
  • They save resources and space
What is an eBook?

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply log in to your account and click on the link in Your Download Area. We recommend saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.