You're reading from Machine Learning on Kubernetes A practical handbook for building and using a complete open source machine learning platform on Kubernetes

Product type Paperback

Published in Jun 2022

Publisher Packt

ISBN-13 9781803241807

Length 384 pages

Edition 1st Edition

Languages

Python

Tools

Kubernetes

Concepts

Machine Learning

Authors (2):

Ross Brigoli

Faisal Masood


Table of Contents (16) Chapters

Preface

1. Part 1: The Challenges of Adopting ML and Understanding MLOps (What and Why)

2. Chapter 1: Challenges in Machine Learning

3. Chapter 2: Understanding MLOps

4. Chapter 3: Exploring Kubernetes

5. Part 2: The Building Blocks of an MLOps Platform and How to Build One on Kubernetes

6. Chapter 4: The Anatomy of a Machine Learning Platform

7. Chapter 5: Data Engineering

8. Chapter 6: Machine Learning Engineering

9. Chapter 7: Model Deployment and Automation

10. Part 3: How to Use the MLOps Platform and Build a Full End-to-End Project Using the New Platform

11. Chapter 8: Building a Complete ML Project Using the Platform

12. Chapter 9: Building Your Data Pipeline

13. Chapter 10: Building, Deploying, and Monitoring Your Model

14. Chapter 11: Machine Learning on Kubernetes

15. Other Books You May Enjoy

Choosing the right approach

Before deciding to use ML for a given project, understand the problem first and assess whether ML can solve it. Invest enough time in working with the right stakeholders to understand their expectations. Some problems are better suited to traditional approaches, such as when you already have predefined business rules for a given system. It is faster and easier to code rules than it is to train a model, and you do not need a huge amount of data.

When deciding whether to use ML, ask whether pattern-based results will work for your problem. If you are building a system that reads data from an airline's frequent-flyer database to find customers to whom you want to send a promotion, a rule-based system may give you perfectly acceptable results. An ML-based system may give you better matches in certain scenarios, but will the time spent building it be worth it?
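
As a rough illustration of the rule-based option, the following minimal sketch selects promotion candidates with two hand-written rules. The `Customer` fields, the thresholds, and the sample records are hypothetical, chosen only for illustration; they do not come from any real airline system.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical fields; a real frequent-flyer database will differ.
    customer_id: str
    miles_last_year: int
    flights_last_year: int
    opted_in: bool


def eligible_for_promotion(
    customer: Customer,
    min_miles: int = 25_000,  # assumed threshold
    min_flights: int = 4,     # assumed threshold
) -> bool:
    """Rule: the customer opted in and flew either far enough or often enough."""
    active = (
        customer.miles_last_year >= min_miles
        or customer.flights_last_year >= min_flights
    )
    return customer.opted_in and active


# Made-up sample records.
customers = [
    Customer("C001", 40_000, 3, True),
    Customer("C002", 5_000, 6, True),
    Customer("C003", 60_000, 12, False),
    Customer("C004", 1_200, 1, True),
]

selected = [c.customer_id for c in customers if eligible_for_promotion(c)]
print(selected)  # ['C001', 'C002']
```

Rules like these need no training data and can be reviewed line by line with the business stakeholders, which is the trade-off to weigh against a model that might find better matches.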

The importance of data

The effectiveness of your ML model depends on the quality and accuracy of the data. Unfortunately, data collection and processing activities do not get the attention they deserve, which proves costly in later stages of the project, when the model turns out to be unsuitable for the given task.

"Everyone wants to do the model work, not the data work."

– Data Cascades in High-Stakes AI, Sambasivan et al. (see the Further reading section)

The paper cited here discusses this challenge. An interesting example described in the paper is of a team building a model to detect a particular pattern in patient scans, which worked brilliantly with test data. However, the model failed in production because the scans being fed into the model contained tiny dust particles, which degraded its performance. This example is a classic case of a team being focused on model building and not on how the model will be used in the real world.

One thing that teams should focus on is data validation and cleansing. Data is often missing or incorrect: for example, a string value in a numeric column, different date formats in the same field, or the same identifier (ID) for different records when the records come from different systems. All these data anomalies may result in a model that performs poorly.
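
The three anomalies mentioned above can be checked with a few lines of validation code. The following is a minimal sketch using only the Python standard library; the field names (`id`, `source`, `amount`, `date`), the expected date format, and the sample records are assumptions made for illustration, not part of any particular platform.

```python
from collections import defaultdict
from datetime import datetime

EXPECTED_DATE_FORMAT = "%Y-%m-%d"  # assumed convention for this sketch


def is_number(value):
    """Return True if the value can be parsed as a number."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def is_expected_date(value):
    """Return True if the value matches the expected date format."""
    try:
        datetime.strptime(value, EXPECTED_DATE_FORMAT)
        return True
    except (TypeError, ValueError):
        return False


def validate(records):
    """Collect (reference, problem) pairs for the given records."""
    issues = []
    sources_by_id = defaultdict(set)
    for index, record in enumerate(records):
        if not is_number(record.get("amount")):
            issues.append((index, "amount is not numeric"))
        if not is_expected_date(record.get("date")):
            issues.append((index, "date does not match the expected format"))
        sources_by_id[record.get("id")].add(record.get("source"))
    # The same ID arriving from more than one system is suspicious.
    for record_id, sources in sources_by_id.items():
        if len(sources) > 1:
            issues.append((record_id, "id is used by more than one source"))
    return issues


# Made-up sample records from two hypothetical source systems.
records = [
    {"id": "1001", "source": "crm", "amount": "250.0", "date": "2022-06-01"},
    {"id": "1002", "source": "crm", "amount": "n/a", "date": "2022-06-02"},
    {"id": "1003", "source": "billing", "amount": "75", "date": "03/06/2022"},
    {"id": "1001", "source": "billing", "amount": "12.5", "date": "2022-06-04"},
]

issues = validate(records)
for reference, problem in issues:
    print(reference, problem)
```

Running checks like these before training, and again on the data that reaches the model in production, surfaces such problems early rather than as degraded model performance later.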

Once you've been through this process and come to the decision that yes, ML is the way to go… what next?
