Machine Learning on Kubernetes

A practical handbook for building and using a complete open source machine learning platform on Kubernetes

Product type: Paperback
Published in: Jun 2022
Publisher: Packt
ISBN-13: 9781803241807
Length: 384 pages
Edition: 1st Edition
Authors (2): Ross Brigoli, Faisal Masood
Table of Contents (16 chapters)

Preface
Part 1: The Challenges of Adopting ML and Understanding MLOps (What and Why)
  • Chapter 1: Challenges in Machine Learning
  • Chapter 2: Understanding MLOps
  • Chapter 3: Exploring Kubernetes
Part 2: The Building Blocks of an MLOps Platform and How to Build One on Kubernetes
  • Chapter 4: The Anatomy of a Machine Learning Platform
  • Chapter 5: Data Engineering
  • Chapter 6: Machine Learning Engineering
  • Chapter 7: Model Deployment and Automation
Part 3: How to Use the MLOps Platform and Build a Full End-to-End Project Using the New Platform
  • Chapter 8: Building a Complete ML Project Using the Platform
  • Chapter 9: Building Your Data Pipeline
  • Chapter 10: Building, Deploying, and Monitoring Your Model
  • Chapter 11: Machine Learning on Kubernetes
Other Books You May Enjoy

An overview of the ML platform

In this section, we will talk about the capabilities of the ML platform that you will need to consider. The aim is to make you aware of the basic building blocks that could form an ecosystem for your team to help you in your ML journey. An ML platform can be thought of as a set of components that assists in the faster development and deployment of ML models and data pipelines.

There are three main characteristics of an ML platform, as outlined here:

  • A complete ecosystem: The platform should provide an end-to-end (E2E) solution that includes data life-cycle management, ML life-cycle management, application life-cycle management, and observability.
  • Built on open standards: The platform should provide a way to extend and build on the existing baseline. Because the field is fast-moving, it is critical that you can further enhance, tailor, and optimize platforms for your specific needs.
  • Self-service: The platform should provide the resources that teams need automatically and on demand, from hardware requests to deploying software in production. It automates the provisioning of resources based on enterprise controls and recovers them once the job is completed. The resources can be hardware, such as central processing units (CPUs), memory, or disk; software, such as integrated development environments (IDEs) for writing code; or a combination of these.
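To make the last characteristic more concrete, here is a minimal sketch of on-demand provisioning under a quota, with resources recovered once a job finishes. The names `Quota`, `ResourcePool`, and `lease` are hypothetical and chosen for illustration only; they do not come from Kubernetes or from any tool discussed in this book, and a real platform would delegate this work to the cluster scheduler.

```python
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class Quota:
    """Hypothetical per-team ceiling standing in for enterprise controls."""
    cpus: int
    memory_gib: int


class ResourcePool:
    """Toy self-service pool: grants requests within quota, reclaims on exit."""

    def __init__(self, quota: Quota) -> None:
        self.quota = quota
        self.used_cpus = 0
        self.used_memory_gib = 0

    @contextmanager
    def lease(self, cpus: int, memory_gib: int):
        # Enforce the control before provisioning anything.
        if (self.used_cpus + cpus > self.quota.cpus
                or self.used_memory_gib + memory_gib > self.quota.memory_gib):
            raise RuntimeError("request exceeds team quota")
        self.used_cpus += cpus
        self.used_memory_gib += memory_gib
        try:
            yield {"cpus": cpus, "memory_gib": memory_gib}
        finally:
            # Recover the resources once the job completes or fails.
            self.used_cpus -= cpus
            self.used_memory_gib -= memory_gib


pool = ResourcePool(Quota(cpus=8, memory_gib=32))
with pool.lease(cpus=4, memory_gib=16) as grant:
    in_use_during_job = pool.used_cpus  # resources are held while the job runs
in_use_after_job = pool.used_cpus       # resources are released afterwards
```

The context manager is a deliberate choice here: because the release happens in a `finally` clause, resources are returned to the pool even when the job raises an exception, so nobody has to remember to hand them back.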

The following diagram shows the various components of an ML platform that serves different personas, allowing them to collaborate on a common platform:

Figure 1.4 – Personas and their interaction with the platform

Apart from the characteristics presented in Figure 1.4, the platform must have the following technical capabilities:

  • Workflow automation: The platform should have some form of workflow automation capability, so that data engineers can create jobs that perform repetitive tasks such as data ingestion and preparation, and data scientists can orchestrate model training and automate model deployments.
  • Security: The platform must be secured to prevent data leaks and data loss that can have a negative impact on the business.
  • Observability: We do not want to run applications without observability, whether they are traditional applications or ML models. Deploying applications in production without observability is like riding a bike blindfolded. The platform should let you monitor the health and performance of the entire system or any sub-system in near real time, and it should include an alerting capability.
  • Logging: Logging plays a key role in understanding what happened when systems start behaving in an unexpected way. The platform must have a solid logging mechanism to allow operations teams to better support the ML project.
  • Data processing and pipelining: Because ML projects rely on a huge amount of data, the platform must include a reliable, fully featured data processing and data pipelining solution that can scale horizontally.
  • Model packaging and deployment: Not all data scientists are experienced software engineers. Although some may have experience writing applications, it is not safe to assume that all data scientists can write production-grade applications and deploy them to production. Therefore, the platform must be able to automatically package an ML model into an application and serve it.
  • ML life cycle: The platform must also be capable of managing ML experiments, tracking performance, storing training and experiment metadata and feature sets, and versioning models. This not only allows data scientists to work efficiently, but also allows them to work collaboratively.
  • On-demand resource allocation: An ML platform should allow data scientists and data engineers to provision their own runtime resources automatically and on demand. This eliminates the manual requisition of resources and the time wasted on waiting and handovers with operations teams. Platform users must be able to create their own environments and allocate the right amount of compute resources they need to do their jobs.
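As an illustration of the workflow automation capability in the list above, the following is a minimal sketch of a workflow runner that executes tasks in dependency order. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The task names, the toy "model", and the `run_workflow` helper are assumptions made for this example; a real platform would hand this job to a dedicated workflow engine rather than a script like this.

```python
from graphlib import TopologicalSorter


# Hypothetical tasks; each one reads from and writes to a shared context.
def ingest(ctx):
    ctx["raw"] = [3, 1, 2]


def prepare(ctx):
    ctx["clean"] = sorted(ctx["raw"])


def train(ctx):
    # Stand-in for model training: the "model" is just the mean of the data.
    ctx["model"] = sum(ctx["clean"]) / len(ctx["clean"])


def deploy(ctx):
    ctx["deployed"] = True


TASKS = {"ingest": ingest, "prepare": prepare, "train": train, "deploy": deploy}

# Each task maps to the set of tasks that must finish before it can start.
DEPENDS_ON = {
    "ingest": set(),
    "prepare": {"ingest"},
    "train": {"prepare"},
    "deploy": {"train"},
}


def run_workflow(tasks, depends_on):
    """Run every task once, after all of its dependencies have run."""
    ctx = {}
    order = list(TopologicalSorter(depends_on).static_order())
    for name in order:
        tasks[name](ctx)
    return order, ctx


order, ctx = run_workflow(TASKS, DEPENDS_ON)
```

Declaring dependencies as data, instead of calling the functions one after another, is what lets data engineers and data scientists add or reorder steps without touching the runner itself.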

There are already platform products that have most, if not all, of the capabilities you have just learned about. In the later chapters of this book, you will learn how to build one such platform on top of Kubernetes using open source software (OSS).
