Machine Learning on Kubernetes

You're reading from Machine Learning on Kubernetes: A practical handbook for building and using a complete open source machine learning platform on Kubernetes

Product type: Paperback
Published in: Jun 2022
Publisher: Packt
ISBN-13: 9781803241807
Length: 384 pages
Edition: 1st Edition
Authors (2):
Ross Brigoli
Faisal Masood

Table of Contents (16 chapters)

Preface
1. Part 1: The Challenges of Adopting ML and Understanding MLOps (What and Why)
2. Chapter 1: Challenges in Machine Learning
3. Chapter 2: Understanding MLOps
4. Chapter 3: Exploring Kubernetes
5. Part 2: The Building Blocks of an MLOps Platform and How to Build One on Kubernetes
6. Chapter 4: The Anatomy of a Machine Learning Platform
7. Chapter 5: Data Engineering
8. Chapter 6: Machine Learning Engineering
9. Chapter 7: Model Deployment and Automation
10. Part 3: How to Use the MLOps Platform and Build a Full End-to-End Project Using the New Platform
11. Chapter 8: Building a Complete ML Project Using the Platform
12. Chapter 9: Building Your Data Pipeline
13. Chapter 10: Building, Deploying, and Monitoring Your Model
14. Chapter 11: Machine Learning on Kubernetes
15. Other Books You May Enjoy

Understanding the basics of Apache Spark

Apache Spark is an open source data processing engine designed for distributed, large-scale data processing. It pays off at scale: for smaller datasets, say tens or even a few hundred GB, a well-tuned traditional database may deliver faster processing times. Spark's main differentiator is its ability to perform intermediate computations in memory. Hadoop MapReduce, by contrast, writes intermediate results to disk between stages, which is why Spark is typically much faster.
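The difference between in-memory and disk-based intermediate results can be illustrated with a small plain-Python analogy. This is only a sketch, not Spark or Hadoop code: the function names (`square_all`, `keep_even`, `run_with_disk_handoff`, `run_in_memory`) are hypothetical, and a temporary JSON file stands in for the storage that a MapReduce-style job would write between stages.

```python
import json
import os
import tempfile


def square_all(values):
    """Stage 1: transform every record."""
    return [v * v for v in values]


def keep_even(values):
    """Stage 2: filter the output of stage 1."""
    return [v for v in values if v % 2 == 0]


def run_with_disk_handoff(values):
    """MapReduce-style analogy: stage 1 writes to disk, stage 2 reads it back."""
    fd, path = tempfile.mkstemp(suffix=".json")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(square_all(values), handle)   # intermediate result hits disk
        with open(path) as handle:
            intermediate = json.load(handle)        # and is read back for stage 2
        return sum(keep_even(intermediate))
    finally:
        os.remove(path)


def run_in_memory(values):
    """Spark-style analogy: the intermediate result never leaves memory."""
    intermediate = square_all(values)
    return sum(keep_even(intermediate))


data = list(range(10))
disk_result = run_with_disk_handoff(data)
memory_result = run_in_memory(data)
print(disk_result, memory_result)
```

Both paths compute the same answer. The only difference is the serialization and file I/O between the two stages, and that cost is what grows with every additional stage in a multi-stage job.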

Apache Spark is built for speed, flexibility, and ease of use. It offers more than 70 high-level data processing operators, which make it easy for data engineers to write data processing logic and build data applications with the Spark APIs. Flexibility here means that Spark works as a unified data processing engine across several types of data workloads: batch applications, streaming applications, interactive queries, and even ML algorithms.
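To give a feel for what chaining high-level operators looks like, here is a minimal word-count sketch using the PySpark RDD operators `parallelize`, `flatMap`, `map`, `reduceByKey`, and `collect`. It is an illustration rather than an example from the book: the helper names `tokenize` and `word_counts` are hypothetical, and the `sc` argument is assumed to be a `SparkContext` created by the caller.

```python
from operator import add


def tokenize(line):
    """Split one line of text into lowercase words."""
    return line.lower().split()


def word_counts(sc, lines):
    """Count word occurrences by chaining RDD operators.

    `sc` is assumed to be a pyspark SparkContext supplied by the caller.
    """
    pairs = (
        sc.parallelize(lines)          # distribute the input lines
        .flatMap(tokenize)             # one record per word
        .map(lambda word: (word, 1))   # pair each word with a count of 1
        .reduceByKey(add)              # sum the counts for each word
        .collect()                     # bring the results back to the driver
    )
    return dict(pairs)
```

With PySpark installed, a local context can be obtained from `SparkSession.builder.master("local[*]").getOrCreate().sparkContext` and passed in as `sc`, for example `word_counts(sc, ["spark is fast", "spark is flexible"])`. The whole pipeline is four chained operator calls, and no explicit loops or shuffle logic are written by hand.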

Figure 5.26 shows the Apache Spark components...
