Databricks ML in Action

Databricks ML in Action: Learn how Databricks supports the entire ML lifecycle end to end from data ingestion to the model deployment

Authors: Stephanie Rivera, Anastasia Prokaieva, Amanda Baker, Hayley Horn
Rating: 4.7 (10 Ratings)
eBook, May 2024, 280 pages, 1st Edition
eBook: $9.99 (list $35.99)
Paperback: $44.99
Subscription: Free Trial, renews at $19.99 per month

What do you get with eBook?

  • Instant access to your Digital eBook purchase
  • Download this book in EPUB and PDF formats
  • Access this title in our online reader with advanced features
  • DRM FREE - Read whenever, wherever and however you want
  • AI Assistant (beta) to help accelerate your learning

Databricks ML in Action

Getting Started with This Book and Lakehouse Concepts

“Give me six hours to chop down a tree, and I will spend the first four sharpening the axe.”

– Abraham Lincoln

We will start with a basic overview of how Databricks’ Data Intelligence (DI) Platform is an open platform built on a lakehouse architecture, and of the advantages this brings when developing machine learning (ML) applications. For brevity, we will use the terms Data Intelligence Platform and Databricks interchangeably throughout the book. This chapter will introduce the different projects and associated datasets we’ll use throughout the book. Each project intentionally highlights a function or component of the DI Platform. Use the example projects as hands-on lessons for each platform element we cover. We progress through these projects in the last section of each chapter – namely, Applying our learning.

Here is what you will learn in this chapter:

  • The components of the Data...

The components of the Data Intelligence Platform

The Data Intelligence Platform allows your entire organization to leverage your data and AI. It’s built on a lakehouse architecture to provide an open, unified foundation for all data and governance layers. It is powered by a Data Intelligence Engine, which understands the context of your data. For practical purposes, let’s talk about the components of the Databricks Data Intelligence Platform:

Figure 1.1 – The components of the Databricks Data Intelligence Platform

Let’s check out the following list with the descriptions of the items in the figure:

  • Delta Lake: The data layout within the Data Intelligence Platform is automatically optimized based on common data usage patterns
  • Unity Catalog: A unified governance model to secure, manage, and share your data assets
  • Data Intelligence Engine: This uses AI to enhance the platform’s capabilities
  • Databricks...

The advantages of the Databricks Platform

Databricks’ implementation of a lakehouse architecture is unique. Databricks’ foundation is built on a Delta-formatted data lake that Unity Catalog governs. Therefore, it combines a data lake’s scalability and cost-effectiveness with a data warehouse’s governance. This means not only are table-level permissions managed through access control lists (ACLs), but file- and object-level access is also regulated. This change in architecture from a data lake and/or a data warehouse to a unified platform is ideal – a lakehouse facilitates a wide range of new use cases for analytics, business intelligence, and data science projects across an organization. See the Introduction to Data Lakes blog post in the Further reading section for more information on lakehouse benefits.

This section will discuss the importance of open source frameworks and two critical advantages they provide – transparency and flexibility...

Applying our learning

This book is heavily project-based. Each chapter starts with an overview of the important concepts and Data Intelligence Platform features that will prepare you for the main event – the Applying our learning sections. Every Applying our learning section has a Technical requirements section so that you know what technical resources you will need, in addition to your Databricks workspace and GitHub repository, to complete the project work in the respective chapter.

Technical requirements

Here are the technical requirements needed to get started with the hands-on examples used throughout this book:

  • We use Kaggle for two of our datasets. If you do not already have an account, you will need to create one.
  • Throughout the book, we will refer to code in GitHub. Create an account if you do not already have one.

Getting to know your data

There are four main projects that progress sequentially throughout the book. In each subsequent chapter...

Summary

In this chapter, we introduced you to Databricks ML in Action. We emphasized that the Databricks Data Intelligence Platform is designed with openness, flexibility, and tooling freedom in mind, which greatly accelerates productivity. Additionally, we’ve given you a sneak peek at the projects and the associated datasets that will be central to this book.

Now that you’ve gained a foundational understanding of the Data Intelligence Platform, it’s time to take the next step. In the upcoming chapter, we’ll guide you through setting up your environment and provide instructions on downloading the project data. This will prepare you for the practical, hands-on ML experiences that lie ahead in this journey.

Questions

Let’s test ourselves on what we’ve learned by going through the following questions:

  1. How will you use this book? Do you plan to go cover to cover or pick certain sections out? Have you chosen sections of interest?
  2. We covered why transparency in modeling is critical to success. How does Databricks’ glass-box approach to AutoML support this?
  3. Databricks has developed a new way of uniting the open data formats, called UniForm. Which data formats does UniForm unite?
  4. Delta is the foundation of the lakehouse architecture. What is one of the benefits of using Delta?
  5. What is the main advantage of using the Delta file format for large-scale data processing over simple Parquet files in Databricks?

Answers

After putting thought into the questions, compare your answers to ours:

  1. We cannot answer this question, but we hope you learn something you can use in your career soon!
  2. The glass-box approach supports transparency by providing the code behind every run in the experiment, including the best run, thus enabling reusability and reproducibility.
  3. Apache Iceberg, Apache Hudi, and Linux Foundation Delta Lake (an open source/unmanaged version of Delta).
  4. There are several. Here are a few:
    • Open protocol (no vendor lock-in)
    • Speed
    • Change data capture
    • Time travel
  5. Parquet also provides columnar storage and efficient read/write operations, but it lacks ACID transaction capabilities. Delta Lake adds them, and that is what distinguishes it.
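
To make answers 4 and 5 more concrete, here is a minimal sketch of the general idea behind a transaction log: each commit is recorded as a whole, and any earlier version of the table can be rebuilt by replaying the log. This is a toy, in-memory model and not Delta Lake's actual implementation. The names ToyTableLog, Commit, commit, and snapshot are hypothetical and exist only for this illustration, as do the file names.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Commit:
    """One entry in the toy log: files added and files removed together."""
    added: frozenset[str]
    removed: frozenset[str]


@dataclass
class ToyTableLog:
    """Hypothetical in-memory stand-in for a table's transaction log."""
    commits: list[Commit] = field(default_factory=list)

    def commit(self, added=(), removed=()) -> int:
        """Validate, then append one commit; return its version number."""
        added, removed = frozenset(added), frozenset(removed)
        missing = removed - self.snapshot()
        if missing:
            # Reject the whole commit so nothing is partially applied.
            raise ValueError(f"files are not live: {sorted(missing)}")
        self.commits.append(Commit(added, removed))
        return len(self.commits) - 1

    def snapshot(self, version: int | None = None) -> frozenset[str]:
        """Replay the log up to `version` (inclusive) to list live files."""
        if version is None:
            upto = len(self.commits)
        elif 0 <= version < len(self.commits):
            upto = version + 1
        else:
            raise IndexError(f"unknown version: {version}")
        live: set[str] = set()
        for entry in self.commits[:upto]:
            live -= entry.removed
            live |= entry.added
        return frozenset(live)


# Illustrative usage with made-up file names.
log = ToyTableLog()
v0 = log.commit(added={"part-0.parquet", "part-1.parquet"})
v1 = log.commit(added={"part-2.parquet"}, removed={"part-0.parquet"})
latest_files = sorted(log.snapshot())     # table as of the latest version
original_files = sorted(log.snapshot(v0)) # "time travel" back to version 0
```

Because readers only ever see the files named by complete commits, a failed write leaves earlier versions untouched. Plain Parquet files in a directory have no such log, which is the gap answer 5 points to.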

Further reading

In this chapter, we introduced vital technologies. Look at these resources to go deeper into the areas that interest you most:


Key benefits

  • Build machine learning solutions faster than peers only using documentation
  • Enhance or refine your expertise with tribal knowledge and concise explanations
  • Follow along with code projects provided in GitHub to accelerate your projects
  • Purchase of the print or Kindle book includes a free PDF eBook

Description

Discover what makes the Databricks Data Intelligence Platform the go-to choice for top-tier machine learning solutions. Written by a team of industry experts at Databricks with decades of combined experience in big data, machine learning, and data science, Databricks ML in Action presents cloud-agnostic, end-to-end examples with hands-on illustrations of executing data science, machine learning, and generative AI projects on the Databricks Platform. You’ll develop expertise in Databricks' managed MLflow, Vector Search, AutoML, Unity Catalog, and Model Serving as you learn to apply them practically in everyday workflows. This Databricks book not only offers detailed code explanations but also facilitates seamless code importation for practical use. You’ll discover how to leverage the open-source Databricks platform to enhance learning, boost skills, and elevate productivity with supplemental resources. By the end of this book, you'll have mastered the use of Databricks for data science, machine learning, and generative AI, enabling you to deliver outstanding data products.

Who is this book for?

This book is for machine learning engineers, data scientists, and technical managers seeking hands-on expertise in implementing and leveraging the Databricks Data Intelligence Platform and its Lakehouse architecture to create data products.

What you will learn

  • Set up a workspace for a data team planning to perform data science
  • Monitor data quality and detect drift
  • Use autogenerated code for ML modeling and data exploration
  • Operationalize ML with feature engineering client, AutoML, VectorSearch, Delta Live Tables, AutoLoader, and Workflows
  • Integrate open-source and third-party applications, such as OpenAI's ChatGPT, into your AI projects
  • Communicate insights through Databricks SQL dashboards and Delta Sharing
  • Explore data and models through the Databricks marketplace

Product Details

Publication date: May 17, 2024
Length: 280 pages
Edition: 1st
Language: English
ISBN-13: 9781800564008

Table of Contents

12 Chapters
Part 1: Overview of the Databricks Unified Data Intelligence Platform
Chapter 1: Getting Started and Lakehouse Concepts
Chapter 2: Designing Databricks: Day One
Chapter 3: Building the Bronze Layer
Part 2: Heavily Project Focused
Chapter 4: Getting to Know Your Data
Chapter 5: Feature Engineering on Databricks
Chapter 6: Tools for Model Training and Experimenting
Chapter 7: Productionizing ML on Databricks
Chapter 8: Monitoring, Evaluating, and More
Index
Other Books You May Enjoy

Customer reviews

Rating distribution
4.7 (10 Ratings)
5 star 90%
4 star 0%
3 star 0%
2 star 10%
1 star 0%

Top Reviews

N/A, Jun 21, 2024
Rating: 5 out of 5
Feefo Verified review
Steven Fernandes, Jun 25, 2024
Rating: 5 out of 5
A good resource for setting up a data science workspace, monitoring data quality, and detecting drift. It covers the use of autogenerated code for ML modeling, operationalizing ML with advanced tools like AutoML, VectorSearch, and Delta Live Tables, and integrating applications like OpenAI's ChatGPT. The book also emphasizes communicating insights with Databricks SQL dashboards and Delta Sharing and exploring data and models through the Databricks marketplace. A must-read for efficiently managing and scaling AI projects.
Amazon Verified review
Ramiro, May 29, 2024
Rating: 5 out of 5
Writer Stephanie Rivera is a masterful writer whose exploration of artificial intelligence pushes the boundaries of the genre and challenges readers to consider the implications of a future shaped by AI.
Amazon Verified review
alisha, Jun 15, 2024
Rating: 5 out of 5
This book is a comprehensive guide for anyone looking to harness the power of Databricks for machine learning and data science. Here are some key takeaways:
  • Hands-On Learning: The book offers practical examples and step-by-step tutorials, making complex concepts accessible to beginners and experienced practitioners.
  • Real-World Applications: It dives into real-world use cases, showcasing how companies leverage Databricks to solve business problems, optimize processes, and drive innovation.
  • Comprehensive Coverage: The book covers the entire machine learning lifecycle from data engineering and exploratory data analysis to model building and deployment.
  • Integration and Scalability: It highlights how Databricks seamlessly integrates with other tools and technologies, ensuring scalable and efficient workflows.
  • Expert Insights: Written by industry experts, the book provides deep insights into best practices and advanced techniques, empowering readers to elevate their ML projects.
Whether you're a data scientist, ML engineer, or analyst, "Databricks ML in Action" is a valuable resource to enhance your skillset and knowledge.
Amazon Verified review
Ryan, Aug 23, 2024
Rating: 5 out of 5
In depth instruction about the platform with 4 specific practical examples:
  • A time series forecaster
  • A feature engineered stream of transactions
  • A multi-label image classifier
  • A RAG augmented chatbot
Very good book!
Amazon Verified review

FAQs

How do I buy and download an eBook?

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing

When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it, we have tried to balance the need for the eBook to be usable for you, the reader, with our need to protect our rights as publishers and those of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else

How can I make a purchase on your website?

If you want to purchase a video course, eBook, or Bundle (Print+eBook), please follow the steps below:

  1. Register on our website using your email address and a password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title.
  5. Proceed with the checkout process (payment to be made using Credit Card, Debit Card, or PayPal)

Where can I access support around an eBook?

  • If you experience a problem with using or installing Adobe Reader, then contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us

What eBook formats do Packt support?

Our eBooks are currently available in a variety of formats, such as PDF and ePub. In the future, this may well change with trends and developments in technology, but please note that our PDFs are not in Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks?

  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are lower priced than print
  • They save resources and space

What is an eBook?

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply log in to your account and click on the link in Your Download Area. We recommend saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.