The ML Process and Its Challenges
Welcome to the world of simplifying your machine learning (ML) life cycle with the Databricks platform.
I am a senior specialist solutions architect at Databricks specializing in ML. Over the years, I have collaborated with enterprises to architect ML-capable platforms that solve their unique business use cases on the Databricks platform. This book puts that experience at your service. The knowledge you gain from it can open new career opportunities and change how you approach architecting ML pipelines for your organization’s ML use cases.
This book assumes that you have a reasonable understanding of Python, as the accompanying code samples are written in Python. It does not teach ML techniques from scratch; it assumes that you are an experienced data science practitioner who wants to learn how to take ML use cases from development to production, and all the steps in between, using the Databricks platform.
Some Python and pandas know-how is required, and a solid grasp of ML and data science is necessary. Familiarity with Apache Spark is a plus.
Note
This book focuses on features that are generally available (GA) at the time of writing. The code examples use Databricks notebooks. While Databricks is actively developing features to support workflows that use external integrated development environments (IDEs), those features are not covered in this book. Working through this book will also give you a solid foundation for quickly picking up new features as they become GA.
In this chapter, we will cover the following:
- Understanding the typical ML process
- Discovering the personas involved with the machine learning process in organizations
- Challenges with productionizing machine learning use cases in organizations
- Understanding the requirements of an enterprise machine learning platform
- Exploring Databricks and the Lakehouse architecture
By the end of this chapter, you should have a fundamental understanding of what a typical ML development life cycle looks like in an enterprise and the different personas involved in it. You will also know why most ML projects fail to deliver business value and how the Databricks Lakehouse Platform provides a solution.