Preface
Designed for seasoned data scientists and developers, this book is your definitive guide to leveraging Databricks for end-to-end machine learning projects. It assumes a solid foundation in Python, statistics, and the machine learning life cycle, along with an introductory understanding of Spark, and aims to help professionals move from DIY environments or other cloud platforms to the Databricks ecosystem.
Kick off your journey with a succinct overview of the machine learning landscape, followed by a deep dive into Databricks’ features and the MLflow framework. Work through crucial elements including data preparation, model selection, and training, while using the Databricks Feature Store for efficient feature engineering. Use Databricks AutoML to get your projects started quickly, and learn how to automate model retraining and deployment with Databricks Workflows.
By the end of this book, you’ll be well versed in using MLflow for experiment tracking and inter-team collaboration, and in addressing advanced needs such as model interpretability and governance. The book is full of practical code examples and focuses on current, generally available features, while equipping you to adapt quickly to emerging technologies in machine learning, Databricks, and MLflow.