To get the most out of this book
This book is for data engineers, data analysts, and data scientists who are familiar with Spark and have some knowledge of Azure. If you understand the basics of Spark and want to learn how to build an end-to-end data pipeline in Azure Databricks and productionize it, then this book is for you. Data scientists and business analysts with some knowledge of SQL who want to run ad hoc queries on large volumes of data using Databricks SQL will also find this book useful.
Basic knowledge of Python and PySpark is all you need to understand and execute the code.
If you are using the digital version of this book, we advise you to type the code yourself or access the code via the GitHub repository (link available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
Download the example code files
You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Azure-Databricks-Cookbook. If the code is updated, the changes will be made in this same GitHub repository.
Code bundles for other titles in our catalog of books and videos are available at https://github.com/PacktPublishing/. Check them out!