Getting to Know Your Data
In this chapter, we explore features of the Databricks DI Platform that help improve and monitor data quality and facilitate data exploration. Databricks offers several ways to get to know your data better. First, we cover how to oversee data quality with Delta Live Tables (DLT) so that quality issues are caught early, before they contaminate entire pipelines. Next, we take our first close look at Lakehouse Monitoring, which helps us analyze how data changes over time and can alert us when a change warrants attention. Lakehouse Monitoring is a big time-saver: it lets us focus on mitigating or responding to data changes rather than creating notebooks that calculate standard metrics.
Moving on to data exploration, we will look at a couple of low-code approaches: Databricks Assistant and AutoML...