Leveraging automatic constraint suggestion
Deequ can analyze a dataset and suggest constraints that can then be applied as checks. To see how it works, we will use the flights data once again.
In Chapter 4, we defined an interface for working with databases; we will use it here to create a dataframe. We will then pass that dataframe to ConstraintSuggestionRunner
so that Deequ can suggest constraints.
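The core of this flow can be sketched on its own, as a minimal example under stated assumptions. The sketch builds a tiny in-memory dataframe instead of reading the flights table through the Chapter 4 database interface; the column names (carrier, origin, dep_delay), the sample rows, and the object name SuggestionSketch are all hypothetical. It assumes Spark and Deequ are on the classpath and uses Deequ's default rule set (Rules.DEFAULT).

```scala
import com.amazon.deequ.suggestions.{ConstraintSuggestionRunner, Rules}
import org.apache.spark.sql.SparkSession

object SuggestionSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; the chapter uses its own helper instead.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("suggestion-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for the flights data (assumed columns and values).
    val flights = Seq(
      ("AA", "JFK", 12),
      ("DL", "ATL", 0),
      ("UA", "ORD", 45)
    ).toDF("carrier", "origin", "dep_delay")

    // Profile the dataframe and apply the default suggestion rules.
    val result = ConstraintSuggestionRunner()
      .onData(flights)
      .addConstraintRules(Rules.DEFAULT)
      .run()

    // Suggestions are grouped by column; each one carries a description
    // and the Scala code for the corresponding check.
    result.constraintSuggestions.foreach { case (column, suggestions) =>
      suggestions.foreach { suggestion =>
        println(s"$column: ${suggestion.description}")
        println(s"  code: ${suggestion.codeForConstraint}")
      }
    }

    spark.stop()
  }
}
```

The suggestions depend on the data that is profiled, so a three-row sample like this one will likely produce different (and less trustworthy) constraints than the full flights table.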
Here is the complete code for it:
package com.packt.dewithscala.chapter7

import com.packt.dewithscala.utils._

import com.amazon.deequ.suggestions.ConstraintSuggestionResult
import com.amazon.deequ.suggestions.ConstraintSuggestionRunner
import com.amazon.deequ.suggestions.Rules

import org.apache.spark.sql.functions._
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.DataFrame

object ConstraintSuggestion extends App {

  val session: SparkSession = Spark.initSparkSession("de-with-scala")

  val...