Improving data quality using Glue Data Quality
Data quality is one of the most important components of data governance, and no organization can afford to ignore it. To be a world-class data-driven organization, the data used to derive insights must be accurate enough to yield reliable results. However, data analytics platforms collect, process, and consume data from many source systems, each with its own data formats and quality challenges. Data quality is therefore a high-priority data governance measure that needs to be implemented judiciously.
Glue Data Quality
AWS recently introduced a feature in the Glue service, AWS Glue Data Quality, that helps with data quality right inside the data pipelines. Let’s discuss it by walking through a use case from GreatFin.
Use case for data quality using AWS Glue Data Quality
One of the LOBs for GreatFin has architected a data lake on Amazon S3 and designed all the layers of the data lake. They will bring all the data from all the source...