You're reading from Engineering Data Mesh in Azure Cloud Implement data mesh using Microsoft Azure's Cloud Adoption Framework

Product type Paperback

Published in Mar 2024

Publisher Packt

ISBN-13 9781805120780

Length 314 pages

Edition 1st Edition

Languages

Python

Tools

Azure

Concepts

Data Science

Author (1):

Deswandikar

Table of Contents (23) Chapters

Preface

1. Part 1: Rolling Out the Data Mesh in the Azure Cloud

2. Chapter 1: Introducing Data Meshes

3. Chapter 2: Building a Data Mesh Strategy

4. Chapter 3: Deploying a Data Mesh Using the Azure Cloud-Scale Analytics Framework

5. Chapter 4: Building a Data Mesh Governance Framework Using Microsoft Azure Services

6. Chapter 5: Security Architecture for Data Meshes

7. Chapter 6: Automating Deployment through Azure Resource Manager and Azure DevOps

8. Chapter 7: Building a Self-Service Portal for Common Data Mesh Operations

9. Part 2: Practical Challenges of Implementing a Data Mesh

10. Chapter 8: How to Design, Build, and Manage Data Contracts

11. Chapter 9: Data Quality Management

12. Chapter 10: Master Data Management

13. Chapter 11: Monitoring and Data Observability

14. Chapter 12: Monitoring Data Mesh Costs and Building a Cross-Charging Model

15. Chapter 13: Understanding Data-Sharing Topologies in a Data Mesh

16. Part 3: Popular Data Product Architectures

17. Chapter 14: Advanced Analytics Using Azure Machine Learning, Databricks, and the Lakehouse Architecture

18. Chapter 15: Big Data Analytics Using Azure Synapse Analytics

19. Chapter 16: Event-Driven Analytics Using Azure Event Hubs, Azure Stream Analytics, and Azure Machine Learning

20. Chapter 17: AI Using Azure Cognitive Services and Azure OpenAI

21. Index

22. Other Books You May Enjoy

Collecting and managing metadata

In the previous section, we looked at how data can be cataloged using Microsoft Purview. The built-in Microsoft Purview scanners scan data sources and ingest basic technical metadata, including file types, column names, column types, and basic out-of-the-box classifications. However, this initial technical metadata is derived purely from the definitions available in the data source itself. Some data sources, such as Microsoft SQL Server, maintain a significant amount of metadata about the schema and its relationships. Others, such as CSV files stored in blob storage, carry nothing beyond a column header. Hence, after the initial scan and ingest cycle, the governance team needs to edit and enhance the metadata to make the data assets more meaningful.
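To make the gap between scanned and curated metadata concrete, here is a minimal Python sketch. It does not call Microsoft Purview or any of its APIs; the function names `scan_csv_header` and `enrich`, the sample column names, and the dictionary layout are all hypothetical, chosen only to illustrate the idea that a bare CSV file yields column names and little else, and that descriptions and types have to be added afterward by a person.

```python
import csv
import io


def scan_csv_header(text):
    """Return the only technical metadata a bare CSV carries: its column names.

    Hypothetical helper for illustration; this is not how Purview scans data.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    # Type and description are unknown at scan time for a plain CSV file.
    return [{"name": col, "type": None, "description": None} for col in header]


def enrich(columns, curated):
    """Merge curator-supplied fields into scanned columns without mutating the input."""
    return [{**col, **curated.get(col["name"], {})} for col in columns]


# Assumed sample data: cryptic column names are typical of raw extracts.
SAMPLE_CSV = "cust_id,dob,seg\n1001,1984-02-11,B2\n"

scanned = scan_csv_header(SAMPLE_CSV)

# Assumed curation input, standing in for edits made by the governance team.
curated = {
    "cust_id": {"type": "integer", "description": "Unique customer identifier"},
    "dob": {"type": "date", "description": "Customer date of birth"},
}

enriched = enrich(scanned, curated)

for column in enriched:
    print(column)
```

In this sketch, the `seg` column keeps an empty description after enrichment because nobody has curated it yet, which is the kind of gap the governance team would still need to close.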

The real advantage of cataloging data and making it searchable is to make data more meaningful to the users. Users searching...