Hosting common data pipeline templates
After exploring the data mesh and finding the right data for their project, the next step for data product teams is to access that data directly or move it to their data product landing zone. Small or medium-sized datasets kept in databases or data lakes can sometimes be read straight into a data frame in a Python notebook by using a connection string. But for large datasets, and for data coming from on-premises legacy systems or enterprise resource planning (ERP) and customer relationship management (CRM) systems hosted outside the data mesh, you need pipelines.
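As a minimal sketch of that direct-access path, the snippet below reads a table from an Azure SQL database into a pandas data frame through a SQLAlchemy connection string. The server, database, user, table, and the SQL_PASSWORD environment variable are all placeholders; substitute whatever the data product publishes in the mesh catalog, and whatever authentication method your organization mandates.

```python
import os

import pandas as pd
from sqlalchemy import create_engine

# Assumption: the credential is supplied via an environment variable rather
# than hard-coded in the notebook.
password = os.environ["SQL_PASSWORD"]

# Placeholder server, database, and user -- replace with the values published
# for the data product you discovered in the mesh catalog.
engine = create_engine(
    f"mssql+pyodbc://analyst:{password}@dataproduct-sql.database.windows.net:1433/sales_db"
    "?driver=ODBC+Driver+18+for+SQL+Server"
)

# Pull a manageable slice of the table straight into a data frame for exploration.
df = pd.read_sql("SELECT TOP 1000 * FROM dbo.Orders", engine)
print(df.shape)
print(df.head())
```

This pattern only makes sense while the data fits comfortably in the notebook's memory; anything larger, or anything that has to be refreshed on a schedule, is pipeline territory.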
In Azure, these pipelines are typically built using Azure Data Factory. Just as the sources of these pipelines are common across data products, the storage where the data lands is also fairly standard: typically a data lake or a SQL database. If each data product team starts building pipelines to...
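One way to avoid every team authoring the same ingestion logic is to publish a parameterized pipeline template once in the landing zone and let teams reuse it. The sketch below is one possible shape of such a template using the azure-mgmt-datafactory Python SDK (assuming a recent SDK version together with azure-identity). The subscription, resource group, factory, pipeline, and dataset names are placeholders, and the two datasets referenced are assumed to already exist in the factory.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    BlobSink,
    BlobSource,
    CopyActivity,
    DatasetReference,
    ParameterSpecification,
    PipelineResource,
)

# Placeholder identifiers -- substitute the landing zone's subscription,
# resource group, and Data Factory instance.
subscription_id = "<subscription-id>"
resource_group = "dp-landing-zone-rg"
factory_name = "dp-landing-zone-adf"

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

# A single copy step: read from a registered source dataset and land the data
# in the landing zone's lake storage dataset. Both datasets are assumed to be
# defined already in the factory.
copy_step = CopyActivity(
    name="CopySourceToLandingZone",
    inputs=[DatasetReference(type="DatasetReference", reference_name="SourceSystemDataset")],
    outputs=[DatasetReference(type="DatasetReference", reference_name="LandingZoneLakeDataset")],
    source=BlobSource(),
    sink=BlobSink(),
)

# Publish the template as one pipeline with a parameter that consuming teams
# can override; in a real template the datasets would reference this parameter
# to pick the folder or table to ingest.
pipeline = PipelineResource(
    activities=[copy_step],
    parameters={"source_folder": ParameterSpecification(type="String")},
)

adf_client.pipelines.create_or_update(
    resource_group, factory_name, "IngestToLandingZoneTemplate", pipeline
)
```

Publishing the template centrally means the source and sink conventions are encoded once, and each team only supplies the parameters that differ for its data product.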