What this book covers
Chapter 1, Introducing Data Meshes, briefly covers the concepts from Zhamak Dehghani's original whitepaper and book on data mesh.
Chapter 2, Building a Data Mesh Strategy, guides you in evaluating your company’s current analytics maturity level, aligning its analytics strategy with the business strategy, and determining how data mesh architecture could play a role in that.
Chapter 3, Deploying Data Mesh Using the Azure Cloud-Scale Analytics Framework, covers Microsoft’s own cloud-scale analytics framework for implementing data mesh.
Chapter 4, Building a Data Mesh Governance Framework Using Microsoft Azure Services, explains that managing federated governance is the key to a successful data mesh implementation. This chapter covers the aspects of data mesh governance and maps them to the Microsoft Azure services that can be used to implement them.
Chapter 5, Security Architecture for Data Meshes, covers the security challenges that come with distributed data. Chapter 4 discusses network security. In this chapter, we will discuss various aspects of data security, such as access control and retention.
Chapter 6, Automating Deployment through Azure Resource Manager and Azure DevOps, looks at how distributed data and analytics bring distributed environments and products. The key to efficiently managing your environment is automation. This chapter walks you through all the aspects of automating the deployment and management of a data mesh.
Chapter 7, Building a Self-Service Portal for Common Data Mesh Operations, explores how data mesh promotes agility and innovation by democratizing data and analytical technologies. One of the ways to empower data mesh users is to give them tools to discover data and deployment environments. A common practice is to build a self-service data mesh portal. This chapter provides guidance on how to design and build a self-service portal.
Chapter 8, How to Design, Build, and Manage Data Contracts, looks at how data mesh federates data ownership. Each team is responsible for the quality and reliability of their own data. In such a scenario, how do you build trust? This chapter discusses the formal method and process of maintaining data contracts and SLAs that help build trust and increase the reliability of data mesh.
Chapter 9, Data Quality Management, explores how, as a data mesh grows, data products become dependent on each other for their outcomes. Some of these products deliver key analytics that are critical to business operations. Poor data quality in one data product could impact multiple products. This chapter showcases how to build or buy an enterprise-class data quality management system.
Chapter 10, Master Data Management, looks at Master Data Management (MDM), which provides a unified, consistent view of critical data entities across the organization; this is essential for data mesh’s principle of domain-oriented decentralized data ownership and architecture. In this chapter, we will look at buy-and-build options for MDM for data mesh.
Chapter 11, Monitoring and Data Observability, covers monitoring and data observability, which are crucial for data mesh as they enable real-time insights into the health, performance, and reliability of data across decentralized domains. This is also one of the most challenging features to implement, because it involves monitoring both the data products and the data itself. In this chapter, we will design a Data Mesh Operations Center (DMOC) to consolidate all the monitoring aspects into a single pane of glass.
Chapter 12, Monitoring Data Mesh Costs and Building a Cross-Charging Model, covers how analytical systems are typically cost centers. They are investments, and there are many ways to manage and distribute costs. This chapter looks at various cost models, systems of monitoring costs, and ways of distributing the costs of shared and individual components.
Chapter 13, Understanding Data-Sharing Topologies in a Data Mesh, looks at how data mesh aims to minimize the movement of data across the enterprise by introducing the concept of in-place sharing. However, in-place sharing has its limitations and challenges. This chapter discusses various data-sharing topologies and describes the different scenarios for using each topology.
Chapter 14, Advanced Analytics Using Azure Machine Learning, Databricks, and the Lakehouse Architecture, is a reference chapter that describes one of the most commonly used architectures for advanced analytics: the lakehouse architecture. The lakehouse architecture combines the scalable storage capabilities of a data lake with the data management and ACID transaction features of a data warehouse, enabling both analytical and transactional workloads on the same platform.
Chapter 15, Big Data Analytics Using Azure Synapse Analytics, covers big data processing, a common scenario in most companies today. This reference chapter discusses a possible architecture with Azure Synapse Analytics.
Chapter 16, Event-Driven Analytics Using Azure Event Hubs, Azure Stream Analytics, and Azure Machine Learning, looks at how certain areas, such as social media data analysis, logistics, and supply chain, require the real-time or near-real-time analysis of data. This kind of data processing needs different kinds of services and storage. This chapter discusses these event processing components and how to lay them out in a real-time analytics architecture.
Chapter 17, AI Using Azure Cognitive Services and Azure OpenAI, looks at how AI and machine learning have very different needs when it comes to data processing. They need quick cycles of training and re-training as data and models drift with time. Large language models bring in concepts such as prompt engineering and chaining. This chapter describes modern architectures for building Azure Cognitive Services- and Azure OpenAI-based models for natural-language-based interactions with your corporate data.