Azure Data Engineering Cookbook: Get well versed in various data engineering techniques in Azure using this recipe-based guide

Product type Paperback

Published in Sep 2022

Publisher Packt

ISBN-13 9781803246789

Length 608 pages

Edition 2nd Edition

Languages

SQL

Tools

Azure

Concepts

Data Engineering

Authors (3):

Ahmad Osama

Nagaraj Venkatesan

Luca Zanna

Table of Contents (16 Chapters)

Preface

1. Chapter 1: Creating and Managing Data in Azure Data Lake

2. Chapter 2: Securing and Monitoring Data in Azure Data Lake

3. Chapter 3: Building Data Ingestion Pipelines Using Azure Data Factory

4. Chapter 4: Azure Data Factory Integration Runtime

5. Chapter 5: Configuring and Securing Azure SQL Database

6. Chapter 6: Implementing High Availability and Monitoring in Azure SQL Database

7. Chapter 7: Processing Data Using Azure Databricks

8. Chapter 8: Processing Data Using Azure Synapse Analytics

9. Chapter 9: Transforming Data Using Azure Synapse Dataflows

10. Chapter 10: Building the Serving Layer in Azure Synapse SQL Pool

11. Chapter 11: Monitoring Synapse SQL and Spark Pools

12. Chapter 12: Optimizing and Maintaining Synapse SQL and Spark Pools

13. Chapter 13: Monitoring and Maintaining Azure Data Engineering Pipelines

14. Index

15. Other Books You May Enjoy

What this book covers

Chapter 1, Creating and Managing Data in Azure Data Lake, focuses on provisioning Azure Data Lake accounts, uploading data to them, and managing the data life cycle.

Chapter 2, Securing and Monitoring Data in Azure Data Lake, covers securing an Azure Data Lake account using firewalls and private links, accessing data lake accounts using managed identities, and monitoring an Azure Data Lake account using Azure Monitor.

Chapter 3, Building Data Ingestion Pipelines Using Azure Data Factory, covers ingesting data using Azure Data Factory and copying data between Azure SQL Database and Azure Data Lake.

Chapter 4, Azure Data Factory Integration Runtime, focuses on configuring and managing self-hosted integration runtimes and running SSIS packages in Azure using Azure-SSIS integration runtimes.

Chapter 5, Configuring and Securing Azure SQL Database, covers configuring a Serverless SQL database and a Hyperscale SQL database, and securing Azure SQL Database using virtual networks and private links.

Chapter 6, Implementing High Availability and Monitoring in Azure SQL Database, explains configuring high availability for Azure SQL Database using auto-failover groups and read replicas, monitoring Azure SQL Database, and automatically scaling Azure SQL Database during utilization spikes.

Chapter 7, Processing Data Using Azure Databricks, covers integrating Azure Databricks with Azure Data Lake and Azure Key Vault, processing data using Databricks notebooks, working with Delta tables, and visualizing Delta tables using Power BI.

Chapter 8, Processing Data Using Azure Synapse Analytics, covers exploring data using Synapse Serverless SQL pool, processing data using Synapse Spark pools, working with Synapse Lake database, and integrating Synapse Analytics with Power BI.

Chapter 9, Transforming Data Using Azure Synapse Dataflows, focuses on performing transformations using Synapse Dataflows, optimizing data flows using partitioning, and managing dynamic source schema changes using schema drifting.

Chapter 10, Building the Serving Layer in Azure Synapse SQL Pools, covers loading processed data into Synapse dedicated SQL pools, performing data archival using partitioning, managing table distributions, and optimizing performance using statistics and workload management.

Chapter 11, Monitoring Synapse SQL and Spark Pools, covers monitoring Synapse dedicated SQL and Spark pools using Azure Log Analytics workbooks, Kusto scripts, and Azure Monitor, and monitoring Synapse dedicated SQL pools using Dynamic Management Views (DMVs).

Chapter 12, Optimizing and Maintaining Synapse SQL and Spark Pools, offers techniques for tuning query performance by optimizing query plans, rebuilding replication caches, using maintenance scripts to optimize Delta tables, and automatically pausing SQL pools during inactivity, among other things.

Chapter 13, Monitoring and Maintaining Azure Data Engineering Pipelines, covers monitoring and managing end-to-end data engineering pipelines, which includes tracking data lineage using Microsoft Purview and improving the observability of pipeline executions using log analytics and query labeling.