Cloud Scale Analytics with Azure Data Services: Build modern data warehouses on Microsoft Azure

Product type Paperback

Published in Jul 2021

Publisher Packt

ISBN-13 9781800562936

Length 520 pages

Edition 1st Edition

Author (1):

Patrik Borosch

Using data lineage

Once the data factory is connected, it sends lineage information to your Purview environment for every pipeline run. To try it out, create a Data Factory pipeline that copies data from one folder to another in your data lake. The quickest way is to use the Copy Data Wizard; alternatively, reuse the MyFirstPipeline pipeline that you created in Chapter 5, Integrating Data into Your Modern Data Warehouse, if you used the data factory there.
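If you would rather see what such a copy pipeline looks like as a definition instead of clicking through the wizard, the following is a minimal sketch that builds a Data Factory-style pipeline document with a single Copy activity as a Python dictionary. The activity name, the dataset names (SourceFolderDataset, TargetFolderDataset), and the delimited text format are assumptions made for illustration; the wizard will generate its own names and settings, and the datasets and linked services they refer to are not shown here.

```python
import json


def build_copy_pipeline(name, source_dataset, sink_dataset):
    """Return a pipeline definition (as a dict) with one Copy activity.

    The dataset names are hypothetical references; the datasets themselves
    would have to exist in the data factory.
    """
    return {
        "name": name,
        "properties": {
            "activities": [
                {
                    "name": "CopyFolderToFolder",  # hypothetical activity name
                    "type": "Copy",
                    "inputs": [
                        {"referenceName": source_dataset, "type": "DatasetReference"}
                    ],
                    "outputs": [
                        {"referenceName": sink_dataset, "type": "DatasetReference"}
                    ],
                    "typeProperties": {
                        # Assumption: both folders hold delimited text (CSV) files.
                        "source": {"type": "DelimitedTextSource"},
                        "sink": {"type": "DelimitedTextSink"},
                    },
                }
            ]
        },
    }


pipeline = build_copy_pipeline(
    "MyFirstPipeline", "SourceFolderDataset", "TargetFolderDataset"
)
print(json.dumps(pipeline, indent=2))
```

The input and output dataset references of the Copy activity are the two ends that you would expect to find again as source and sink in the lineage view.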

When you are finished in the data factory, switch back to Purview and repeat your scan (again, this might take a few minutes). Then search for the newly created file or the pipeline name and, in the asset details, check the Lineage tab:

Figure 14.22 – First lineage overview for a Data Factory Copy pipeline

Look closely at the lineage and you will see that you can drill down to the column level and even reveal the individual column mappings.
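To make the idea of column-level lineage more tangible, here is a small conceptual sketch in Python. It is not the Purview API or its internal data model; it only models a lineage edge as a source asset, a process, and a sink asset, together with the column mappings between them. All file paths and column names are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnMapping:
    source_column: str
    sink_column: str


@dataclass
class LineageEdge:
    process: str  # the pipeline activity that moved the data
    source: str   # upstream asset
    sink: str     # downstream asset
    column_mappings: list = field(default_factory=list)


def upstream_columns(edges, sink, sink_column):
    """Return (source asset, source column) pairs that feed one sink column."""
    return [
        (edge.source, mapping.source_column)
        for edge in edges
        if edge.sink == sink
        for mapping in edge.column_mappings
        if mapping.sink_column == sink_column
    ]


# Hypothetical lineage as a Copy activity might produce it.
edges = [
    LineageEdge(
        process="MyFirstPipeline/CopyFolderToFolder",
        source="raw/sales.csv",
        sink="curated/sales.csv",
        column_mappings=[
            ColumnMapping("cust_id", "CustomerId"),
            ColumnMapping("amt", "Amount"),
        ],
    )
]

print(upstream_columns(edges, "curated/sales.csv", "Amount"))
# prints [('raw/sales.csv', 'amt')]
```

Answering the question "where does this column come from?" is then a simple lookup over the edges, which is essentially what the drill-down in the Lineage tab lets you do visually.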

Imagine the power of this feature, when you...