Integrating Jupyter/Python Notebooks into a Data Pipeline
Integrating Jupyter/Python notebooks into a data pipeline provides flexibility, transparency, and efficiency throughout the data-processing life cycle. It bridges the gap between exploration, development, and production, making it an essential practice in data-engineering workflows. In Azure Data Factory (ADF), this integration is achieved with the Spark activity, which requires an Azure HDInsight Spark cluster for the example in this section.
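As a rough sketch of what this looks like in practice, the pipeline JSON below defines an HDInsightSpark activity that runs a Python script (for example, one exported from a notebook) stored in Azure Storage. The linked-service names and paths here are illustrative, not prescribed by the exam or this section:

```json
{
  "name": "RunNotebookScript",
  "type": "HDInsightSpark",
  "linkedServiceName": {
    "referenceName": "HDInsightSparkLinkedService",
    "type": "LinkedServiceReference"
  },
  "typeProperties": {
    "rootPath": "adfspark",
    "entryFilePath": "notebook_script.py",
    "sparkJobLinkedService": {
      "referenceName": "AzureStorageLinkedService",
      "type": "LinkedServiceReference"
    }
  }
}
```

`rootPath` is the container/folder in the storage account referenced by `sparkJobLinkedService`, and `entryFilePath` is the Python file the Spark cluster executes.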
Note
This section primarily focuses on the "Integrate Jupyter or Python notebooks into a data pipeline" concept of the DP-203: Data Engineering on Microsoft Azure exam.
The prerequisites for integrating Jupyter notebooks are as follows:
- Create a linked service to Azure Storage
- Create an HDInsight linked service from ADF
- Have an HDInsight Spark cluster running
Note
You have already learned how linked services are created in the Data Ingestion section,...
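Because the Spark activity executes a plain Python script rather than a notebook file, the notebook is typically exported to a `.py` script first (the standard tool for this is `jupyter nbconvert --to script`). As a minimal stdlib-only sketch of what that conversion does, the helper below (a hypothetical function, not part of any Azure SDK) reads the notebook's JSON and keeps only its code cells:

```python
import json


def notebook_to_script(ipynb_path: str, py_path: str) -> None:
    """Extract the code cells of a Jupyter notebook (.ipynb) into a plain
    Python script that an ADF Spark activity can use as its entry file."""
    with open(ipynb_path, encoding="utf-8") as f:
        nb = json.load(f)

    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            # In the notebook format, "source" is a list of lines
            # (older notebooks may store a single string).
            src = cell["source"]
            chunks.append("".join(src) if isinstance(src, list) else src)

    with open(py_path, "w", encoding="utf-8") as f:
        f.write("\n\n".join(chunks) + "\n")
```

The resulting script can then be uploaded to the storage path that the Spark activity's `entryFilePath` points to.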