Azure Data Factory Cookbook, Second Edition
Build ETL, Hybrid ETL, and ELT pipelines using ADF, Synapse Analytics, Fabric and Databricks

Product type: Paperback
Published in: Feb 2024
Publisher: Packt
ISBN-13: 9781803246598
Length: 532 pages
Authors: Tonya Chernyshova, Xenia Ireton, Dmitry Foshin, Dmitry Anoshin
Table of Contents
Preface
1. Getting Started with ADF
2. Orchestration and Control Flow
3. Setting Up Synapse Analytics
4. Working with Data Lake and Spark Pools
5. Working with Big Data and Databricks
6. Data Migration – Azure Data Factory and Other Cloud Services
7. Extending Azure Data Factory with Logic Apps and Azure Functions
8. Microsoft Fabric and Power BI, Azure ML, and Cognitive Services
9. Managing Deployment Processes with Azure DevOps
10. Monitoring and Troubleshooting Data Pipelines
11. Working with Azure Data Explorer
12. The Best Practices of Working with ADF
13. Other Books You May Enjoy
14. Index

Technical requirements

NOTE

To make the recipes easier to follow, we make naming suggestions for the accounts, pipelines, and so on throughout the chapter. Many services, such as Azure Storage and SQL Server, require that the names you assign are unique. Follow your own preferred naming conventions, making appropriate substitutions as you work through the recipes. For the Azure resource naming rules, refer to the documentation at https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/resource-name-rules.

In addition to Azure Data Factory, we will be using three other Azure services: Logic Apps, Blob Storage, and Azure SQL Database. You will need to have an Azure Blob Storage account and an Azure SQL Database instance set up to follow the recipes. The following steps describe the necessary preparation; an Azure CLI sketch of the same steps follows the list:

  • Create an Azure Blob Storage account and name it adforchestrationstorage. When creating the storage account, select the same region (that is, East US) as you selected when you created the Data Factory instance. This will reduce our costs when moving data.
  • Create a container named data within this storage account, and upload two CSV files to it: airlines.csv and countries.csv (the files can be found on GitHub: https://github.com/PacktPublishing/Azure-Data-Factory-Cookbook/tree/master/data).
  • Create an Azure SQL Database instance and name it AzureSQLDatabase. When you create the Azure SQL Database instance, you will have the option of creating a server on which the SQL database will be hosted. Create that server and take note of the credentials you entered. You will need these credentials later when you log in to your database.
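
If you prefer to script this setup rather than click through the portal, the following Azure CLI sketch covers the same steps. The resource group name (adf-cookbook-rg), the logical SQL server name (adfcookbooksqlserver), and the admin credentials are illustrative placeholders, not names from the book; substitute your own values.

    # Create the storage account in the same region as the Data Factory instance (East US)
    az storage account create --name adforchestrationstorage \
        --resource-group adf-cookbook-rg --location eastus --sku Standard_LRS

    # Create the data container and upload the two CSV files downloaded from GitHub
    az storage container create --name data --account-name adforchestrationstorage
    az storage blob upload --account-name adforchestrationstorage --container-name data \
        --file airlines.csv --name airlines.csv
    az storage blob upload --account-name adforchestrationstorage --container-name data \
        --file countries.csv --name countries.csv

    # Create the logical SQL server and the AzureSQLDatabase database (basic tier to save on costs)
    az sql server create --name adfcookbooksqlserver --resource-group adf-cookbook-rg \
        --location eastus --admin-user sqladminuser --admin-password '<your-strong-password>'
    az sql db create --name AzureSQLDatabase --server adfcookbooksqlserver \
        --resource-group adf-cookbook-rg --service-objective Basic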

Choose the basic configuration for your SQL server to save on costs. Once your instance is up and running, configure the Networking settings for the SQL server as highlighted in Figure 2.1: go to the Networking page under the Security menu, then, under Firewall rules, create a rule to allow your IP address to access the database. Under Exceptions, make sure that you check the Allow Azure services and resources to access this database option. An equivalent Azure CLI sketch follows Figure 2.1.

Figure 2.1: Firewall configuration
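
If you are scripting the setup, the same firewall configuration can be applied with the Azure CLI. The server and resource group names below are the placeholders from the earlier sketch, and 203.0.113.10 stands in for your own public IP address; the 0.0.0.0 rule is how the CLI expresses the Allow Azure services and resources to access this database exception.

    # Allow your own public IP address to reach the database
    az sql server firewall-rule create --resource-group adf-cookbook-rg \
        --server adfcookbooksqlserver --name AllowMyIP \
        --start-ip-address 203.0.113.10 --end-ip-address 203.0.113.10

    # Allow Azure services and resources to access this server
    az sql server firewall-rule create --resource-group adf-cookbook-rg \
        --server adfcookbooksqlserver --name AllowAzureServices \
        --start-ip-address 0.0.0.0 --end-ip-address 0.0.0.0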

Download the following SQL scripts from GitHub at https://github.com/PacktPublishing/Azure-Data-Factory-Cookbook/tree/master/Chapter02/sql-scripts:

  • CreateAirlineTable.sql and CreateCountryTable.sql: These scripts will add two tables, Country and Airline, which are used in several recipes, including the first one.
  • CreateMetadataTable.sql: This will create the FileMetadata table and a stored procedure to insert data into that table. This table is necessary for the Using Metadata and Stored Procedure activities and Filtering your data and looping through your files recipes.
  • CreateActivityLogsTable.sql: This will create the PipelineLog table and a stored procedure to insert data into that table. This table is necessary for the Chaining and branching activities within your pipeline recipe.
  • CreateEmailRecipients.sql: This script will create the EmailRecipients table and populate it with a record. This table is used in the Using the Lookup, Web, and Execute Pipeline activities recipe. You will need to edit it to enter email recipient information.

To create the tables, open your Azure SQL Database instance, go to the Query editor page, then paste in each of the downloaded SQL scripts and run them one by one.
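
Alternatively, if you have the sqlcmd utility installed, you can run the downloaded scripts from the command line instead of the Query editor. The server name and credentials below are the same illustrative placeholders used in the earlier sketches; repeat the command for each of the four script files.

    # Run a downloaded script against AzureSQLDatabase (repeat for the remaining .sql files)
    sqlcmd -S adfcookbooksqlserver.database.windows.net -d AzureSQLDatabase \
        -U sqladminuser -P '<your-strong-password>' -i CreateAirlineTable.sql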

Now that we’re all set up, let’s move on to the first recipe.
