Azure Data Engineering Cookbook

You're reading from Azure Data Engineering Cookbook: Get well versed in various data engineering techniques in Azure using this recipe-based guide

Product type Paperback
Published in Sep 2022
Publisher Packt
ISBN-13 9781803246789
Length 608 pages
Edition 2nd Edition
Authors (3):
Ahmad Osama
Nagaraj Venkatesan
Luca Zanna


Accessing Blob storage accounts using managed identities

In this recipe, we will grant permissions to managed identities on a storage account and showcase how you can use managed identities to connect to Azure Data Lake.

Managed identities are password-less service accounts used by Azure services such as Data Factory and Azure VMs to access other Azure services, such as Blob storage. In this recipe, we will show you how Azure Data Factory's managed identity can be granted permission on an Azure Blob storage account.

Getting ready

Before you start, perform the following steps:

  1. Open a web browser and go to the Azure portal at https://portal.azure.com.
  2. Make sure you have an existing storage account. If not, create one using the Provisioning an Azure storage account using the Azure portal recipe in Chapter 1, Creating and Managing Data in Azure Data Lake.
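If you prefer to confirm the prerequisite from PowerShell rather than the portal, a minimal sketch follows. It assumes the Az PowerShell module is installed and that you have already signed in with Connect-AzAccount. The resource group name is the one used later in this recipe and the storage account name is the one selected in the linked service step; placing both in the same resource group is an assumption, so substitute your own values.

```powershell
# Assumption: the storage account lives in the resource group used in this recipe.
$resourceGroupName  = "packtadestorage"
$storageAccountName = "packadestoragev2"

# Look up the account; suppress the error so a missing account can be reported cleanly.
$storageAccount = Get-AzStorageAccount -ResourceGroupName $resourceGroupName `
    -Name $storageAccountName -ErrorAction SilentlyContinue

if ($null -eq $storageAccount) {
    Write-Warning "Storage account '$storageAccountName' was not found in '$resourceGroupName'."
}
else {
    Write-Host "Found $($storageAccount.StorageAccountName) in $($storageAccount.Location)."
}
```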

How to do it…

We will test access to a Data Lake account using a managed identity. To do this, we will create a Data Factory account and use its managed identity to access the Data Lake account. Perform the following steps:

  1. Create an Azure Data Factory by using the following PowerShell command:
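    # Resource group, region, and Data Factory name used in this recipe
    $resourceGroupName = "packtadestorage"
    $location = "east us"
    $dataFactoryName = "ADFPacktADE2"
    # Create the Data Factory and keep the returned object for later use
    $DataFactory = Set-AzDataFactoryV2 -ResourceGroupName $resourceGroupName -Location $location -Name $dataFactoryName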
  2. Go to the storage account in the Azure portal. Click on Access Control (IAM) and then Add, as shown in the following screenshot:
Figure 2.22 – Adding a role to a managed identity

  3. Select Add role assignment and search for the Storage Blob Data Contributor role. Select the role and click Next. Select Managed identity in Assign access to and click on + Select members, as shown in the following screenshot:
Figure 2.23 – Selecting the Data Factory managed identity

  4. Your subscription should be selected by default. From the Managed identity dropdown, select Data Factory (V2) (1). Select the recently created ADFPacktADE2 Data Factory and click on the Select button:
Figure 2.24 – Assigning a role to a managed identity

  5. Click on Review + Assign to complete the assignment. To test whether it's working, open the ADFPacktADE2 Data Factory that was created in step 1. Click on Open Azure Data Factory Studio, as shown in the next screenshot:
Figure 2.25 – Opening Azure Data Factory Studio

  6. Click on the Manage button on the left and then Linked services. Click on + New, as shown in the following screenshot:
Figure 2.26 – Creating a linked service in Data Factory

  7. Search for Data Lake and select Azure Data Lake Storage Gen2 as the data store. Select Managed Identity for Authentication method. Select the storage account (packadestoragev2) for Storage account name. Click on Test connection:
Figure 2.27 – Testing a managed identity connection in Data Factory

A successful test connection confirms that Data Factory can connect to the storage account using its managed identity.
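The role assignment performed through the portal in the steps above can also be scripted. The following is a sketch rather than the recipe's own method: it assumes the Az PowerShell module, an authenticated session (Connect-AzAccount), permission to create role assignments on the storage account, and that the storage account sits in the same resource group as the Data Factory (an assumption; adjust the names to your environment).

```powershell
# Names taken from this recipe; the shared resource group is an assumption.
$resourceGroupName  = "packtadestorage"
$dataFactoryName    = "ADFPacktADE2"
$storageAccountName = "packadestoragev2"

# Fetch the Data Factory (to read its managed identity) and the storage account (for its resource ID).
$dataFactory    = Get-AzDataFactoryV2 -ResourceGroupName $resourceGroupName -Name $dataFactoryName
$storageAccount = Get-AzStorageAccount -ResourceGroupName $resourceGroupName -Name $storageAccountName

# Grant the managed identity the Storage Blob Data Contributor role, scoped to the storage account only.
$assignment = @{
    ObjectId           = $dataFactory.Identity.PrincipalId
    RoleDefinitionName = "Storage Blob Data Contributor"
    Scope              = $storageAccount.Id
}
New-AzRoleAssignment @assignment

# List the identity's role assignments at that scope to confirm the grant.
Get-AzRoleAssignment -ObjectId $dataFactory.Identity.PrincipalId -Scope $storageAccount.Id
```

Scoping the assignment to the storage account's resource ID, rather than to the resource group or subscription, keeps the identity's access limited to the one account this recipe needs. New role assignments can take a few minutes to take effect.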

How it works…

A managed identity for the Data Factory was created automatically when the Data Factory account was created. We granted this managed identity the Storage Blob Data Contributor role on the Azure Data Lake storage account. As a result, Data Factory could connect to the storage account securely, without using a key or password.
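To illustrate why no key or password is involved, here is a hedged sketch of an equivalent linked service created from PowerShell. The definition contains only the storage endpoint URL and no credential, so Data Factory falls back to its managed identity. The linked service name and the temporary file name are hypothetical, the endpoint is derived from the storage account name used in this recipe, and the Az module with an authenticated session is assumed.

```powershell
$resourceGroupName = "packtadestorage"
$dataFactoryName   = "ADFPacktADE2"

# Linked service definition for Azure Data Lake Storage Gen2 (type AzureBlobFS).
# No account key or service principal is supplied, only the endpoint URL.
$definition = @{
    name       = "AzureDataLakeStorageMI"   # hypothetical linked service name
    properties = @{
        type           = "AzureBlobFS"
        typeProperties = @{
            url = "https://packadestoragev2.dfs.core.windows.net"
        }
    }
}

# Write the definition to a temporary JSON file (hypothetical file name).
$definitionFile = Join-Path ([System.IO.Path]::GetTempPath()) "adls-linked-service.json"
$definition | ConvertTo-Json -Depth 5 | Set-Content -Path $definitionFile

# Create the linked service in the Data Factory from the definition file.
Set-AzDataFactoryV2LinkedService -ResourceGroupName $resourceGroupName `
    -DataFactoryName $dataFactoryName `
    -Name $definition.name `
    -DefinitionFile $definitionFile
```

This only creates the linked service; it does not run the connection test. Use Test connection in Azure Data Factory Studio, as in the last step of the recipe, to confirm that the role assignment is in effect.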
