Azure Data Engineering Cookbook

Chapter 2: Working with Relational Databases in Azure

Microsoft Azure provides Azure SQL Database, Azure Database for PostgreSQL, and Azure Database for MySQL as Database-as-a-Service offerings. We can create an instance of these databases without worrying about installation, administration, infrastructure, or upgrades.

Needless to say, we can also install any of the available relational database management system (RDBMS) products, such as Oracle or DB2, on an Azure virtual machine (VM).

In a data pipeline, we can use any of the RDBMS databases as either a source or a destination. In this chapter, we'll learn how to provision, connect to, manage, maintain, and secure these databases.

We'll cover the following recipes in this chapter:

  • Provisioning and connecting to an Azure SQL database using PowerShell
  • Provisioning and connecting to an Azure PostgreSQL database using the Azure CLI
  • Provisioning and connecting to an Azure MySQL database using the Azure CLI
  • Implementing active geo-replication for an Azure SQL database using PowerShell
  • Implementing an auto-failover group for an Azure SQL database using PowerShell
  • Implementing vertical scaling for an Azure SQL database using PowerShell
  • Implementing an Azure SQL database elastic pool using PowerShell
  • Monitoring an Azure SQL database using the Azure portal

Provisioning and connecting to an Azure SQL database using PowerShell

In this recipe, we'll learn how to create and connect to an Azure SQL Database instance. Azure SQL Database comes in three flavors: standalone Azure SQL Database, Azure SQL Database elastic pools, and managed instances. In this recipe, we'll create a standalone Azure SQL database.

Getting ready

In a new PowerShell window, execute the Connect-AzAccount command to log in to your Microsoft Azure account.

How to do it…

Let's begin by provisioning Azure SQL Database.

Provisioning Azure SQL Database

The steps are as follows:

  1. Execute the following PowerShell command to create a new resource group:
    New-AzResourceGroup -Name packtade -Location "central us"
  2. Execute the following command to create a new Azure SQL server:
    #create credential object for the Azure SQL Server admin credential
    $sqladminpassword = ConvertTo-SecureString 'Sql@Server@1234' -AsPlainText -Force
    $sqladmincredential = New-Object System.Management.Automation.PSCredential ('sqladmin', $sqladminpassword)
    # create the azure sql server
    New-AzSqlServer -ServerName azadesqlserver -SqlAdministratorCredentials $sqladmincredential -Location "central us" -ResourceGroupName packtade

    You should get a similar output as shown in the following screenshot:

    Figure 2.1 – Creating a new Azure SQL server

  3. Execute the following command to create a new Azure SQL database:
    New-AzSqlDatabase -DatabaseName azadesqldb -Edition basic -ServerName azadesqlserver -ResourceGroupName packtade

    You should get an output as shown in the following screenshot:

Figure 2.2 – Creating a new Azure SQL database

Connecting to an Azure SQL database

To connect to an Azure SQL database, let's first whitelist the IP in the Azure SQL Server firewall:

  1. Execute the following command in the PowerShell window to whitelist your machine's public IP in the Azure SQL Server firewall. This recipe assumes that you are connecting from your local system; to connect from a different system, change the IP in the following command:
    $clientip = (Invoke-RestMethod -Uri https://ipinfo.io/json).ip
    New-AzSqlServerFirewallRule -FirewallRuleName "home" -StartIpAddress $clientip -EndIpAddress $clientip -ServerName azadesqlserver -ResourceGroupName packtade

    You will get an output similar to the one shown in the following screenshot:

    Figure 2.3 – Creating a new Azure SQL Server firewall rule

  2. Execute the following command to connect to an Azure SQL database from SQLCMD (SQLCMD comes with the SQL Server installation, or you can download the SQLCMD utility from https://docs.microsoft.com/en-us/sql/tools/sqlcmd-utility?view=sql-server-ver15):
    sqlcmd -S "azadesqlserver.database.windows.net" -U sqladmin -P "Sql@Server@1234" -d azadesqldb

    Here's the output:

Figure 2.4 – Connecting to an Azure SQL database

How it works…

We first execute the New-AzSqlServer command to provision a new Azure SQL server. The command accepts the server name, location, resource group, and login credentials. An Azure SQL server, unlike an on-premises SQL server, is not a physical machine or VM that is accessible to customers.

We then execute the New-AzSqlDatabase command to create an Azure SQL database. This command accepts the database name, the Azure SQL server name, the resource group, and the edition. There are multiple SQL database editions to choose from based on the application workload. For the sake of this demo, we create a Basic edition database.

To connect to an Azure SQL database, we first need to whitelist the machine's IP in the Azure SQL Server firewall. Only whitelisted IPs are allowed to connect to the database. To whitelist the client's public IP, we use the New-AzSqlServerFirewallRule command. This command accepts the server name, resource group, and start and end IPs. We can whitelist either a single IP or a range of IPs.

We can connect to an Azure SQL database from SQL Server Management Studio, SQLCMD, or a programming language using the appropriate SQL Server drivers. When connecting, we specify the server name as <azure sql server name>.database.windows.net, and then specify the Azure SQL database to connect to.
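
We can verify the deployment from a programming language as well. The following is a minimal sketch that queries the new database from Windows PowerShell through the System.Data.SqlClient ADO.NET provider, assuming the server, database, and credentials created in this recipe:

    # Build a connection string for the database created in this recipe
    $connectionString = "Server=azadesqlserver.database.windows.net;Database=azadesqldb;User ID=sqladmin;Password=Sql@Server@1234;Encrypt=True;"
    $connection = New-Object System.Data.SqlClient.SqlConnection $connectionString
    $connection.Open()
    $command = $connection.CreateCommand()
    # A simple smoke-test query
    $command.CommandText = "SELECT @@VERSION;"
    $command.ExecuteScalar()
    $connection.Dispose()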

Provisioning and connecting to an Azure PostgreSQL database using the Azure CLI

Azure Database for PostgreSQL is a Database-as-a-Service offering for the PostgreSQL database. In this recipe, we'll learn how to provision an Azure database for PostgreSQL and connect to it.

Getting ready

We'll be using the Azure CLI for this recipe. Open a new Command Prompt or PowerShell window, and run az login to log in to the Azure CLI.

How to do it…

Let's begin with provisioning a new Azure PostgreSQL server.

Provisioning a new Azure PostgreSQL server

The steps are as follows:

  1. Execute the following Azure CLI command to create a new resource group:
    az group create --name rgpgressql --location eastus
  2. Execute the following command to create an Azure server for PostgreSQL:
    az postgres server create --resource-group rgpgressql --name adepgresqlserver  --location eastus --admin-user pgadmin --admin-password postgre@SQL@1234 --sku-name B_Gen5_1

    Note

    It may take 10–15 minutes for the server to be created.

  3. Execute the following command to whitelist the IP in the PostgreSQL server firewall:
    $clientip = (Invoke-RestMethod -Uri https://ipinfo.io/json).ip
    az postgres server firewall-rule create --resource-group rgpgressql --server adepgresqlserver --name hostip --start-ip-address $clientip --end-ip-address $clientip

Connecting to an Azure PostgreSQL server

We can connect to an Azure PostgreSQL server using psql or pgadmin (a GUI tool for PostgreSQL management), or from any programming language using a relevant driver.

To connect from psql, execute the following command in a Command Prompt or PowerShell window:

PS C:\Program Files\PostgreSQL\12\bin> .\psql.exe --host=adepgresqlserver.postgres.database.azure.com --port=5432 --username=pgadmin@adepgresqlserver --dbname=postgres

Provide the password and you'll be connected. You should get an output similar to the one shown in the following screenshot:

Figure 2.5 – Connecting to PostgreSQL

How it works…

To provision a new Azure PostgreSQL server, we execute the az postgres server create Azure CLI command. We need to specify the server name, resource group, administrator username and password, location, and SKU name parameters. As of now, there are three different SKU families:

  • B_Gen5_1 is the smallest basic SKU (the basic tier offers up to 2 vCores).
  • GP_Gen5_32 is a general-purpose SKU (the tier offers up to 64 vCores).
  • MO_Gen5_2 is the smallest memory-optimized SKU (the tier offers up to 32 memory-optimized vCores).

To connect to the PostgreSQL server, we first need to whitelist the IP in the server firewall. To do that, we run the az postgres server firewall-rule create Azure CLI command.

We need to provide the firewall rule name, server name, resource group, and start and end IP.

Once the firewall rule is created, the PostgreSQL server can be accessed by any of the utilities, such as psql or pgadmin, or from a programming language. To connect to the server, provide the host or server name as <postgresql server name>.postgres.database.azure.com and the port as 5432. We also need to provide the username in the <username>@<server name> format, along with the password. If you are connecting for the first time, provide the database name as postgres.
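
Before connecting, we can confirm the fully qualified host name from the Azure CLI. A minimal sketch, assuming the server created in this recipe:

    # Fetch the fully qualified domain name of the PostgreSQL server
    az postgres server show --resource-group rgpgressql --name adepgresqlserver --query fullyQualifiedDomainName --output tsv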

Provisioning and connecting to an Azure MySQL database using the Azure CLI

Azure Database for MySQL is a Database-as-a-Service offering for the MySQL database. In this recipe, we'll learn how to provision an Azure database for MySQL and connect to it.

Getting ready

We'll be using the Azure CLI for this recipe. Open a new Command Prompt or PowerShell window, and run az login to log in to the Azure CLI.

How to do it…

Let's see how to provision the Azure MySQL server.

Provisioning the Azure MySQL server

The steps are as follows:

  1. Execute the following command to create a new resource group:
    az group create --name rgmysql --location eastus
  2. Execute the following command to provision a new Azure MySQL server:
    az mysql server create --resource-group rgmysql --name ademysqlserver  --location eastus --admin-user dbadmin --admin-password mySQL@1234 --sku-name B_Gen5_1

    You should get an output as shown in the following screenshot:

Figure 2.6 – Creating an Azure MySQL server

Connecting to Azure MySQL Server

The steps are as follows:

  1. Execute the following command to whitelist your public IP in the Azure MySQL Server firewall:
    $clientip = (Invoke-RestMethod -Uri https://ipinfo.io/json).ip
    az mysql server firewall-rule create --resource-group rgmysql --server ademysqlserver --name clientIP --start-ip-address $clientip --end-ip-address $clientip

    You should get an output as shown in the following screenshot:

    Figure 2.7 – Creating a firewall rule for the Azure MySQL Server

  2. We can connect to the Azure MySQL server using the MySQL shell or the MySQL workbench, or from any programming language. To connect from the MySQL shell, execute the following command in a PowerShell window:
    .\mysqlsh.exe -h ademysqlserver.mysql.database.azure.com -u dbadmin@ademysqlserver -p

    Here's the output:

Figure 2.8 – Connecting to the Azure MySQL server

How it works…

To provision a new Azure MySQL server, we execute the az mysql server create Azure CLI command. We need to specify the server name, resource group, administrator username and password, location, and SKU name parameters. As of now, there are three different SKU families:

  • B_Gen5_1 is the smallest basic SKU (the basic tier offers up to 2 vCores).
  • GP_Gen5_32 is a general-purpose SKU (the tier offers up to 64 vCores).
  • MO_Gen5_2 is the smallest memory-optimized SKU (the tier offers up to 32 memory-optimized vCores).

To connect to the MySQL server, we first need to whitelist the IP in the server firewall. To do that, we run the az mysql server firewall-rule create Azure CLI command.

We need to provide the firewall rule name, server name, resource group, and start and end IPs.

Once the firewall rule is created, the MySQL server can be accessed by any of the utilities, such as the MySQL command line or the MySQL workbench, or from a programming language. To connect to the server, provide the host or server name as <mysql server name>.mysql.database.azure.com. We also need to provide the username in the <username>@<server name> format, along with the password.
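
As with PostgreSQL, we can confirm the host name and the firewall rules from the Azure CLI before connecting. A minimal sketch, assuming the server created in this recipe:

    # Fetch the fully qualified domain name of the MySQL server
    az mysql server show --resource-group rgmysql --name ademysqlserver --query fullyQualifiedDomainName --output tsv
    # List the firewall rules configured on the server
    az mysql server firewall-rule list --resource-group rgmysql --server-name ademysqlserver --output table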

Implementing active geo-replication for an Azure SQL database using PowerShell

The active geo-replication feature allows you to create up to four readable secondaries of a primary Azure SQL database. Active geo-replication uses SQL Server AlwaysOn technology to asynchronously replicate transactions to the secondary databases. A secondary database can be in the same region as the primary database or in a different one.

Active geo-replication can be used for the following cases:

  • To provide business continuity by failing over to the secondary database in case of a disaster. The failover is manual.
  • To offload reads to the readable secondary.
  • To migrate a database to a different server in another region.

In this recipe, we'll configure active geo-replication for an Azure SQL database and perform a manual failover.

Getting ready

In a new PowerShell window, execute the Connect-AzAccount command and follow the steps to log in to your Azure account.

You need an existing Azure SQL database for this recipe. If you don't have one, create an Azure SQL database by following the steps mentioned in the Provisioning and connecting to an Azure SQL database using PowerShell recipe.

How to do it…

First, let's create a readable secondary.

Creating a readable secondary

The steps are as follows:

  1. Execute the following command to provision a new Azure SQL server to host the secondary replica:
    #create credential object for the Azure SQL Server admin credential
    $sqladminpassword = ConvertTo-SecureString 'Sql@Server@1234' -AsPlainText -Force
    $sqladmincredential = New-Object System.Management.Automation.PSCredential ('sqladmin', $sqladminpassword)
    New-AzSQLServer -ServerName azadesqlsecondary -SqlAdministratorCredentials $sqladmincredential -Location westus -ResourceGroupName packtade
  2. Execute the following command to configure the geo-replication from the primary server to the secondary server:
    $primarydb = Get-AzSqlDatabase -DatabaseName azadesqldb -ServerName azadesqlserver -ResourceGroupName packtade
    $primarydb | New-AzSqlDatabaseSecondary -PartnerResourceGroupName packtade -PartnerServerName azadesqlsecondary -AllowConnections "All"

    You should get an output as shown in the following screenshot:

Figure 2.9 – Configuring geo-replication

We can also verify this in the Azure portal, as shown in the following screenshot:

Figure 2.10 – Verifying geo-replication in the Azure portal

Performing manual failover to the secondary

The steps are as follows:

  1. Execute the following command to manually fail over to the secondary database:
    $secondarydb = Get-AzSqlDatabase -DatabaseName azadesqldb -ServerName azadesqlsecondary -ResourceGroupName packtade
    $secondarydb | Set-AzSqlDatabaseSecondary -PartnerResourceGroupName packtade -Failover

    The preceding command performs a planned failover without data loss. To perform a manual failover with data loss, use the -AllowDataLoss switch.

    If we check the Azure portal, we'll see that azadesqlsecondary/azadesqldb in West US is the primary database:

    Figure 2.11 – Failing over to the secondary server

  2. We can also get the active geo-replication information by executing the following command:
    Get-AzSqlDatabaseReplicationLink -DatabaseName azadesqldb -PartnerResourceGroupName packtade -PartnerServerName azadesqlsecondary -ServerName azadesqlserver -ResourceGroupName packtade

    You should get an output as shown in the following screenshot:

Figure 2.12 – Getting the geo-replication status

Removing active geo-replication

Execute the following command to remove the active geo-replication link between the primary and the secondary databases:

$primarydb = Get-AzSqlDatabase -DatabaseName azadesqldb -ServerName azadesqlserver -ResourceGroupName packtade
$primarydb | Remove-AzSqlDatabaseSecondary -PartnerResourceGroupName packtade -PartnerServerName azadesqlsecondary

You should get an output as shown in the following screenshot:

Figure 2.13 – Removing active geo-replication

How it works…

To configure active geo-replication, we use the New-AzSqlDatabaseSecondary command. This command expects the primary database name, server name, and resource group name, along with the partner (secondary) resource group name, the partner server name, and the -AllowConnections parameter. If we want a readable secondary, we set -AllowConnections to All; otherwise, we set it to No.

Active geo-replication supports manual failover with or without data loss. To perform a manual failover, we use the Set-AzSqlDatabaseSecondary command. This command expects the secondary database name, server name, and resource group name, the -Failover switch, and additionally the -AllowDataLoss switch for a failover with data loss.

To remove active geo-replication, we use the Remove-AzSqlDatabaseSecondary command. This command is run against the primary database and expects the partner (secondary) server name and resource group name in order to remove the replication link between the primary and the secondary database.

Removing active geo-replication doesn't remove the secondary database.
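
If the former secondary database is no longer needed after the link is removed, it has to be dropped explicitly. A minimal cleanup sketch, assuming the names used in this recipe:

    # Drop the now-independent copy on the secondary server
    Remove-AzSqlDatabase -ResourceGroupName packtade -ServerName azadesqlsecondary -DatabaseName azadesqldb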

Implementing an auto-failover group for an Azure SQL database using PowerShell

An auto-failover group allows a group of databases to fail over to a secondary server in another region if the SQL Database service in the primary region fails. Unlike active geo-replication, the secondary server must be in a different region from the primary. The secondary databases can be used to offload read workloads.

The failover can be manual or automatic.

In this recipe, we'll create an auto-failover group, add databases to the auto-failover group, and perform a manual failover to the secondary server.

Getting ready

In a new PowerShell window, execute the Connect-AzAccount command and follow the steps to log in to your Azure account.

You will need an existing Azure SQL database for this recipe. If you don't have one, create an Azure SQL database by following the steps mentioned in the Provisioning and connecting to an Azure SQL database using PowerShell recipe.

How to do it…

First, let's create an auto-failover group.

Creating an auto-failover group

The steps are as follows:

  1. Execute the following PowerShell command to create a secondary server. The server should be in a different region than the primary server:
    $sqladminpassword = ConvertTo-SecureString 'Sql@Server@1234' -AsPlainText -Force
    $sqladmincredential = New-Object System.Management.Automation.PSCredential ('sqladmin', $sqladminpassword)
    New-AzSQLServer -ServerName azadesqlsecondary -SqlAdministratorCredentials $sqladmincredential -Location westus -ResourceGroupName packtade
  2. Execute the following command to create the auto-failover group:
    New-AzSqlDatabaseFailoverGroup -ServerName azadesqlserver -FailoverGroupName adefg -PartnerResourceGroupName packtade -PartnerServerName azadesqlsecondary -FailoverPolicy Automatic -ResourceGroupName packtade

    You should get an output as shown in the following screenshot:

    Figure 2.14 – Creating an auto-failover group

  3. Execute the following command to add an existing database to the auto-failover group:
    $db = Get-AzSqlDatabase -DatabaseName azadesqldb -ServerName azadesqlserver -ResourceGroupName packtade
    $db | Add-AzSqlDatabaseToFailoverGroup -FailoverGroupName adefg
  4. Execute the following command to add a new Azure SQL database to the auto-failover group:
    $db = New-AzSqlDatabase -DatabaseName azadesqldb2 -Edition basic -ServerName azadesqlserver -ResourceGroupName packtade
    $db | Add-AzSqlDatabaseToFailoverGroup -FailoverGroupName adefg
  5. Execute the following PowerShell command to get the details about the auto-failover group:
    Get-AzSqlDatabaseFailoverGroup -ServerName azadesqlserver -FailoverGroupName adefg -ResourceGroupName packtade

    You should get an output as shown in the following screenshot:

    Figure 2.15 – Getting the auto-failover group details

    The endpoint used to connect to the primary server of an auto-failover group is of the form <auto-failover group name>.database.windows.net. In our case, this is adefg.database.windows.net.

    To connect to a readable secondary in an auto-failover group, the endpoint is of the form <auto-failover group name>.secondary.database.windows.net. In our case, this is adefg.secondary.database.windows.net. In addition, we need to specify ApplicationIntent as ReadOnly in the connection string when connecting to the readable secondary (a SQLCMD sketch follows this list).

  6. In the Azure portal, the failover groups can be found on the Azure SQL server page, as shown in the following screenshot:
    Figure 2.16 – Viewing an auto-failover group in the Azure portal

  7. To open the failover group details, click the failover group name, adefg:
Figure 2.17 – Viewing the auto-failover group details in the Azure portal
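
As mentioned in step 5, connections to the readable secondary go through the secondary endpoint with a read-only application intent. A minimal SQLCMD sketch using its -K switch, assuming the failover group and credentials from this recipe:

    sqlcmd -S "adefg.secondary.database.windows.net" -U sqladmin -P "Sql@Server@1234" -d azadesqldb -K ReadOnly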

Performing a manual failover to the secondary server

The steps are as follows:

  1. Execute the following command to manually fail over to the secondary server:
    $secondarysqlserver = Get-AzSqlServer -ResourceGroupName packtade -ServerName azadesqlsecondary
    $secondarysqlserver | Switch-AzSqlDatabaseFailoverGroup -FailoverGroupName adefg

    If we check in the Azure portal, the primary server is now azadesqlsecondary and the secondary server is azadesqlserver, as shown in the following screenshot:

    Figure 2.18 – Manual failover to the secondary server

  2. Execute the following command to remove the auto-failover group. Removing the auto-failover group doesn't remove the secondary or primary SQL databases:
    Remove-AzSqlDatabaseFailoverGroup -ServerName azadesqlsecondary -FailoverGroupName adefg -ResourceGroupName packtade

    You should get an output as shown in the following screenshot:

Figure 2.19 – Removing the auto-failover group

How it works…

The New-AzSqlDatabaseFailoverGroup command is used to create an auto-failover group. We need to specify the auto-failover group name, the primary and secondary server names, the resource group name, and the failover policy (automatic/manual). In addition, we can specify GracePeriodWithDataLossHours. As the replication between the primary and the secondary is asynchronous, a failover may result in data loss. The GracePeriodWithDataLossHours value specifies how many hours the system should wait before initiating an automatic failover, thereby limiting the data loss a failover can cause.
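
For example, the grace period can be set when the failover group is created. A minimal sketch, assuming the servers from this recipe and an illustrative two-hour grace period:

    # Create the failover group, waiting 2 hours before an automatic
    # failover that may lose data is initiated
    New-AzSqlDatabaseFailoverGroup -ServerName azadesqlserver -FailoverGroupName adefg -PartnerResourceGroupName packtade -PartnerServerName azadesqlsecondary -FailoverPolicy Automatic -GracePeriodWithDataLossHours 2 -ResourceGroupName packtade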

After the auto-failover group creation, we can add the databases to the auto-failover group by using the Add-AzSqlDatabaseToFailoverGroup command. The database to be added should exist on the primary server and not on the secondary server.

We can perform a manual failover by executing the Switch-AzSqlDatabaseFailoverGroup command. We need to provide the primary server name, the auto-failover group name, and the primary server resource group name.

To remove an auto-failover group, execute the Remove-AzSqlDatabaseFailoverGroup command by specifying the primary server name and resource group and the auto-failover group name.

Implementing vertical scaling for an Azure SQL database using PowerShell

Azure SQL Database has multiple purchasing models and service tiers for different workloads. There are two purchasing models, DTU-based and vCore-based, and multiple service tiers within each.

Having multiple service tiers gives the flexibility to scale up or scale down based on the workload or activity in an Azure SQL database.

In this recipe, we'll learn how to automatically scale up an Azure SQL database whenever the CPU percentage is above 40%.

Getting ready

In a new PowerShell window, execute the Connect-AzAccount command and follow the steps to log in to your Azure account.

You will need an existing Azure SQL database for this recipe. If you don't have one, create an Azure SQL database by following the steps mentioned in the Provisioning and connecting to an Azure SQL database using PowerShell recipe.

How to do it…

The steps for this recipe are as follows:

  1. Execute the following PowerShell command to create an Azure Automation account:
    #Create an Azure automation account
    $automation = New-AzAutomationAccount -ResourceGroupName packtade -Name adeautomate -Location centralus -Plan Basic
  2. Execute the following command to create an Automation runbook of the PowerShell workflow type:
    #Create a new automation runbook of type PowerShell workflow
    $runbook = New-AzAutomationRunbook -Name rnscalesql -Description "Scale up sql azure when CPU is 40%" -Type PowerShellWorkflow -ResourceGroupName packtade -AutomationAccountName $automation.AutomationAccountName
  3. Execute the following command to create Automation credentials. The credentials are passed as a parameter to the runbook and are used to connect to the Azure SQL database from the runbook:
    #Create automation credentials.
    $sqladminpassword = ConvertTo-SecureString 'Sql@Server@1234' -AsPlainText -Force
    $sqladmincredential = New-Object System.Management.Automation.PSCredential ('sqladmin', $sqladminpassword)
    $creds = New-AzAutomationCredential -Name sqlcred -Description "sql azure creds" -ResourceGroupName packtade -AutomationAccountName $automation.AutomationAccountName -Value $sqladmincredential
  4. The next step is to edit the runbook and PowerShell to modify the service tier of an Azure SQL database. To do that, open https://portal.azure.com and log in to your Azure account. Under All resources, search for and open the adeautomate automation account:
    Figure 2.20 – Opening the Azure Automation account

  5. On the Azure Automation page, locate and select Runbooks:
    Figure 2.21 – Opening the runbook in Azure Automation

  6. Select the rnscalesql runbook to open the runbook page. On the runbook page, select Edit:
    Figure 2.22 – Editing the runbook in your Azure Automation account

  7. On the Edit PowerShell Workflow Runbook page, copy and paste the following PowerShell code onto the canvas:
    workflow rnscalesql
    { 
        param 
        ( 
            # Name of the Azure SQL Database server (Ex: bzb98er9bp) 
            [parameter(Mandatory=$true)]  
            [string] $SqlServerName, 
            # Target Azure SQL Database name  
            [parameter(Mandatory=$true)]  
            [string] $DatabaseName,  
            # When using in the Azure Automation UI, please enter the name of the credential asset for the "Credential" parameter 
            [parameter(Mandatory=$true)]  
            [PSCredential] $Credential 
        ) 
        inlinescript 
        { 
            $ServerName = $Using:SqlServerName + ".database.windows.net"
            $db = $Using:DatabaseName
            $UserId = $Using:Credential.UserName 
            $Password = ($Using:Credential).GetNetworkCredential().Password 
            $MasterDatabaseConnection = New-Object System.Data.SqlClient.SqlConnection 
            $MasterDatabaseConnection.ConnectionString = "Server = $ServerName; Database = Master; User ID = $UserId; Password = $Password;" 
            $MasterDatabaseConnection.Open(); 
            $MasterDatabaseCommand = New-Object System.Data.SqlClient.SqlCommand 
            $MasterDatabaseCommand.Connection = $MasterDatabaseConnection 
            $MasterDatabaseCommand.CommandText =  
                " 
                    ALTER DATABASE $db MODIFY (EDITION = 'Standard', SERVICE_OBJECTIVE = 'S0');
                    
                "        
        $MasterDbResult = $MasterDatabaseCommand.ExecuteNonQuery();
        # Close the connection once the scaling command has run
        $MasterDatabaseConnection.Close();
    } 
    }

    The preceding code modifies the service tier of the given Azure SQL database to Standard S0.

  8. Click Save, and then click Publish to publish the runbook:
    Figure 2.23 – Saving and publishing the runbook

  9. The next step is to create a webhook to trigger the runbook. Execute the following command to create the webhook:
    # define the runbook parameters
    $Params = @{"SQLSERVERNAME"="azadesqlserver";"DATABASENAME"="azadesqldb";"CREDENTIAL"="sqlcred"}
    # Create a webhook
    $expiry = (Get-Date).AddDays(1)  
    New-AzAutomationWebhook -Name whscaleazure -RunbookName $runbook.Name -Parameters $Params -ResourceGroupName packtade -AutomationAccountName $automation.AutomationAccountName -IsEnabled $true -ExpiryTime $expiry

    Note

    When defining $Params, change the default values shown here if your Azure SQL server, database, or credential asset names differ.

    You should get an output as shown in the following screenshot:

    Figure 2.24 – Creating a webhook

    Copy and save the WebhookURI value for later use (a sketch for test-firing the webhook appears after these steps).

  10. The next step is to create an alert for the Azure SQL database that, when triggered, will call the webhook URI. Execute the following command to create an alert action group receiver:
    #Create action group receiver
    $whr = New-AzActionGroupReceiver -Name agrscalesql -WebhookReceiver -ServiceUri "https://s25events.azure-automation.net/webhooks?token=NfL30nj%2fkuSo8TTT7CqDwRIWEdeXR1lklkK%2fzgELCiY%3d"

    Note

    Replace the value of the ServiceUri parameter with your webhook URI from the previous step.

  11. Execute the following command to create an action group with the action receiver defined in the preceding step:
    #Create a new action group.
    $ag = Set-AzActionGroup -ResourceGroupName packtade -Name ScaleSQLAzure -ShortName scaleazure -Receiver $whr
  12. Execute the following command to create an alert condition to trigger the alert:
    #define the alert trigger condition
    $condition = New-AzMetricAlertRuleV2Criteria  -MetricName "cpu_percent" -TimeAggregation maximum -Operator greaterthan -Threshold 40 -MetricNamespace "Microsoft.Sql/servers/databases"

    The condition defines that the alert should trigger when the metric CPU percentage is greater than 40%.

  13. Execute the following command to create an alert on the Azure SQL database:
    #Create the alert with the condition and action defined in previous steps.
    $rid = (Get-AzSqlDatabase -ServerName azadesqlserver -ResourceGroupName packtade -DatabaseName azadesqldb).Resourceid
    Add-AzMetricAlertRuleV2 -Name monitorcpu -ResourceGroupName packtade -WindowSize 00:01:00 -Frequency 00:01:00 -TargetResourceId $rid -Condition $condition  -Severity 1 -ActionGroupId $ag.id

    You should get an output as shown in the following screenshot:

    Figure 2.25 – Creating the alert

    The preceding command creates an Azure SQL database alert. The alert is triggered when the cpu_percent metric is greater than 40% for more than 1 minute. When the alert is triggered, as defined in the action group, the webhook is called. The webhook in turn runs the runbook. The runbook modifies the service tier of the database to Standard S0.

  14. To see the alert in action, connect to the Azure SQL database and execute the following query to simulate high CPU usage:
    --query to simulate high CPU usage
    While(1=1)
    Begin
    Select cast(a.object_id as nvarchar(max)) from sys.objects a, sys.objects b,sys.objects c, sys.objects d
    End

    As soon as the alert condition is triggered, the webhook is called and the database service tier is modified to Standard S0.
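
As noted in step 9, the saved WebhookURI can also be used to test-fire the runbook directly, without waiting for the alert. A minimal sketch; the URI value is a placeholder for the one returned in your output:

    # POST to the webhook URI to start the runbook manually; the runbook
    # parameters were bound when the webhook was created
    $webhookUri = "<your WebhookURI from step 9>"
    Invoke-RestMethod -Method Post -Uri $webhookUri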

How it works…

To configure automatic scaling for an Azure SQL database, we create an Azure Automation runbook. The runbook specifies the PowerShell code to modify the service tier of an Azure SQL database.

We create a webhook to trigger the runbook. We create an Azure SQL database alert and define the alert condition to trigger when the cpu_percent metric is greater than 40% for at least 1 minute. In the alert action, we call the webhook defined earlier.

When the alert condition is reached, the webhook is called, which in turn executes the runbook, resulting in the Azure SQL database service tier change.
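
To confirm that the scale-up happened, we can inspect the database's current service objective. A minimal sketch, assuming the database from this recipe:

    # Check the current edition and service objective of the database
    Get-AzSqlDatabase -ServerName azadesqlserver -DatabaseName azadesqldb -ResourceGroupName packtade | Select-Object DatabaseName, Edition, CurrentServiceObjectiveName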

Implementing an Azure SQL database elastic pool using PowerShell

An elastic pool is a cost-effective mechanism for grouping single Azure SQL databases that have varying peak usage times. For example, consider 20 different SQL databases with varying usage patterns, each a Standard S3 requiring 100 database throughput units (DTUs) to run. Individually, we would pay for 100 DTUs for each of the 20 databases separately. However, if we group them all into a Standard elastic pool, we pay the elastic pool price only, and not a separate price for each SQL database.

In this recipe, we'll create an elastic pool of multiple single Azure databases.

Getting ready

In a new PowerShell window, execute the Connect-AzAccount command and follow the steps to log in to your Azure account.

How to do it…

The steps for this recipe are as follows:

  1. Execute the following command to create an Azure SQL server:
    #create credential object for the Azure SQL Server admin credential
    $sqladminpassword = ConvertTo-SecureString 'Sql@Server@1234' -AsPlainText -Force
    $sqladmincredential = New-Object System.Management.Automation.PSCredential ('sqladmin', $sqladminpassword)
    # create the azure sql server
    New-AzSqlServer -ServerName azadesqlserver -SqlAdministratorCredentials $sqladmincredential -Location "central us" -ResourceGroupName packtade

    Next, execute the following command to create an elastic pool:
    #Create an elastic pool
    New-AzSqlElasticPool -ElasticPoolName adepool -ServerName azadesqlserver -Edition standard -Dtu 100 -DatabaseDtuMin 20 -DatabaseDtuMax 100 -ResourceGroupName packtade

    You should get an output as shown in the following screenshot:

    Figure 2.26 – Creating a new Azure elastic pool

  2. Execute the following command to create and add an Azure SQL database to an elastic pool:
    #Create a new database in elastic pool
    New-AzSqlDatabase -DatabaseName azadedb1 -ElasticPoolName adepool -ServerName azadesqlserver -ResourceGroupName packtade

    You should get an output as shown in the following screenshot:

    Figure 2.27 – Creating a new SQL database in an elastic pool

  3. Execute the following command to create a new Azure SQL database outside of the elastic pool:
    #Create a new database outside of an elastic pool
    New-AzSqlDatabase -DatabaseName azadedb2 -Edition Standard -RequestedServiceObjectiveName S3 -ServerName azadesqlserver -ResourceGroupName packtade

    You should get an output as shown in the following screenshot:

    Figure 2.28 – Creating a new SQL database

  4. Execute the following command to add the azadedb2 database created in the preceding step to the elastic pool:
    #Add an existing database to the elastic pool
    $db = Get-AzSqlDatabase -DatabaseName azadedb2 -ServerName azadesqlserver -ResourceGroupName packtade
    $db | Set-AzSqlDatabase -ElasticPoolName adepool

    You should get an output as shown in the following screenshot:

    Figure 2.29 – Adding an existing SQL database to an elastic pool

  5. To verify this in the Azure portal, log in with your Azure account. Navigate to All resources | azadesqlserver | SQL elastic pools | Configure:
    Figure 2.30 – Viewing the elastic pool in the Azure portal

  6. Execute the following command to remove an Azure SQL database from an elastic pool. To move a database out of an elastic pool, we need to set the edition and the service objective explicitly:
    #remove a database from an elastic pool
    $db = Get-AzSqlDatabase -DatabaseName azadedb2 -ServerName azadesqlserver -ResourceGroupName packtade
    $db | Set-AzSqlDatabase -Edition Standard -RequestedServiceObjectiveName S3

    You should get an output as shown in the following screenshot:

    Figure 2.31 – Removing a SQL database from an elastic pool

  7. Execute the following commands to remove an elastic pool. An elastic pool must be empty before it can be removed, so we first move all of the databases out of the pool:
    # get elastic pool object 
    $epool = Get-AzSqlElasticPool -ElasticPoolName adepool -ServerName azadesqlserver -ResourceGroupName packtade
    # get all databases in an elastic pool
    $epdbs = $epool | Get-AzSqlElasticPoolDatabase
    # change the edition of all databases in an elastic pool to standard S3
    foreach($db in $epdbs) {
    $db | Set-AzSqlDatabase -Edition Standard -RequestedServiceObjectiveName S3
    }
    # Remove an elastic pool 
    $epool | Remove-AzSqlElasticPool

    Note

    The commands set the edition of the SQL databases to Standard. This is for demo purposes only. In production, choose the edition and service objective appropriate to your workload.

How it works…

We create an elastic pool using the New-AzSqlElasticPool command. In addition to parameters such as the server name, resource group name, compute model, compute generation, and edition, which are the same as when creating a new Azure SQL database, we can specify DatabaseDtuMin and DatabaseDtuMax. DatabaseDtuMin specifies the minimum DTUs guaranteed to every database in the elastic pool. DatabaseDtuMax is the maximum DTUs a single database can consume in the elastic pool.

Similarly, for the vCore-based purchasing model, we can specify DatabaseVCoreMin and DatabaseVCoreMax.
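
For example, a vCore-based pool can be created as follows. This is a sketch only; the pool name (adevcorepool) is illustrative, and the tier values should be adjusted to your workload:

    # Create a General Purpose Gen5 pool with 4 vCores, where each database
    # can use between 0 and 2 vCores
    New-AzSqlElasticPool -ElasticPoolName adevcorepool -ServerName azadesqlserver -ResourceGroupName packtade -Edition GeneralPurpose -ComputeGeneration Gen5 -VCore 4 -DatabaseVCoreMin 0 -DatabaseVCoreMax 2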

To add a new database to an elastic pool, specify the elastic pool name at the time of database creation using the New-AzSqlDatabase command.

To add an existing database to an elastic pool, modify the database using Set-AzSqlDatabase to specify the elastic pool name.

To remove a database from an elastic pool, modify the database using the Set-AzSqlDatabase command to specify a database edition explicitly.

To remove an elastic pool, first empty it by moving out all of the databases from the elastic pool, and then remove it using the Remove-AzSqlElasticPool command.

Monitoring an Azure SQL database using the Azure portal

Azure SQL Database has built-in monitoring features, such as Query Performance Insight, a performance overview, and diagnostic logging. In this recipe, we'll learn how to use the monitoring capabilities using the Azure portal.

Getting ready

We'll use PowerShell to create an Azure SQL database, so open a PowerShell window and log in to your Azure account by executing the Connect-AzAccount command.

We'll use the Azure portal to monitor the Azure SQL database. Open https://portal.azure.com and log in to your Azure account.

How to do it…

First, let's execute a sample workload.

Creating an Azure SQL database and executing a sample workload

The steps are as follows:

  1. Execute the following PowerShell command to create an Azure SQL database with the AdventureWorksLT sample database:
    # create the resource group
    New-AzResourceGroup -Name packtade -Location "central us" -force
    #create credential object for the Azure SQL Server admin credential
    $sqladminpassword = ConvertTo-SecureString 'Sql@Server@1234' -AsPlainText -Force
    $sqladmincredential = New-Object System.Management.Automation.PSCredential ('sqladmin', $sqladminpassword)
    # create the azure sql server
    New-AzSqlServer -ServerName azadesqlserver -SqlAdministratorCredentials $sqladmincredential -Location "central us" -ResourceGroupName packtade
    #Create the SQL Database
    New-AzSqlDatabase -DatabaseName adeawlt -Edition basic -ServerName azadesqlserver -ResourceGroupName packtade -SampleName AdventureWorksLT
  2. Execute the following command to add the client IP to the Azure SQL Server firewall:
    $clientip = (Invoke-RestMethod -Uri https://ipinfo.io/json).ip
    New-AzSqlServerFirewallRule -FirewallRuleName "home" -StartIpAddress $clientip -EndIpAddress $clientip -ServerName azadesqlserver -ResourceGroupName packtade
  3. Execute the following command to run a workload against the Azure SQL database:
    sqlcmd -S azadesqlserver.database.windows.net -d adeawlt -U sqladmin -P Sql@Server@1234 -i "C:\ADECookbook\Chapter02\workload.sql" > "C:\ADECookbook\Chapter02\workload_output.txt"

    It can take 4–5 minutes for the workload to complete. You can execute the preceding command multiple times; however, you should run it at least once.

Monitoring Azure SQL database metrics

The steps are as follows:

  1. In the Azure portal, navigate to All resources | azadesqlserver | the adeawlt database. Search for and open Metrics:
    Figure 2.32 – Opening the Metrics section in the Azure portal

    The Metrics page allows you to monitor different available metrics over time.

  2. To select metrics, click Add metric | CPU percentage | Data IO percentage:
    Figure 2.33 – Monitoring metrics for a SQL database

    We can select the metrics we are interested in monitoring and use the Pin to dashboard feature to pin the chart to the portal dashboard. We can also create an alert rule from the metrics page by clicking New alert rule. We can select a time range to drill down to specific times or investigate spikes in the chart (the same metrics can also be pulled programmatically, as sketched after this list).

  3. To select a time range, select the Time range dropdown in the top-right corner of the Metrics page and select the desired time range:
Figure 2.34 – Selecting a time range to monitor
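
The same metrics can be retrieved programmatically with the Az.Monitor module. A minimal sketch, assuming the database from this recipe:

    # Get the CPU percentage metric for the last hour at 5-minute granularity
    $rid = (Get-AzSqlDatabase -ServerName azadesqlserver -DatabaseName adeawlt -ResourceGroupName packtade).ResourceId
    Get-AzMetric -ResourceId $rid -MetricName "cpu_percent" -TimeGrain 00:05:00 -StartTime (Get-Date).AddHours(-1) -EndTime (Get-Date)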

Using Query Performance Insight to find resource-consuming queries

Query Performance Insight is an intelligent performance feature that allows us to find resource-consuming and long-running queries. The steps are as follows:

  1. In the Azure portal, navigate to All resources | azadesqlserver | the adeawlt database. Find and open Query Performance Insight:
    Figure 2.35 – Selecting Query Performance Insight for the SQL database

  2. On the Query Performance Insight page, observe that there are three tabs: Resource Consuming Queries, Long Running Queries, and Custom. We can select resource-consuming queries by CPU, Data IO, and Log IO:
    Figure 2.36 – Monitoring queries for the SQL database

    Resource Consuming Queries lists the top three queries by CPU consumption. We can also select the top three queries by Data IO and Log IO. The bottom of the page lists the color-coded queries.

  3. To get the query text, click on the color-coded box:
    Figure 2.37 – Viewing the query details

    We can look at the query text and optimize it for better performance.

  4. The Custom tab allows us to select resource-consuming queries by duration and execution count. We can also specify a custom time range, the number of queries, and the query and metric aggregation:
    Figure 2.38 – Providing custom monitoring configuration

  5. Select the options and click the Go button to refresh the chart as per the custom settings. The Long Running Queries tab lists the top three queries by duration:
Figure 2.39 – Viewing the long-running queries list

We can further look into the query text and other details by selecting the query ID.

Monitoring an Azure SQL database using diagnostic settings

In addition to metrics and Query Performance Insight, we can also monitor an Azure SQL database by collecting diagnostic logs. The diagnostic logs can be sent to a Log Analytics workspace or Azure Storage, or streamed to Azure Event Hubs. The steps are as follows:

  1. To enable diagnostic logging using the Azure portal, navigate to All resources | azadesqlserver | adeawlt. Find and open Diagnostic settings:
    Figure 2.40 – Diagnostic settings

  2. Click on Add diagnostic setting to add a new diagnostic setting.
  3. Select the categories to be included in the logs and their destination:
    Figure 2.41 – Selecting categories

  4. Click Save to create the new diagnostic setting. The diagnostic logs can then be analyzed in the Log Analytics workspace. (A PowerShell alternative is sketched after these steps.)

    Note

    Diagnostic settings add additional cost to the Azure SQL database. It may also take some time for logs to become available after a new diagnostic setting is created.
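
The same diagnostic setting can be created from PowerShell with the Az.Monitor module. A minimal sketch; the Log Analytics workspace name (adeworkspace) is hypothetical and should point to an existing workspace in your subscription:

    # Resolve the database resource ID and an existing Log Analytics workspace
    $rid = (Get-AzSqlDatabase -ServerName azadesqlserver -DatabaseName adeawlt -ResourceGroupName packtade).ResourceId
    $ws = Get-AzOperationalInsightsWorkspace -ResourceGroupName packtade -Name adeworkspace
    # Send the selected log categories to the workspace
    Set-AzDiagnosticSetting -ResourceId $rid -Name adeawltdiag -Enabled $true -Category SQLInsights,Errors -WorkspaceId $ws.ResourceId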

Automatic tuning in an Azure SQL database

Automatic tuning provides three features: force plan, create index, and drop index. Automatic tuning can be enabled for an Azure SQL server, in which case it applies to all of the databases on that server, or for individual Azure SQL databases. The steps are as follows (a PowerShell alternative is sketched after them):

  1. To enable automatic tuning, in the Azure portal, navigate to All resources | azadesqlserver | adeawlt. Find and select Automatic tuning under Intelligent Performance:
    Figure 2.42 – Automatic tuning in the SQL database

  2. Enable the CREATE INDEX tuning option by clicking ON under the Desired state option.
  3. Click Apply to save the configuration.

    Note

    It may take time for recommendations to show up.

    The recommendations will show up under Performance recommendations in the Intelligent Performance section.
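
As a PowerShell alternative to the preceding portal steps, the per-database tuning advisors can be switched to auto-execute. A minimal sketch, assuming the adeawlt database and the CREATE INDEX advisor:

    # Enable automatic execution of CREATE INDEX recommendations
    Set-AzSqlDatabaseAdvisorAutoExecuteStatus -ResourceGroupName packtade -ServerName azadesqlserver -DatabaseName adeawlt -AdvisorName CreateIndex -AutoExecuteStatus Enabled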

Key benefits

  • Build highly efficient ETL pipelines using the Microsoft Azure Data services
  • Create and execute real-time processing solutions using Azure Databricks, Azure Stream Analytics, and Azure Data Explorer
  • Design and execute batch processing solutions using Azure Data Factory

Description

Data engineering is one of the fastest-growing job areas, as data engineers are the ones who ensure that data is extracted, provisioned, and of the highest quality for data analysis. This book uses various Azure services to implement and maintain infrastructure to extract data from multiple sources, and then transform and load it for data analysis. It takes you through different techniques for performing big data engineering using Microsoft Azure data services. It begins by showing you how Azure Blob storage can be used for storing large amounts of unstructured data and how to use it for orchestrating a data workflow. You'll then work with different Cosmos DB APIs and Azure SQL Database. Moving on, you'll discover how to provision an Azure Synapse database and find out how to ingest and analyze data in Azure Synapse. As you advance, you'll cover the design and implementation of batch processing solutions using Azure Data Factory, and understand how to manage, maintain, and secure Azure Data Factory pipelines. You'll also design and implement batch processing solutions using Azure Databricks and then manage and secure Azure Databricks clusters and jobs. In the concluding chapters, you'll learn how to process streaming data using Azure Stream Analytics and Data Explorer. By the end of this Azure book, you'll have gained the knowledge you need to orchestrate batch and real-time ETL workflows in Microsoft Azure.

Who is this book for?

This book is for data engineers, database administrators, database developers, and extract, transform, load (ETL) developers looking to build expertise in Azure data engineering using a recipe-based approach. Technical architects and database architects with experience in designing data or ETL applications, either on-premises or on another cloud, who want to learn Azure data engineering concepts will also find this book useful. Prior knowledge of Azure fundamentals and data engineering concepts is needed.

What you will learn

  • Use Azure Blob storage for storing large amounts of unstructured data
  • Perform CRUD operations on the Cosmos Table API
  • Implement elastic pools and business continuity with Azure SQL Database
  • Ingest and analyze data using Azure Synapse Analytics
  • Develop Data Factory data flows to extract data from multiple sources
  • Manage, maintain, and secure Azure Data Factory pipelines
  • Process streaming data using Azure Stream Analytics and Data Explorer

Product Details

Publication date : Apr 05, 2021
Length : 454 pages
Edition : 1st
Language : English
ISBN-13 : 9781800201545
Vendor : Microsoft





Table of Contents

10 Chapters
Chapter 1: Working with Azure Blob Storage
Chapter 2: Working with Relational Databases in Azure
Chapter 3: Analyzing Data with Azure Synapse Analytics
Chapter 4: Control Flow Activities in Azure Data Factory
Chapter 5: Control Flow Transformation and the Copy Data Activity in Azure Data Factory
Chapter 6: Data Flows in Azure Data Factory
Chapter 7: Azure Data Factory Integration Runtime
Chapter 8: Deploying Azure Data Factory Pipelines
Chapter 9: Batch and Streaming Data Processing with Azure Databricks
Other Books You May Enjoy

