Implementing a wrangling data flow
A wrangling data flow performs code-free data preparation at scale by integrating Power Query to prepare and transform data. The Power Query code is converted to Spark and executed on a Spark cluster.
In this recipe, we'll implement a wrangling data flow to read the orders.txt file, clean the data, calculate the total sales by country and customer name, and insert the data into an Azure SQL Database table.
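To make the target of the recipe concrete, the following is a minimal local sketch of the same clean-then-aggregate logic in plain Python. It is not what the data flow runs (that is Power Query converted to Spark), and the column names (Country, CustomerName, Quantity, UnitPrice), the pipe delimiter, and the sample rows are all assumptions made for illustration; the actual layout of orders.txt is not shown in this section.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data: the column names and the "|" delimiter are
# assumptions for illustration, not the real orders.txt schema.
SAMPLE = """Country|CustomerName|Quantity|UnitPrice
United Kingdom|Alice|2|5.00
United Kingdom|Alice|1|3.50
France|Bob|4|2.25
France|Bob||2.25
"""


def total_sales(text, delimiter="|"):
    """Return total sales keyed by (country, customer name)."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        # Cleaning step: trim stray whitespace from the grouping columns.
        country = row["Country"].strip()
        customer = row["CustomerName"].strip()
        try:
            amount = float(row["Quantity"]) * float(row["UnitPrice"])
        except ValueError:
            # Cleaning step: skip rows with empty or non-numeric values.
            continue
        # Aggregation step: sum the line amounts per (country, customer).
        totals[(country, customer)] += amount
    return dict(totals)


if __name__ == "__main__":
    for (country, customer), total in sorted(total_sales(SAMPLE).items()):
        print(country, customer, total)
```

The wrangling data flow expresses the same two stages, cleaning and grouping, through Power Query steps rather than hand-written code.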
Getting ready
To get started, do the following:
- Log in to https://portal.azure.com using your Azure credentials.
- Open a new PowerShell prompt. Execute the following command to log in to your Azure account from PowerShell:
Connect-AzAccount
- You will need an existing Data Factory account. If you don't have one, create one by executing the ~/azure-data-engineering-cookbook\Chapter04\3_CreatingAzureDataFactory.ps1 PowerShell script.
- Create an Azure storage account and upload the files to the ~/Chapter06/Data folder...