Transformation
Once your data is clean and ready for transformation, you can run the transformation logic using services such as Spark, SQL, and Hive. This example uses Azure Databricks (ADB) Spark, as per the certification syllabus. First, you need to create an ADB workspace, then the ADB cluster that will run the transformations, and finally write the Spark code within the Cmd blocks. Perform the following steps to do so:
- Select Azure Databricks from the Azure portal and click + Create to create a new workspace. Figure 5.8 shows the Create an Azure Databricks workspace screen.
- Enter the name for the workspace beside the Workspace name option. In this case, it is DP-203-databricks-workspace. Then select an option from the Region dropdown. In this case, it is UK South.
- Next, select an option from the Pricing Tier dropdown. In this case, it is Standard (Apache Spark, Secure with Microsoft Entra ID) (Figure 5.8):
Figure...