Performing Data Exploratory Analysis
Data exploration is much easier from inside Synapse Studio, which provides one-click options for examining data in various formats. This section covers some of the options available for data exploration using Spark, SQL, and ADF/Synapse pipelines.
Note
This section primarily focuses on the Perform data exploratory analysis concept of the DP-203: Data Engineering on Microsoft Azure exam.
Data Exploration Using Spark
Data exploration is a crucial step in the data analysis process, allowing you to identify patterns and correlations within data. Apache Spark, an open-source distributed processing system available in Azure, offers a powerful and scalable approach for exploring large datasets efficiently.
Perform the following steps to explore data using Spark:
- From within Synapse Studio, right-click on the data file and select the Load to DataFrame option, as shown in Figure 4.56:
Figure 4.56 –...
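The Load to DataFrame step above can also be approximated by hand in a notebook cell. The following is a minimal sketch, not the exact code that Synapse Studio generates: the helper names (abfss_path, load_to_dataframe, explore) and the container, account, and file names are hypothetical, and the Spark session is passed in as an argument on the assumption that the notebook already provides one named spark.

```python
# Minimal sketch of reading a data lake file into a Spark DataFrame.
# All helper names and storage names here are hypothetical placeholders.


def abfss_path(container: str, account: str, relative_path: str) -> str:
    """Build an ADLS Gen2 URI: abfss://<container>@<account>.dfs.core.windows.net/<path>."""
    return (
        f"abfss://{container}@{account}.dfs.core.windows.net/"
        f"{relative_path.lstrip('/')}"
    )


def load_to_dataframe(spark, path: str, fmt: str = "parquet"):
    """Read the file at `path` with the given Spark session and return a DataFrame."""
    # DataFrameReader.load accepts the path and the source format.
    return spark.read.load(path, format=fmt)


def explore(df, rows: int = 10):
    """Print the schema and the first `rows` rows, then return the total row count."""
    df.printSchema()
    df.show(rows)
    return df.count()


# Example use inside a notebook where `spark` already exists (assumption):
#   path = abfss_path("mycontainer", "mystorageaccount", "trips/2023.parquet")
#   df = load_to_dataframe(spark, path)
#   explore(df)
```

Passing the session in as a parameter keeps the helpers free of any hard dependency on a particular cluster setup, so the same functions can be reused across notebooks or checked with a stand-in session object.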