Processing and querying very large datasets
Synapse SQL combines distributed query processing, the data movement service, and a scale-out architecture, taking advantage of scalable, flexible compute and storage. Data does not need to be transformed before it is loaded into Synapse SQL. Instead, we use the built-in massively parallel processing (MPP) capabilities of Synapse to load the data in parallel and then perform the transformation.
Loading data using PolyBase external tables and the COPY SQL statement is considered one of the fastest, most reliable, and most scalable ways of loading data. We can take external data stored in ADLS and Azure Blob storage and load it using the COPY statement. This data is then loaded into production tables and exposed as views, which give client applications a query layer from which to derive meaningful business insights.
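The load-then-transform flow described above can be sketched in T-SQL for a dedicated SQL pool. This is a minimal illustration, not the recipe's actual script: the table names (dbo.StageSales, dbo.FactSales), the view name, the columns, the filter, and the storage URL placeholders are all hypothetical, and the managed identity credential assumes the workspace identity has already been granted read access to the storage account.

```sql
-- All object names, columns, and the storage URL below are hypothetical placeholders.

-- 1. Staging table. A round-robin heap is a common choice for fast loads.
CREATE TABLE dbo.StageSales
(
    SaleId      INT            NOT NULL,
    CustomerId  INT            NOT NULL,
    SaleDate    DATE           NOT NULL,
    Amount      DECIMAL(18, 2) NOT NULL
)
WITH (DISTRIBUTION = ROUND_ROBIN, HEAP);
GO

-- 2. Load the raw files, untransformed, with the COPY statement.
COPY INTO dbo.StageSales
FROM 'https://<storage-account>.blob.core.windows.net/<container>/sales/*.csv'
WITH (
    FILE_TYPE = 'CSV',
    FIELDTERMINATOR = ',',
    FIRSTROW = 2,                                 -- skip the header row
    CREDENTIAL = (IDENTITY = 'Managed Identity')  -- assumes access is already granted
);
GO

-- 3. Transform inside Synapse with CTAS and land the result in a production table.
--    The WHERE clause is only an illustrative transformation.
CREATE TABLE dbo.FactSales
WITH (DISTRIBUTION = HASH(CustomerId), CLUSTERED COLUMNSTORE INDEX)
AS
SELECT SaleId, CustomerId, SaleDate, Amount
FROM dbo.StageSales
WHERE Amount >= 0;
GO

-- 4. Expose the production table to client applications through a view.
CREATE VIEW dbo.vSalesByCustomer
AS
SELECT CustomerId,
       SUM(Amount) AS TotalAmount,
       COUNT(*)    AS SaleCount
FROM dbo.FactSales
GROUP BY CustomerId;
GO
```

Keeping the staging table separate from the production table means a failed or partial load never touches the data that the views expose; the hash distribution column would normally be chosen to match the most common join or aggregation key.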
Getting ready
We will be performing a series of steps in order to extract, load, and create a materialized view of data for...