Chapter 3: Processing Data Optimally across Multiple Nodes
In this chapter, we will cover the Synapse SQL architecture components required to run data transformation pipelines, and we will use its scale-out capabilities to distribute data processing and transformation across multiple nodes. Synapse SQL is designed so that compute is separated from storage, which means compute can be scaled as needed, independently of the data. Because of this separation, queries can be processed in parallel across compute nodes (massively parallel processing), which improves performance and speeds up data retrieval.
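To make the idea of scaling compute independently of data more concrete, the following is a minimal toy model in Python, not Synapse code. It assumes the behavior described in the public Synapse documentation for dedicated SQL pools, where table data is always sharded into 60 distributions and those distributions are spread over however many compute nodes are provisioned. The function names `distribution_for` and `assign_to_nodes` are hypothetical, and the SHA-256 hash is only a stand-in for the service's internal hash function.

```python
import hashlib
from typing import Dict, List

# Assumption (from public Synapse documentation, not from this chapter):
# a dedicated SQL pool shards table data into a fixed 60 distributions.
NUM_DISTRIBUTIONS = 60


def distribution_for(key: str) -> int:
    """Map a distribution-column value to a distribution (illustrative hash only)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_DISTRIBUTIONS


def assign_to_nodes(num_compute_nodes: int) -> Dict[int, List[int]]:
    """Spread the fixed set of distributions round-robin over the compute nodes."""
    nodes: Dict[int, List[int]] = {n: [] for n in range(num_compute_nodes)}
    for d in range(NUM_DISTRIBUTIONS):
        nodes[d % num_compute_nodes].append(d)
    return nodes


if __name__ == "__main__":
    # The row-to-distribution mapping never depends on the node count,
    # so scaling compute only changes which node serves each distribution.
    print("customer-42 lives in distribution", distribution_for("customer-42"))
    for node_count in (1, 4, 60):
        per_node = len(assign_to_nodes(node_count)[0])
        print(node_count, "node(s):", per_node, "distributions on node 0")
```

In this sketch, changing the number of compute nodes changes only how many distributions each node serves; the placement of rows in distributions stays the same, which is the property that lets compute scale without moving the data.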
We will cover the following recipes:
- Working with the resource consumption model of Synapse SQL
- Optimizing analytics with dedicated SQL pool and working on data distribution
- Working with serverless SQL pool
- Processing and querying very large datasets
- Script for statistics in Synapse SQL