Designing a distribution strategy
Distribution strategies determine how the rows of a table are spread across a Synapse dedicated SQL pool. Dedicated SQL pools are massively parallel processing (MPP) systems that split each query into 60 smaller queries and run them in parallel. Each of these smaller queries runs on a distribution, which is the basic unit of storage and processing for a dedicated SQL pool.
A dedicated SQL pool uses Azure Storage to store the data, and it provides three ways to distribute (shard) that data among the distributions:
- Round-robin tables
- Hash tables
- Replicated tables
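To make the difference between the three options concrete, here is a minimal Python sketch that simulates how rows might be placed under each strategy. It is an illustration only, not how Synapse is implemented: the function names, the sample `order_id` and `customer_id` columns, and the use of MD5 as the hash are all assumptions made for this example (the real hash function is internal to the service, and real round-robin placement is not guaranteed to be sequential).

```python
import hashlib
from collections import Counter

NUM_DISTRIBUTIONS = 60  # a dedicated SQL pool has 60 distributions


def round_robin(rows):
    """Simplified model: hand rows to distributions in turn, ignoring content."""
    return [i % NUM_DISTRIBUTIONS for i, _ in enumerate(rows)]


def hash_distribute(rows, key):
    """Place each row by a deterministic hash of its distribution column.

    MD5 is only a stand-in here; the actual hash used by Synapse is internal.
    """
    def bucket(value):
        digest = hashlib.md5(str(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % NUM_DISTRIBUTIONS

    return [bucket(row[key]) for row in rows]


def replicate(rows, num_compute_nodes):
    """Give every compute node its own full copy of the table."""
    return {node: list(rows) for node in range(num_compute_nodes)}


# Hypothetical sample data: 600 orders placed by only 7 distinct customers.
rows = [{"order_id": i, "customer_id": i % 7} for i in range(600)]

rr = round_robin(rows)
hashed = hash_distribute(rows, "customer_id")
copies = replicate(rows[:5], num_compute_nodes=2)

print("round-robin rows per distribution:", set(Counter(rr).values()))
print("distributions used by hash on customer_id:", len(set(hashed)))
print("rows held by each compute node (replicated):",
      {node: len(table) for node, table in copies.items()})
```

In this sketch, round-robin spreads the 600 rows evenly, while hashing on a column with only seven distinct values can use at most seven of the 60 distributions. Rows with the same key always land together, which helps joins and aggregations on that key, but a low-cardinality key leaves most distributions idle.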
Which of these distribution techniques we use for a table depends on our requirements. To choose the right distribution strategy, you should understand your application, the data layout, and the data access patterns, which you can study by using query plans. We will be learning how to generate and read...