Improving performance with Synapse and Fabric
Many data analytics platforms are based on a symmetric multi-processing (SMP) design. This involves a single computer system with one instance of an operating system that has multiple processors, working with shared memory and shared disk arrays. An alternative example is a massively parallel processing (MPP) system. This involves a grid or cluster of computers, each with processors, an operating system, memory, and a disk array. Each server is referred to as a node.
In practical terms, consider computing a sum across 100 billion rows of data. With SMP, a single computer would need to do all the work. With MPP, you could logically allocate the sum of its group in parallel, and then add up the sums. If we wanted the results faster, we could spread the load further with more parallelism, such as by having 50 machines processing about 2 billion rows each. Even with communications and synchronization overhead, the latter approach will be...