Understanding threads and concurrency
All high-performance computers today have multiple CPUs or multiple CPU cores (independent processors in a single package). Even most laptop computers have at least two, and often four, cores. As we have said many times, in the context of performance, efficiency means not leaving any hardware idle: a program cannot be efficient or high-performing if it uses only a fraction of the available computing power, such as one of many CPU cores. There is only one way for a program to use more than one processor at a time: it has to run multiple threads or processes. As a side note, this isn't the only way to use multiple processors for the benefit of the user: very few laptops, for example, are used for high-performance computing. Instead, they use their multiple cores to run different, independent programs at the same time. This is a perfectly good usage model, just not the one we are interested in when it comes to high-performance computing. HPC systems usually...