Understanding CPU caching basics, cache effects, and false sharing
Modern processors on multicore symmetric multiprocessing (SMP) systems employ several levels of cache memory, working in parallel across the cores, to significantly speed up memory access (we briefly touched upon this in Chapter 8, Kernel Memory Allocation for Module Authors – Part 1, in the Allocating slab memory section). FYI, in Flynn's taxonomy this kind of computer architecture is classified as Multiple Instruction, Multiple Data (MIMD): each core executes its own instruction stream on its own data stream, even though code running concurrently on several cores can end up working upon a single shared data item.
Here’s a purely conceptual diagram (Figure 13.4) showing two CPU cores, each core having two internal caches (Level 1 and Level 2, abbreviated as L1 and L2, respectively), plus a shared or unified L3 cache and the main memory (RAM):
Figure 13.4: Conceptual diagram – 2 CPU cores with internal L1, L2 caches, a shared (unified) L3 cache...
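To relate the conceptual diagram to a real machine, here is a minimal user-space sketch that asks the C library for the cache sizes and the L1 data cache line size. It assumes a glibc-based Linux system: the `_SC_LEVEL*` names are glibc extensions to `sysconf()`, not POSIX, and they may report 0 or -1 on some platforms (in virtual machines and containers, for instance). The helper names (`l1d_line_size()`, `cache_size()`, `show_cache_hierarchy()`) and the 64-byte fallback are assumptions made purely for illustration.

```c
#include <stdio.h>
#include <unistd.h>

/* Assumed fallback when the C library cannot report a value; 64 bytes is
 * common on current x86-64 and many arm64 parts, but it is only a guess. */
#define FALLBACK_CACHE_LINE 64L

/* Return the L1 data cache line size in bytes, or the assumed fallback. */
long l1d_line_size(void)
{
#ifdef _SC_LEVEL1_DCACHE_LINESIZE
	long sz = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);

	if (sz > 0)
		return sz;
#endif
	return FALLBACK_CACHE_LINE;
}

/* Return the size in bytes of the cache at the given level (1 = L1 data
 * cache, 2 = L2, 3 = L3), or 0 if the level is unsupported or unknown. */
long cache_size(int level)
{
	long sz = 0;

	switch (level) {
#ifdef _SC_LEVEL1_DCACHE_SIZE
	case 1:
		sz = sysconf(_SC_LEVEL1_DCACHE_SIZE);
		break;
#endif
#ifdef _SC_LEVEL2_CACHE_SIZE
	case 2:
		sz = sysconf(_SC_LEVEL2_CACHE_SIZE);
		break;
#endif
#ifdef _SC_LEVEL3_CACHE_SIZE
	case 3:
		sz = sysconf(_SC_LEVEL3_CACHE_SIZE);
		break;
#endif
	default:
		break;
	}
	return sz > 0 ? sz : 0;
}

/* Print whatever the C library knows about the cache hierarchy. */
void show_cache_hierarchy(void)
{
	for (int level = 1; level <= 3; level++)
		printf("L%d cache: %ld bytes\n", level, cache_size(level));
	printf("L1 D-cache line size: %ld bytes\n", l1d_line_size());
}
```

Calling `show_cache_hierarchy()` from a `main()` function prints one line per cache level; a reported size of 0 means only that the C library could not determine the value, not that the cache is absent. On Linux, the same information is typically also exposed under `/sys/devices/system/cpu/cpu0/cache/`.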