Summary
In this chapter, we have learned about the computing capabilities of the main processor and how to use them effectively. The key to high performance is to make maximum use of all available computing resources: a program that computes two results at the same time is faster than one that computes the second result later (assuming the computing power is available). As we have learned, the CPU has many computing units for various types of computations, and most of them are idle at any given moment unless the program is very highly optimized.
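As a reminder of what computing two results at the same time can look like, here is a minimal sketch. The names Sums, two_results_one_pass, and two_results_two_passes are hypothetical and chosen only for this illustration. The one-pass version gives the CPU two independent operations per iteration that it may be able to execute in parallel; whether it is actually faster depends on the hardware and the compiler, so it should be measured rather than assumed.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type for this sketch: two sums computed over the same inputs.
struct Sums {
    std::uint64_t add;  // sum of a[i] + b[i]
    std::uint64_t mul;  // sum of a[i] * b[i]
};

// One pass: both results are accumulated in the same iteration.
// The addition and the multiplication do not depend on each other,
// so the CPU is free to issue them to different computing units.
Sums two_results_one_pass(const std::vector<std::uint64_t>& a,
                          const std::vector<std::uint64_t>& b) {
    Sums s{0, 0};
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        s.add += a[i] + b[i];
        s.mul += a[i] * b[i];
    }
    return s;
}

// Two passes: the second result is computed later, in a separate loop.
// Each loop offers the CPU less independent work per iteration.
Sums two_results_two_passes(const std::vector<std::uint64_t>& a,
                            const std::vector<std::uint64_t>& b) {
    Sums s{0, 0};
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        s.add += a[i] + b[i];
    }
    for (std::size_t i = 0; i < n; ++i) {
        s.mul += a[i] * b[i];
    }
    return s;
}
```

Both functions return the same values; only the amount of independent work available to the CPU in each iteration differs.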
We have seen that the main restriction on the efficient use of the CPU's instruction-level parallelism is usually data dependencies: there simply isn't enough work that can be done in parallel to keep the CPU busy. The hardware solution to this problem is pipelining: the CPU doesn't just execute the code at the current point in the program but takes some computations from the future that have no unsatisfied data dependencies...