In this chapter, we've seen a whole slew of optimization techniques, some applied by compilers and others left for programmers to use. After exposure to so many cool tricks, it is easy to lose sight of the big picture, so this is a good point to restate the basic truths.
So, once again: for performance, only three things really matter. The first is the correct choice of algorithm; the second is correct parallelization and the avoidance of blocking calls; and the third is attention to data locality. The remaining tricks, including all of the little helpers discussed in this chapter, are potentially useful, but only on a case-by-case basis, once a bottleneck has been identified. Otherwise, we would be sailing in the dangerous waters of premature optimization!
With that periodic reminder out of the way, let's recall what we learned in this chapter...