I hope you've enjoyed the journey so far. We acquired a basic understanding of the matter that we'll put to good use in future chapters of this book. Admittedly, in the second half, the discussion started to be a little low-level, going down into the inner workings of processors, but I hope you picked up at least a couple of buzzwords along the way.
So, we've arrived at the end of this chapter. Looking back, first we learned about benefits and caveats arising when we optimize performance and the pair, premature optimization and pessimization. Then, we discussed the basic rules for performance optimization, the well-known optimization techniques inferred from those rules, how and why memory access patterns matter, the way processors are trying to parallelize work on the instruction level, and, not to forget, what the common performance lingo's buzzwords mean.
Not bad for an introductory chapter, don't you think?
So, after we learned all of those things that we can respond to, the main question of the chapter, namely what's a performant program? The widely acknowledged response is that a performant program is a program that does the following:
- Uses the optimal algorithm for the problem
- Optimizes memory access patterns as to be cache friendly
- Fully uses the parallelization possibilities of the hardware
In the following chapter, we'll look at techniques that allow us to avoid the dreaded premature optimization trap. We can steer clear of this trap by measuring how our code performs and where the bottlenecks and the tight spots are located, before actually starting to optimize. In the next chapter, we'll learn tools and techniques for doing exactly that.