Improving performance
We'll look at two performance tweaks for the Power3
class.
First, a better algorithm. Then, a better algorithm combined with memoization, which involves a cache; therefore, the function becomes stateful. This is where callable objects shine.
The first modification is to use a Divide and Conquer design strategy. The previous version chopped into n steps; the loop carried out n individual multiplication operations. If we can find a way to split the problem into two equal portions, the problem decomposes into steps. Given pow1(2,1024)
, the Power1
callable performs the calculation 1024 multiplications by 2. We can optimize this down to 10 multiplications, a significant speedup.
Rather than simply multiplying by a fixed value, we'll use the "fast exponentiation" algorithm. It uses three essential rules for computing , as follows:
- If :, the result is simply 1.
- If n is odd and , the result is . This involves a recursive computation of . This still does a multiplication but...