Improving performance
We'll look at two performance tweaks for the Power3
class.
First, a better algorithm. Then, a better algorithm combined with memoization, which involves a cache; therefore, the function becomes stateful. This is where callable objects shine.
The first modification is to use a Divide and Conquer design strategy. The previous version chopped into n steps; the loop carried out n individual multiplication operations. If we can find a way to split the problem into two equal portions, the problem decomposes into
steps. Given
pow1(2,1024)
, the Power1
callable performs the calculation 1024 multiplications by 2. We can optimize this down to 10 multiplications, a significant speedup.
Rather than simply multiplying by a fixed value, we'll use the "fast exponentiation" algorithm. It uses three essential rules for computing , as follows:
-
If
:
, the result is simply 1.
-
If n is odd and
, the result is
. This involves a recursive computation of
. This still does a multiplication but...