Measuring dot product performance
Now that we know how to measure the cycle count of an important section of code, let’s put it into practice by measuring dot product performance on the Raspberry Pi Pico. We will look at multiple implementations of the dot product and experiment with compiler optimizations to see how each one affects performance.
Using the Raspberry Pi Pico
Often, the Cortex-M microcontroller for a project has already been chosen and cannot be changed. In this case, the best system performance comes from a combination of changes to the source code algorithms and to the compiler optimization levels. In some cases, the compiler itself can also be changed, though it too is often predetermined for a project.
In this section, we take the dot product example and create three different implementations with different source code and then use the compiler options to check the impact on performance. As the Cortex-M0+ in the Raspberry...