Optimization takeaways
We have evaluated the performance of the dot product while altering the following variables:
- Processor type
- Software source code
- Compiler and compiler options
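
As a point of reference for the "software source code" variable, a plain C dot product might look like the minimal sketch below. The function name `dot_product_plain`, the `float` element type, and the signature are assumptions made for illustration; they are not taken from the measured code.

```c
#include <stddef.h>

/* Hypothetical plain C dot product: a straightforward multiply-accumulate
 * loop with no intrinsics or manual unrolling. The name, element type, and
 * signature are illustrative assumptions. */
float dot_product_plain(const float *a, const float *b, size_t n)
{
    float acc = 0.0f;

    /* Multiply each pair of elements and add the product to the sum. */
    for (size_t i = 0; i < n; ++i) {
        acc += a[i] * b[i];
    }

    return acc;
}
```

A loop like this leaves all optimization to the compiler, which is why the cycle count of such code tends to depend strongly on both the compiler flags and the target core.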
To provide some helpful guidelines for optimizing a Cortex-M system, the table below summarizes the cycle counts recorded for each combination of dot product implementation and compiler flags on the Pico and NXP boards. As described in the previous section, the Corstone-300 Arm Virtual Hardware system does not allow cycle-accurate measurements, so it is omitted from the table:
| Implementation | Compiler Flags | RPi (M0+) | NXP (M33) |
|---|---|---|---|
| 1: Plain C code | Debug | 41,668 | 2,398 |