Processor selection through use cases
IoT and machine learning (ML) applications are not only rapidly evolving and changing the way modern businesses operate but also transforming our everyday experiences. As these applications evolve and become more complex, it is essential to make the right hardware choices that meet application requirements. Ultimately, the processor choice comes down to the right balance of functionality, cost, power, and performance. Defining your use case and workload requirements makes determining this balance a lot simpler.
In this section, we will walk through the requirements of some common consumer embedded use cases and determine which Arm Cortex-M processors suit them best. Neither the list of use cases nor the resulting processor selections are exhaustive; the aim is to show that when workload requirements are well understood, the processor decision becomes much easier.
Medical wearable
Let’s start with a smart medical wearable. It is a wrist-worn device that needs a long battery life and special sensors to continuously monitor heart activity. Security is vital because the wearable stores private medical data. Processing power is equally important, but it must fit within the size and power constraints of a battery-operated wearable.
For this case, the Arm Cortex-M33 processor provides an excellent combination of security, processing power, and low power consumption. The Cortex-M33 supports TrustZone for Cortex-M, a security feature that provides hardware-enforced isolation. It reduces the potential for attacks by isolating the critical firmware from the rest of the application. The Cortex-M33 has many optional hardware features, including a memory protection unit (MPU), as well as a digital signal processing (DSP) extension and a floating-point unit (FPU) for handling compute-intensive operations. The Arm custom instruction and coprocessor interface in the Cortex-M33 provide the customization and extensibility to meet processing demands while keeping power consumption low.
Note that these hardware features are optional: the chip manufacturer decides which ones to include, and once a device is manufactured, each feature is either present or not. If you need a particular Arm Cortex-M processor feature, check that the microcontroller or development board you are buying includes it.
Industrial flow sensor
Let’s take another use case. Say you’re designing an industrial flow sensor that measures liquids and gases with great accuracy. It needs to be extremely reliable and have a small form factor. The primary requirement is low power: it must maintain this accuracy while running standalone for very long periods of time. A great central processing unit (CPU) choice for such an industrial sensor is the Arm Cortex-M0+, which combines low power consumption with the processing capability such a sensor needs. It is the most energy-efficient Cortex-M processor and has an extremely small silicon area, making it a strong fit for such constrained embedded applications.
IoT sensor
Several use cases in the embedded market require demanding DSP workloads to be executed as efficiently as possible. With the advancements in IoT, there has been an explosion in the number of connected smart devices. These devices contain many different sensors that collect data on temperature, motion, health, and vision. The collected sensor data is often noisy and requires DSP computation, for example, applying filters to extract the clean data.
The Cortex-M4, Cortex-M7, Cortex-M33, and Cortex-M55 processors support a DSP hardware extension and address different performance requirements for signal-processing workloads. They also have an optional FPU that accelerates floating-point computation in addition to the DSP capabilities. If your workload demands high DSP performance, the Cortex-M7 is a great choice. The Cortex-M7 is widely available in microcontrollers and offers high performance due to its six-stage dual-issue instruction pipeline. It is also highly configurable, with optional instruction and data caches and tightly coupled memory (TCM), so you can choose a microcontroller whose configuration matches your specific application needs.
Security has become a common requirement for sensors on connected devices, which need protection from attacks. If security is essential for your sensor application in addition to DSP performance, the Cortex-M33 could be a great fit with its TrustZone hardware security features.
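The filtering step mentioned above for noisy sensor data can be illustrated with a moving-average filter, one of the simplest low-pass filters. This is only a sketch in Python for readability; the function name `moving_average` and the sample values are made up for illustration, and on a Cortex-M target the same loop would normally be written in C, often with an optimized DSP library.

```python
def moving_average(samples, window):
    """Smooth a sequence with a causal moving-average (boxcar FIR) filter.

    Hypothetical helper for illustration; not taken from any library.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    out = []
    running_sum = 0.0
    for i, x in enumerate(samples):
        running_sum += x
        if i >= window:
            # Drop the sample that just left the window.
            running_sum -= samples[i - window]
        # Until the window is full, average over the samples seen so far.
        out.append(running_sum / min(i + 1, window))
    return out


# Made-up temperature readings with some jitter around 20 degrees.
noisy = [20.0, 20.4, 19.6, 20.2, 19.8, 20.0]
smoothed = moving_average(noisy, window=4)
```

Each output sample costs one addition, one subtraction, and one division regardless of the window length. That is the kind of regular, arithmetic-heavy loop that DSP extensions are designed to speed up.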
With some of the newer sensing and control use cases, we see a common need for not only signal processing but also ML inference on endpoint devices. ML workloads are typically very demanding in terms of computation and memory bandwidth requirements. The significant advancements made in ML via optimization techniques have now made it possible for ML solutions to be deployed on edge devices.
ML
The primary use cases for ML on edge devices today are keyword spotting, speech recognition, object detection, and object recognition. The list of ML use cases is rapidly evolving even as we write this book, with autonomous driving, language translation, and industrial anomaly detection among the emerging applications. These use cases can be broadly classified into three categories, as outlined here:
- Vibration and motion: Vibration and motion signals are analyzed to monitor health and to support industrial applications such as predictive maintenance and anomaly detection. For these applications, installed sensors (generally accelerometers) gather large amounts of data at various vibration levels. Signal processing is used to preprocess this data before ML techniques are applied to make decisions.
- Voice and sound: Voice applications are found in several markets, and we’ve become quite familiar with voice assistants through the deployment of smart speakers. Many other voice-enabled solutions are coming to the mass market. Voice capture relies on one or several microphones, whose signal is used for voice keyword detection. Keyword spotting and automatic speech recognition are the most computationally demanding operations on these voice-enabled devices and require significant DSP and ML computation.
- Vision: Vision applications are used in several areas for recognizing objects, sorting items, spotting defects, and detecting movement. There is an increasing number of vision-based ML applications, ranging from video doorbells and biometrics for face-unlocking to industrial anomaly detection.
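To make the preprocessing step in the vibration and motion category more concrete, here is a minimal sketch that turns raw accelerometer samples into per-window features that an ML model could consume. The function name `window_features`, the choice of RMS and peak as features, and the sample values are all assumptions for illustration; real applications often use richer features such as frequency spectra.

```python
import math


def window_features(samples, window):
    """Split samples into non-overlapping windows and return (rms, peak) per window.

    Hypothetical helper for illustration. A trailing partial window is dropped.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    features = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        rms = math.sqrt(sum(x * x for x in chunk) / window)
        peak = max(abs(x) for x in chunk)
        features.append((rms, peak))
    return features


# Made-up accelerometer readings; the second window has a larger vibration level.
accel = [0.1, -0.1, 0.1, -0.1, 0.8, -0.9, 0.7, -0.8]
feats = window_features(accel, window=4)
```

A classifier or anomaly detector would then operate on these compact feature pairs instead of the raw sample stream, which reduces both memory use and inference cost.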
Cortex-M processors ranging from the Cortex-M0 to the latest Cortex-M85 can run a broad range of these ML use cases at different performance points. Mapping the different workload performance needs and latency requirements of these use cases to the CPU’s feature capabilities greatly simplifies the process of hardware selection. The following diagram illustrates the range of ML use cases run on the Cortex-M family of processors today:
Figure 1.1 – ML on Cortex-M processors
For example, say you’re designing a smart speaker that is an always-on voice activation device. The Cortex-M55 is a great choice. It is a highly capable artificial intelligence (AI) processor in the Cortex-M series and the first in the Arm Cortex-M family to feature Helium technology, which provides a significant performance uplift for DSP and ML applications on small, embedded devices. Arm Helium technology, also known as the M-Profile Vector Extension (MVE), adds over 150 new scalar and vector instructions and enables efficient computation on 8-bit, 16-bit, and 32-bit fixed-point data types. Signal processing-intensive applications such as audio processing widely use the 16-bit and 32-bit fixed-point formats, while ML algorithms widely use 8-bit fixed-point data types for neural network (NN) computations. Helium technology makes running ML workloads much faster and more energy-efficient on endpoint devices.
Figure 1.1 also mentions the Ethos-U55. This is not a CPU like the Cortex-M processors but a micro neural processing unit (NPU). It was developed to add significant acceleration to ML workloads while being small enough to be implemented in constrained embedded/IoT environments. When combined with the Cortex-M55, the Ethos-U55 provides up to a 480x uplift in ML performance compared to previous-generation Cortex-M processors! Keep an eye out for microcontrollers and boards that utilize the Ethos-U55, and learn more about it from a high level here: https://www.arm.com/products/silicon-ip-cpu/ethos/ethos-u55.
To summarize this section, one way to select processors is by understanding the use case, clearly defining requirements, ranking them, and identifying project constraints. This is a great place to start the processor selection process.
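As a rough illustration of this requirement-driven approach, the sketch below encodes the processor features discussed in this section as a small table and filters it by a set of required features. The table name `PROCESSORS`, the feature keys, and the `candidates` function are hypothetical. The flags describe what each processor design can offer, and since several of these features are optional, a given microcontroller may not include them.

```python
# Features each processor design can offer, per the discussion in this section.
# Several are optional in silicon, so always confirm against the device datasheet.
PROCESSORS = {
    "Cortex-M0+": {"dsp": False, "trustzone": False, "helium": False},
    "Cortex-M4": {"dsp": True, "trustzone": False, "helium": False},
    "Cortex-M7": {"dsp": True, "trustzone": False, "helium": False},
    "Cortex-M33": {"dsp": True, "trustzone": True, "helium": False},
    "Cortex-M55": {"dsp": True, "trustzone": True, "helium": True},
}


def candidates(required):
    """Return the processors that offer every feature named in `required`."""
    return [
        name
        for name, features in PROCESSORS.items()
        if all(features[feature] for feature in required)
    ]


# Medical wearable: security plus signal processing.
wearable = candidates({"trustzone", "dsp"})

# Always-on smart speaker: vector acceleration for DSP and ML.
speaker = candidates({"helium"})
```

A filter like this only narrows the field. The remaining candidates still need to be ranked against constraints such as cost, power budget, and availability.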
Next, we will look at using performance and power as metrics to analyze processor selection choices.