Voice services such as Apple Siri, Amazon Alexa, Google Assistant, and Google Translate have become increasingly popular, because voice is the most natural and effective way to find information or accomplish tasks in certain scenarios. Many of these services are cloud-based because user speech can be long and freeform, and automatic speech recognition (ASR) is complex and requires a lot of computing power. In fact, ASR in natural, noisy environments has become feasible only in recent years, thanks to breakthroughs in deep learning.
In some cases, however, it makes sense to recognize simple speech commands offline, directly on a device. For example, to control the movement of a Raspberry Pi-driven robot, you don't need complicated voice commands, and not only is on-device ASR faster than a cloud-based...