Bell Labs pioneered speech recognition in the 1950s. The early systems were limited to a single speaker and had a very small vocabulary. After around 70 years of work, current speech-recognition systems can handle speech from multiple speakers and recognize thousands of words in multiple languages. A detailed discussion of all the techniques used is beyond the scope of this book, as enough work has been done on each technique to fill a book of its own.
In general, though, a speech-recognition system first captures the audio: a microphone converts the physical sound into an electrical signal. The signal produced by the microphone is analog and must be converted to digital form for storage and processing, which is done with an analog-to-digital converter. Once we have the speech in digital...