Speech recognition, also known as Automatic Speech Recognition (ASR) and speech-to-text (STT/S2T), has a long history. More traditional AI approaches have been used in the industry for a long time; however, with the recent interest in deep learning, speech recognition is getting a new boost in performance. Many major tech companies have an interest in speech recognition because of its many applications, for example, Voice Search by Google, Siri by Apple, and Alexa by Amazon.
Many companies use pre-trained speech recognition software. However, in the following recipe, we will demonstrate how to implement and train a speech recognition pipeline from scratch. The accuracy of this newly trained model will be lower than that of the models used in the industry. The main reason is that the quality...