In the past few decades, a great deal of research has gone into applying deep learning to speech-related applications. Speech recognition has become part of many everyday products, from our phones and smartwatches to our homes and games.
It is a prominent feature of voice assistants such as Siri and Alexa, built by the tech giants Apple and Amazon, respectively.

Sound waves are time-domain signals: when we plot a sound wave, one axis is time (the independent variable) and the other is the amplitude of the wave (the dependent variable).
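To make the two axes concrete, here is a minimal sketch in Python with NumPy. It is an illustration only: the 440 Hz tone, the 10 ms duration, and the 8000 time points per second are assumed values, and a pure sine wave stands in for a real sound. The array `t` holds the independent variable (time) and `amplitude` holds the dependent variable.

```python
import numpy as np

# Assumed values, chosen only for illustration.
frequency_hz = 440.0       # pitch of the example tone
duration_s = 0.01          # length of the example signal in seconds
points_per_second = 8000   # how many time points we evaluate per second

# Independent variable: time in seconds, evenly spaced from 0.
num_points = round(duration_s * points_per_second)
t = np.arange(num_points) / points_per_second

# Dependent variable: amplitude of the wave at each time point.
amplitude = np.sin(2 * np.pi * frequency_hz * t)

# Print the first few (time, amplitude) pairs, i.e. the points one would plot.
for time_s, value in zip(t[:5], amplitude[:5]):
    print(f"t = {time_s:.6f} s   amplitude = {value:+.4f}")
```

Plotting `amplitude` against `t` (for example with a plotting library) would give the familiar waveform picture, with time on the horizontal axis and amplitude on the vertical axis.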
To create a digital recording of a sound wave, we convert the analog signal into digital form through sampling, which takes measurements of the dependent variable...