Summary
In this chapter, we learnt about speech recognition. We discussed how to work with speech signals and the concepts behind them, and we learnt how to visualize audio signals. We saw how to transform time domain audio signals into the frequency domain using Fourier Transforms, and how to generate audio signals from predefined parameters.
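As a quick refresher on the two ideas above, the following is a minimal sketch, not the chapter's own code. It assumes NumPy is available; the helper names `generate_tone` and `dominant_frequency`, the 8000 Hz sampling rate, and the 440 Hz tone are illustrative choices. It generates a sine tone from predefined parameters and then uses a Fourier Transform to recover the tone's frequency.

```python
import numpy as np

def generate_tone(freq_hz, duration_s, sample_rate):
    """Return a sine wave with the given frequency, duration and sampling rate."""
    # Time stamps (in seconds) of each sample
    t = np.arange(int(duration_s * sample_rate)) / sample_rate
    return np.sin(2 * np.pi * freq_hz * t)

def dominant_frequency(signal, sample_rate):
    """Return the frequency (in Hz) with the largest magnitude in the spectrum."""
    # Real-input FFT: magnitudes for the non-negative frequencies only
    spectrum = np.abs(np.fft.rfft(signal))
    # Frequency (in Hz) corresponding to each FFT bin
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return freqs[np.argmax(spectrum)]

# Illustrative parameters (assumptions, not values from the chapter)
sample_rate = 8000   # samples per second
tone = generate_tone(440.0, 1.0, sample_rate)

peak_hz = dominant_frequency(tone, sample_rate)
print("Dominant frequency (Hz):", peak_hz)
```

With a one-second signal the frequency bins are 1 Hz apart, so the peak should fall at, or very close to, the frequency used to generate the tone.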
We then used tone generation to synthesize music by stitching tones together. We talked about Mel Frequency Cepstral Coefficients (MFCCs) and how they are used in the real world, and we saw how to extract frequency domain features from speech. Finally, we learnt how to combine all these techniques to build a speech recognition system. In the next chapter, we will learn about object detection and tracking, and we will use those concepts to build an engine that can track objects in a live video.