Speech quality enhancement with SEGAN
In Chapter 7, Image Restoration with GANs, we explored how GANs can restore some of the pixels in images. Researchers have found a similar application in speech processing, where GANs can be trained to remove noise from audio and thereby improve the quality of recorded speech. In this section, we will learn how to use SEGAN to reduce background noise and make the human voice in noisy audio more audible.
SEGAN architecture
Speech Enhancement GAN (SEGAN) was proposed by Santiago Pascual, Antonio Bonafonte, and Joan Serrà in their paper, SEGAN: Speech Enhancement Generative Adversarial Network. It uses 1D convolutions to successfully remove noise from speech audio. You...