Text-to-speech (TTS) is the task of converting text into intelligible, natural-sounding speech. Before we delve into deep learning approaches to TTS, we should ask ourselves two questions: what are TTS systems for, and why do we need them in the first place?
Well, there are many use cases for TTS. One of the most obvious is that it allows blind people to listen to written content. Indeed, Braille books, devices, and signs are not always available, and blind people can't always have someone read to them. In the near future, there might be smart glasses that describe the surrounding environment and read street signs and other written notices aloud to their users.
Many people struggle from childhood with learning disabilities such as dyslexia. Robust TTS systems can help them on a daily basis, increasing their productivity at school or...