In this project, we will use the LibriSpeech ASR corpus (http://www.openslr.org/12/), which contains approximately 1,000 hours of read English speech sampled at 16 kHz.
Let's use the following commands to download the train-clean-100, dev-clean, and test-clean subsets of the corpus and unpack them:
mkdir -p data/librispeech
cd data/librispeech
wget http://www.openslr.org/resources/12/train-clean-100.tar.gz
wget http://www.openslr.org/resources/12/dev-clean.tar.gz
wget http://www.openslr.org/resources/12/test-clean.tar.gz
mkdir audio
cd audio
tar xvzf ../train-clean-100.tar.gz LibriSpeech/train-clean-100 --strip-components=1
tar xvzf ../dev-clean.tar.gz LibriSpeech/dev-clean --strip-components=1
tar xvzf ../test-clean.tar.gz LibriSpeech/test-clean --strip-components=1
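The unpack step repeats the same tar invocation for each subset, so it can also be written as a small helper. The following is only a sketch under stated assumptions: the function names unpack_subset and count_flac are hypothetical (they are not part of LibriSpeech or of the commands above), and the sketch assumes each archive is named <subset>.tar.gz and stores its audio as .flac files under LibriSpeech/<subset>.

```shell
# Hypothetical helper (not from the original commands): extract one subset
# from <archive_dir>/<subset>.tar.gz into <dest>, dropping the leading
# "LibriSpeech/" path component, as the tar commands above do.
unpack_subset() {
    local subset="$1"        # e.g. dev-clean
    local archive_dir="$2"   # directory that holds the downloaded .tar.gz
    local dest="$3"          # e.g. data/librispeech/audio
    mkdir -p "$dest"
    tar -xzf "$archive_dir/$subset.tar.gz" -C "$dest" \
        --strip-components=1 "LibriSpeech/$subset"
}

# Hypothetical helper: count the .flac files below a directory, as a quick
# sanity check that the extraction produced audio files.
count_flac() {
    find "$1" -type f -name '*.flac' | wc -l | tr -d ' '
}
```

Assuming the three archives sit in data/librispeech, running unpack_subset train-clean-100 data/librispeech data/librispeech/audio (and likewise for dev-clean and test-clean) should give the same layout as the commands above, and count_flac data/librispeech/audio/dev-clean reports how many utterances were extracted.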
This will take a while. Once the process is complete, the data folder will have the structure shown in the following screenshot:
We now have three folders named train...