In this section, we're going to build a neural network that can identify the genre of a song. We will use the GTZAN Genre Collection (http://marsyasweb.appspot.com/download/data_sets/). It has 1,000 different songs from 10 different genres. There are 100 songs per genre, and each song is about 30 seconds long.
We will use the Python library librosa to extract features from the songs, specifically Mel-frequency cepstral coefficients (MFCCs). MFCCs summarize the short-term spectrum of a sound on the mel scale, which approximates how humans perceive pitch, and they are commonly used in speech recognition as well as music genre detection. These MFCC values will be fed directly into the neural network.
To help us understand the MFCC, let's use two examples. Download Kick Loop 5 by Stereo Surgeon. You can do this by visiting https://freesound.org/people...