Designing and training an LSTM RNN model
In this project, the model designed for classifying music genres is an LSTM RNN, as illustrated in the following diagram:
Figure 6.9: LSTM recurrent neural network for music genre classification
As shown in the preceding diagram, the MFCCs extracted from one second of raw audio are the input to the model, which consists of the following layers:
- 2 x LSTM layers with 32 units each (Num. units)
- 1 x Dropout layer with a 50% rate (0.5)
- 1 x Fully connected layer with three output neurons, followed by a Softmax activation function
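The layer stack listed above can be sketched with the Keras API in TensorFlow. This is a minimal illustration, not necessarily the exact code used in the recipe: the input dimensions (NUM_FRAMES and NUM_MFCCS) are hypothetical placeholders, since the number of MFCC frames and coefficients is not stated here, and the build_model helper is a name introduced only for this example. The first LSTM layer is assumed to return its full output sequence so that the second LSTM layer receives a sequence as input.

```python
import tensorflow as tf

# Hypothetical input dimensions, chosen only for illustration.
NUM_FRAMES = 49   # assumed number of MFCC frames in one second of audio
NUM_MFCCS = 18    # assumed number of MFCC coefficients per frame
NUM_CLASSES = 3   # three output neurons, as listed above


def build_model(num_frames=NUM_FRAMES, num_mfccs=NUM_MFCCS):
    """Build the stacked LSTM classifier described in the layer list."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(num_frames, num_mfccs)),
        # First LSTM layer: returns the whole sequence so it can feed
        # the second LSTM layer (an assumption of this sketch).
        tf.keras.layers.LSTM(32, return_sequences=True),
        # Second LSTM layer: returns only the final output vector.
        tf.keras.layers.LSTM(32),
        # Dropout with a 50% rate, active only during training.
        tf.keras.layers.Dropout(0.5),
        # Fully connected layer with Softmax over the three classes.
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])


model = build_model()
model.summary()
```

Calling model.summary() prints the output shape and parameter count of each layer, which is a quick way to check that the model matches the diagram before training.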
In this recipe, we will design and train this LSTM model with TensorFlow.
Getting ready
In Chapter 4, Using Edge Impulse and the Arduino Nano to Control LEDs with Voice Commands, we addressed an audio classification problem using a standard convolutional neural network (CNN) that learned visual patterns from the Mel-filterbank energy...