In Chapter 6, Data Preparation for Training, we looked at how to build a dataset. The datasets we produced were symbolic ones, composed of MIDI files containing specific instruments, such as percussion or piano, and drawn from specific genres, such as dance and jazz.
We also looked at how to prepare a dataset, that is, how to convert the input files (MIDI, MusicXML, or ABCNotation) into a format that can be fed to the network. That format is specific to each Magenta model, so the preparation differs between the Drums RNN and MusicVAE models, even though both can train on percussion data.
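To make the difference concrete, here is a minimal sketch that builds (but does not run) the preparation command lines for each model. The helper `preparation_commands` and the directory names are hypothetical, introduced only for illustration. The command and flag names follow Magenta's command-line tools as we understand them, so check them against your installed version before relying on them.

```python
from typing import List


def preparation_commands(model: str, midi_dir: str, work_dir: str) -> List[List[str]]:
    """Return the command lines (not executed here) that prepare a dataset.

    Hypothetical helper: `model` is either "drums_rnn" or "music_vae".
    """
    note_sequences = f"{work_dir}/notesequences.tfrecord"

    # Shared first step: convert the input files into a TFRecord of NoteSequences.
    commands = [[
        "convert_dir_to_note_sequences",
        f"--input_dir={midi_dir}",
        f"--output_file={note_sequences}",
    ]]

    if model == "drums_rnn":
        # Assumption: Drums RNN needs a second step that turns the
        # NoteSequences into SequenceExamples, split into training and
        # evaluation sets.
        commands.append([
            "drums_rnn_create_dataset",
            "--config=drum_kit",
            f"--input={note_sequences}",
            f"--output_dir={work_dir}/sequence_examples",
            "--eval_ratio=0.10",
        ])
    elif model == "music_vae":
        # Assumption: MusicVAE reads the NoteSequences file directly and
        # converts it during training, so no extra command is needed.
        pass
    else:
        raise ValueError(f"Unknown model: {model}")

    return commands


if __name__ == "__main__":
    for name in ("drums_rnn", "music_vae"):
        print(name)
        for command in preparation_commands(name, "midi", "dataset"):
            print("  " + " ".join(command))
```

Both models start from the same NoteSequences file, but only Drums RNN gets a model-specific dataset creation step in this sketch. That is why the same percussion dataset ends up prepared differently for each model.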
The first step before starting the training is to choose the proper model and configuration for our use case. Remember, a model in Magenta defines a deep neural network architecture, and each network type has its advantages...