MuseGAN – polyphonic music generation
The two models we have trained so far were simplified approximations of how music is actually composed. While limited, both the attention-based LSTM model and the C-RNN-GAN-based model helped us understand the music generation process well. In this section, we will build on what we have learned so far and move toward a setup that is as close to the actual task of music generation as possible.
In 2017, Dong et al. presented a GAN-based framework for multi-track music generation in their work titled MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment.5 The paper provides a detailed explanation of various music-related concepts and of how Dong and team tackled them. To keep things within the scope of this chapter without losing important details, we will touch upon the key contributions of the work and then proceed to the implementation. Before we get onto...