In this chapter, we looked at sampling, interpolating, and humanizing scores using a variational autoencoder (VAE) with the MusicVAE and GrooVAE models.
We first explained what a latent space is in an autoencoder (AE) and how dimensionality reduction in an encoder and decoder pair forces the network to learn important features during training. We also learned about VAEs and their continuous latent space, which makes it possible to sample any point in the space and to interpolate smoothly between two points, both very useful tools in music generation.
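As a reminder of what sampling and interpolating mean at the level of latent vectors, here is a minimal NumPy sketch. It does not use the MusicVAE API. The function names `sample_latent` and `interpolate_latent`, the latent size of 256, and the choice of plain linear interpolation are all assumptions made for illustration. Decoding the resulting vectors back into note sequences is the job of the trained decoder and is not shown.

```python
import numpy as np


def sample_latent(num_samples, z_size, rng):
    """Draw latent vectors from a standard normal prior (hypothetical helper)."""
    return rng.standard_normal((num_samples, z_size))


def interpolate_latent(z_start, z_end, num_steps):
    """Linearly interpolate between two latent vectors (hypothetical helper).

    Returns an array of shape (num_steps, z_size) whose first row is z_start
    and whose last row is z_end.
    """
    weights = np.linspace(0.0, 1.0, num_steps)[:, np.newaxis]
    return (1.0 - weights) * z_start + weights * z_end


rng = np.random.default_rng(seed=0)
z = sample_latent(num_samples=2, z_size=256, rng=rng)  # z_size is an assumed value
path = interpolate_latent(z[0], z[1], num_steps=5)
print(path.shape)  # (5, 256)
```

Because the latent space is continuous, every intermediate row of `path` is itself a valid point that a decoder could turn into a sequence, which is what makes smooth interpolation possible.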
Then, we wrote code to sample and transform a sequence. We learned how to initialize a model from a pre-trained checkpoint, sample the latent space, interpolate between two sequences, and humanize a sequence. Along the way, we learned important information on VAEs, such as the definition of the loss function and the KL divergence...
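To make the loss function concrete, the following sketch computes the KL divergence between a diagonal Gaussian posterior and a standard normal prior, then adds it to a reconstruction term. This is the standard textbook closed form, not code taken from MusicVAE. The function names and the `beta` weighting parameter are assumptions for illustration, and the exact weighting used by the models in this chapter is not reproduced here.

```python
import numpy as np


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, 1)) for a diagonal Gaussian, summed over latent dims.

    mu and log_var are the encoder outputs (hypothetical names), with
    log_var = log(sigma^2).
    """
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=-1)


def vae_loss(reconstruction_loss, mu, log_var, beta=1.0):
    """Reconstruction term plus a beta-weighted KL term (beta is an assumed knob)."""
    return reconstruction_loss + beta * kl_to_standard_normal(mu, log_var)


# A posterior identical to the prior has zero KL divergence.
print(kl_to_standard_normal(np.zeros(4), np.zeros(4)))

# Shifting the mean away from zero increases the KL term.
print(kl_to_standard_normal(np.ones(4), np.zeros(4)))
```

The KL term is what pulls the encoder's output distribution toward the prior, which in turn keeps the latent space continuous enough to sample from and interpolate in.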