In the previous section, we introduced the tools needed to build a dataset from the information contained in the MIDI files of the full LMD dataset. In this section, we'll go further and build a custom dataset using external APIs such as the Last.fm API.
We'll use the LMD-matched distribution here because it is (partially) matched with the MSD, which contains metadata that will be useful to us, such as artist and title. That metadata can then be used in conjunction with Last.fm to get the song's genre. We'll also extract drum and piano instruments, just like we did in the previous section.
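To make the artist-and-title-to-genre idea concrete, here is a minimal sketch of one way it could work, not necessarily the approach used later in this chapter. It assumes Last.fm's `track.getTopTags` web method and a JSON response shaped as `{"toptags": {"tag": [{"name": ..., "count": ...}]}}`. The helper names `build_top_tags_url` and `pick_genre`, the `known_genres` set, and the sample data are hypothetical and introduced only for illustration. No network call is made; you would fetch the URL with your own API key.

```python
from typing import Optional, Set
from urllib.parse import urlencode

# Root URL of the Last.fm web API (assumed endpoint for this sketch).
API_ROOT = "https://ws.audioscrobbler.com/2.0/"


def build_top_tags_url(artist: str, title: str, api_key: str) -> str:
    """Build a track.getTopTags request URL from MSD-style metadata."""
    params = {
        "method": "track.gettoptags",
        "artist": artist,
        "track": title,
        "api_key": api_key,
        "format": "json",
    }
    return API_ROOT + "?" + urlencode(params)


def pick_genre(response: dict, known_genres: Set[str]) -> Optional[str]:
    """Return the highest-count tag that is also a known genre, or None.

    Assumes the response layout described in the lead-in; tags that are
    not genres (for example "seen live") are ignored.
    """
    tags = response.get("toptags", {}).get("tag", [])
    candidates = [
        (int(tag.get("count", 0)), tag.get("name", "").lower())
        for tag in tags
        if tag.get("name", "").lower() in known_genres
    ]
    if not candidates:
        return None
    # Keep the genre tag that listeners applied most often.
    return max(candidates, key=lambda item: item[0])[1]


# Hypothetical response, hand-written to mirror the assumed layout.
sample_response = {
    "toptags": {
        "tag": [
            {"name": "seen live", "count": 80},
            {"name": "Jazz", "count": 100},
            {"name": "piano", "count": 30},
        ]
    }
}

url = build_top_tags_url("Miles Davis", "So What", "YOUR_API_KEY")
genre = pick_genre(sample_response, {"jazz", "rock", "pop"})
```

Filtering against a fixed set of genre names is a deliberate simplification: Last.fm tags are user-generated, so many of them describe mood, era, or listening context rather than genre.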