We will be using images taken from Google's Quick Draw! dataset. This is a public, open source dataset of 50 million images in 345 categories, all of which were drawn in 20 seconds or less by the more than 15 million users who took part in the challenge. We will train on 10,000 images in 10 categories, some of which were chosen to be similar so that we can test the discriminatory power of the model. You can see examples of these images at https://quickdraw.withgoogle.com/data. The images are available in a variety of formats, all of which are described at https://github.com/googlecreativelab/quickdraw-dataset.
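As a preview of how such a subset can be assembled, here is a minimal sketch that loads ten categories from the dataset's numpy bitmap (.npy) files (described in the next paragraph) and samples 1,000 drawings from each to build a 10,000-image training set. The category names, file paths, and sample counts below are illustrative assumptions, not the exact choices used later; the bitmaps themselves are 28×28 grayscale images stored as flattened rows of 784 pixels.

```python
import numpy as np

# Illustrative category names and local paths -- adjust to whichever
# .npy bitmap files you have downloaded from the Quick Draw bucket.
CATEGORIES = ["cat", "dog", "apple", "banana", "ant",
              "bee", "car", "truck", "cup", "mug"]
SAMPLES_PER_CATEGORY = 1_000          # 10 categories x 1,000 = 10,000 images

rng = np.random.default_rng(seed=0)
images, labels = [], []
for label, name in enumerate(CATEGORIES):
    data = np.load(f"data/{name}.npy")            # shape: (num_drawings, 784)
    idx = rng.choice(len(data), SAMPLES_PER_CATEGORY, replace=False)
    images.append(data[idx].reshape(-1, 28, 28) / 255.0)   # scale pixels to [0, 1]
    labels.append(np.full(SAMPLES_PER_CATEGORY, label))

x = np.concatenate(images)
y = np.concatenate(labels)
print(x.shape, y.shape)                           # (10000, 28, 28) (10000,)
```

Subsampling each category equally keeps the classes balanced, which matters when we later judge how well the model separates deliberately similar categories.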
Here, we will use the images stored as .npy files. These files are hosted publicly at https://console.cloud.google.com/storage/browser/quickdraw_dataset/full/numpy_bitmap?pli=1. From...