In this section, we will work with the CUB dataset, an image dataset of 200 bird species that can be found at the following link: http://www.vision.caltech.edu/visipedia/CUB-200-2011.html. The CUB dataset contains 11,788 high-resolution images. We will also need the pretrained char-CNN-RNN text embeddings, which can be found at the following link: https://drive.google.com/open?id=0B3y_msrWZaXLT1BZdVdycDY5TEE. Follow the instructions given in the next few sections to download and extract the dataset.
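Once the dataset has been downloaded and extracted, it can be worth confirming that the image count matches the 11,788 figure quoted above. The following is a minimal sketch of such a check, not part of the original instructions: the function name count_images, the constant EXPECTED_IMAGES, and the path data/CUB_200_2011/images are assumptions made for illustration, so adjust the path to wherever the archive was actually extracted.

```python
from pathlib import Path

# Image count stated for the CUB dataset in the text above.
EXPECTED_IMAGES = 11788


def count_images(root, extensions=(".jpg", ".jpeg", ".png")):
    """Recursively count files under `root` whose suffix is an image extension."""
    root = Path(root)
    return sum(
        1
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in extensions
    )


if __name__ == "__main__":
    # Hypothetical extraction location; change this to match your setup.
    dataset_dir = Path("data/CUB_200_2011/images")
    if dataset_dir.is_dir():
        found = count_images(dataset_dir)
        status = "OK" if found == EXPECTED_IMAGES else "MISMATCH"
        print(f"{status}: found {found} images, expected {EXPECTED_IMAGES}")
    else:
        print(f"Directory not found: {dataset_dir}")
```

A mismatch usually points to an incomplete download or a partially extracted archive, so it is cheaper to catch it here than during training.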
Data preparation
Downloading the dataset
The CUB dataset can be downloaded manually from http://www.vision.caltech.edu/visipedia/CUB-200-2011.html. Alternatively, we can execute the following...