Generating data for LSTMs
Here we will define how to extract a batch of data for training the LSTM. Whenever we process a fresh batch of data, the first input should be the image feature vector and the label should be the SOS token. We will define a batch of data where, if the first_sample Boolean is True, the input is extracted from the image feature vectors, and if first_sample is False, the input is extracted from the word embeddings. After generating a batch of data, we also move the cursor forward by one, so that the next time we generate a batch we get the next item in the sequence. This way we can unroll a sequence of batches of data for the LSTM, where the first batch of the sequence contains the image feature vectors, followed by the word embeddings of the captions corresponding to that batch of images.
# Fill each of the batch indices
for b in range(self._batch_size):
    cap_id = cap_ids[b]  # Current caption id
    # Current image feature vector
    cap_image_vec = self._image_data[self._fname_caption_tuples...
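To make the overall flow concrete, the following is a minimal, self-contained sketch of a batch generator built around the logic just described. The class and attribute names used here (CaptionBatchGenerator, image_vectors, word_embeddings, caption_word_ids, generate_batch) are illustrative assumptions rather than the actual members of the full implementation, and the data is assumed to be stored as NumPy arrays with the SOS token at position 0 of every caption.

import numpy as np

class CaptionBatchGenerator:
    """Sketch of the batch-generation logic described above.

    generate_batch should first be called with first_sample=True for a
    fresh set of captions, and then repeatedly with first_sample=False
    to unroll the rest of each caption sequence.
    """

    def __init__(self, image_vectors, word_embeddings, caption_word_ids,
                 batch_size):
        # image_vectors: (num_captions, feature_size) - one image feature
        #   vector per caption (an image repeats if it has several captions).
        # word_embeddings: (vocab_size, embedding_size) - embedding matrix.
        # caption_word_ids: (num_captions, max_caption_length) - captions
        #   encoded as word IDs, with the SOS token at position 0.
        self._image_vectors = image_vectors
        self._word_embeddings = word_embeddings
        self._caption_word_ids = caption_word_ids
        self._batch_size = batch_size
        # One cursor per batch slot, pointing at the current label position.
        self._cursor = np.zeros(batch_size, dtype=np.int64)

    def generate_batch(self, cap_ids, first_sample):
        """Return one (inputs, label_ids) pair for the captions in cap_ids."""
        if first_sample:
            # A fresh sequence starts, so rewind every cursor to the SOS position.
            self._cursor[:] = 0

        inputs, label_ids = [], []
        # Fill each of the batch indices
        for b in range(self._batch_size):
            cap_id = cap_ids[b]  # Current caption id
            step = self._cursor[b]
            if first_sample:
                # First step: the image feature vector is the input and
                # the SOS token (word at position 0) is the label.
                inputs.append(self._image_vectors[cap_id])
            else:
                # Later steps: the embedding of the previous caption word
                # is the input and the word at `step` is the label.
                prev_word = self._caption_word_ids[cap_id, step - 1]
                inputs.append(self._word_embeddings[prev_word])
            label_ids.append(self._caption_word_ids[cap_id, step])
            # Move the cursor by one so the next call yields the next
            # item in the sequence.
            self._cursor[b] += 1
        return np.stack(inputs), np.asarray(label_ids)

In training, you would call generate_batch once with first_sample=True for a new set of caption IDs, and then repeatedly with first_sample=False, feeding each returned (inputs, labels) pair to one unrolled time step of the LSTM.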