First, let's read the dataset and transform it into the form we need. Import the os module and declare the directory that contains the dataset's annotation files, as shown in the following code:
import os
annotation_dir = 'Flickr8k_text'
Next, define a function to open a file and return the lines present in the file as a list:
def read_file(file_name):
    with open(os.path.join(annotation_dir, file_name), 'rb') as file_handle:
        file_lines = file_handle.read().splitlines()
    return file_lines
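One detail of the function above is worth seeing in action: because the file is opened in 'rb' (binary) mode, Python 3 returns each line as a bytes object rather than a str (under Python 2 the two are the same type). The following is a minimal sketch of that behavior, not part of the original pipeline. The temporary directory, the file name demo_images.txt, the helper read_file_demo, and the two image names are all hypothetical stand-ins so that the example runs without the dataset.

```python
import os
import tempfile

# Hypothetical stand-in for the annotation directory, so the sketch runs anywhere.
demo_dir = tempfile.mkdtemp()
demo_name = 'demo_images.txt'  # hypothetical file name, not part of Flickr8k

# Write two made-up lines in the same one-entry-per-line layout.
with open(os.path.join(demo_dir, demo_name), 'wb') as file_handle:
    file_handle.write(b'first_image.jpg\nsecond_image.jpg\n')

def read_file_demo(directory, file_name):
    # Same logic as read_file, with the directory passed in explicitly.
    with open(os.path.join(directory, file_name), 'rb') as file_handle:
        return file_handle.read().splitlines()

raw_lines = read_file_demo(demo_dir, demo_name)

# In Python 3 the entries are bytes; decode them when text is needed.
text_lines = [line.decode('utf-8') for line in raw_lines]

print(raw_lines)
print(text_lines)
```

If later steps need to split or compare the lines as text, decoding them once, as in text_lines above, is one way to avoid mixing bytes and str.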
Read the image paths of the training and testing datasets followed by the captions file:
train_image_paths = read_file('Flickr_8k.trainImages.txt')
test_image_paths = read_file('Flickr_8k.testImages.txt')
captions = read_file('Flickr8k.token.txt')
print(len(train_image_paths))
print(len(test_image_paths))
print...