Defining a tf.data.Dataset
Now let’s look at how we can create a tf.data.Dataset using the data. We will first write a few helper functions. Namely, we’ll define:

- parse_image() – to load and process an image from a filepath
- generate_tokenizer() – to generate a tokenizer trained on the data passed to the function
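Before getting to the details, here is a rough sketch of what the second helper could look like, built on Keras's Tokenizer. This is an assumption for illustration; the function name matches the text, but the exact implementation (vocabulary size, out-of-vocabulary handling) may differ.

```python
import tensorflow as tf

def generate_tokenizer(texts, n_vocab=5000):
    """ Hypothetical sketch: fit a Keras Tokenizer on the given text data """
    tokenizer = tf.keras.preprocessing.text.Tokenizer(
        num_words=n_vocab,      # cap the vocabulary size (an assumed default)
        oov_token="<unk>"       # map unseen words to a special token
    )
    # Build the word index from the data passed to the function
    tokenizer.fit_on_texts(texts)
    return tokenizer

tok = generate_tokenizer(["a dog on grass", "a cat on a mat"])
print(tok.texts_to_sequences(["a dog"]))
```

Once fitted, the tokenizer can convert raw strings to integer sequences via texts_to_sequences(), which is what a text pipeline ultimately needs.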
First let’s discuss the parse_image() function. It takes three arguments:

- filepath – Location of the image
- resize_height – Height to resize the image to
- resize_width – Width to resize the image to
The function is defined as follows:
def parse_image(filepath, resize_height, resize_width):
    """ Reading an image from a given filepath """

    # Read the raw bytes of the image file
    image = tf.io.read_file(filepath)
    # Decode the JPEG, making sure there are 3 channels in the output
    image = tf.io.decode_jpeg(image, channels=3)
    # Convert pixel values from [0, 255] integers to [0, 1] floats
    image = tf.image.convert_image_dtype(image, tf.float32)
    # Resize the image to the desired height and width
    image = tf.image.resize(image, [resize_height, resize_width])

    return image
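To see how such a helper plugs into a tf.data.Dataset, here is a minimal, self-contained sketch. It repeats the parse_image() definition, writes a small dummy JPEG to a temporary directory (so no real dataset is assumed), and maps the helper over a dataset of filepaths:

```python
import os
import tempfile
import tensorflow as tf

def parse_image(filepath, resize_height, resize_width):
    """ Reading an image from a given filepath """
    image = tf.io.read_file(filepath)
    image = tf.io.decode_jpeg(image, channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)
    image = tf.image.resize(image, [resize_height, resize_width])
    return image

# Create a small dummy JPEG so the sketch runs without a real dataset
dummy = tf.cast(tf.random.uniform([8, 8, 3], maxval=255), tf.uint8)
path = os.path.join(tempfile.mkdtemp(), "img.jpg")
tf.io.write_file(path, tf.io.encode_jpeg(dummy))

# Build a dataset of filepaths and map parse_image over it
ds = tf.data.Dataset.from_tensor_slices([path])
ds = ds.map(lambda fp: parse_image(fp, 64, 64))

for img in ds:
    print(img.shape, img.dtype)
```

Because parse_image() is built entirely from tf.* ops, it traces cleanly inside Dataset.map(), so the loading, decoding, and resizing all run in the input pipeline rather than in Python.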