Finally, TensorFlow offers several functions to cache generated samples or to save tf.data pipeline states.
Samples can be cached by calling the dataset's .cache(filename) method. Once cached, the data does not have to undergo the same transformations when it is iterated over again (that is, during the following epochs). Note that what exactly gets cached depends on where in the pipeline the method is applied. Take the following example:
dataset = tf.data.TextLineDataset('/path/to/file.txt')
dataset_v1 = dataset.cache('cached_textlines.temp').map(parse_fn)  # caches the raw lines
dataset_v2 = dataset.map(parse_fn).cache('cached_images.temp')     # caches the parsed samples
The first dataset will cache the samples returned by TextLineDataset, that is, the raw text lines (the cached data is stored in the specified file, cached_textlines.temp). The transformation done by parse_fn (for instance, opening and decoding the corresponding image file for each text line) will therefore have to be repeated at every epoch. The second dataset, on the other hand, caches the samples after parse_fn has been applied, so the parsing is performed only once, during the first epoch, at the cost of storing the larger, already-decoded samples.
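The effect of cache placement can be illustrated without TensorFlow. The sketch below is a toy, hypothetical Dataset class (not the tf.data API) that mimics lazy map/cache chaining and counts how often the parse function actually runs across several epochs:

```python
# Conceptual sketch of cache() placement. The classes below are hypothetical
# stand-ins for tf.data; only the chaining behavior is modeled.

class Dataset:
    def __init__(self, items):
        self._items = list(items)

    def map(self, fn):
        # Lazy transformation: fn runs every time the dataset is iterated
        return MappedDataset(self, fn)

    def cache(self):
        # Materializes upstream results on the first full iteration
        return CachedDataset(self)

    def __iter__(self):
        return iter(self._items)

class MappedDataset(Dataset):
    def __init__(self, source, fn):
        self._source, self._fn = source, fn
    def __iter__(self):
        return (self._fn(x) for x in self._source)

class CachedDataset(Dataset):
    def __init__(self, source):
        self._source, self._cache = source, None
    def __iter__(self):
        if self._cache is None:
            self._cache = list(self._source)  # first epoch fills the cache
        return iter(self._cache)

calls = {"v1": 0, "v2": 0}

def parse_v1(x):
    calls["v1"] += 1  # stands in for an expensive parse_fn
    return x * 2

def parse_v2(x):
    calls["v2"] += 1
    return x * 2

lines = Dataset([1, 2, 3])
dataset_v1 = lines.cache().map(parse_v1)   # caches raw lines; parsing repeats
dataset_v2 = lines.map(parse_v2).cache()   # caches parsed samples

for _ in range(3):  # three "epochs"
    list(dataset_v1)
    list(dataset_v2)

print(calls["v1"])  # 9: parse ran for every sample, every epoch
print(calls["v2"])  # 3: parse ran only during the first epoch
```

As with tf.data, placing .cache() after the expensive transformation saves compute on later epochs, while placing it before only saves re-reading the source.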