In the previous chapter, we introduced some state-of-the-art models for semantic segmentation and applied them to urban scenes in order to guide self-driving cars. In the related Jupyter notebooks, we provided an _augmentation_fn(img, gt_img) function, passed to dataset.map(), to augment the pictures and their ground truth label maps. Though we did not provide detailed explanations back then, this augmentation function illustrates well how tf.image can be used to augment complex data.
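To make this concrete, here is a minimal sketch of such a function mapped over a tf.data pipeline. The function name, the chosen augmentations, and the dataset variable are illustrative, not the exact code from the notebooks:

```python
import tensorflow as tf

def _augmentation_fn(img, gt_img):
    # Photometric changes only affect the input image, not its label map:
    img = tf.image.random_brightness(img, max_delta=0.1)
    img = tf.image.random_saturation(img, lower=0.8, upper=1.2)
    return img, gt_img

# Assuming `dataset` yields (image, label_map) pairs:
# dataset = dataset.map(_augmentation_fn, num_parallel_calls=tf.data.AUTOTUNE)
```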
For example, it offers a simple solution to the problem of transforming both the input images and their dense labels consistently. Imagine we want some of the samples to be randomly flipped horizontally. If we call tf.image.random_flip_left_right() once for the input image and once more for the ground truth label map, each call draws its own random decision, so there is only a one-in-two chance that both tensors will undergo the same transformation (that is, that both are flipped or both are left untouched).
One solution to ensure that the same set of geometrical transformations is applied to both the image and its label map is to perform the random operation only once, sharing its outcome between the two tensors.
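One way to do this, sketched below under the assumption that the image and its label map share the same height and width and have static channel counts, is to concatenate the two tensors along the channel axis, apply the random operation once, and split the result again (the function name and the dtype handling are illustrative):

```python
import tensorflow as tf

def _augment_pair(img, gt_img):
    # Concatenate the image and its label map so that a single random call
    # flips both of them, or neither of them.
    num_img_channels = img.shape[-1]      # assumes a static channel count
    gt_img = tf.cast(gt_img, img.dtype)   # match dtypes before concatenation
    stacked = tf.concat([img, gt_img], axis=-1)
    stacked = tf.image.random_flip_left_right(stacked)
    # Split back into the augmented image and its (still aligned) labels:
    img_aug = stacked[..., :num_img_channels]
    gt_aug = tf.cast(stacked[..., num_img_channels:], tf.int32)
    return img_aug, gt_aug
```

On recent TensorFlow versions, the stateless variants (for example, tf.image.stateless_random_flip_left_right() called with the same seed for both tensors) achieve the same effect without the concatenation.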