Introducing semantic segmentation
In the previous chapters, we implemented several classifiers: we provided an image as input, and the network told us what it was. This works well in many situations, but to be really useful, a classifier usually needs to be combined with a method that can identify the region of interest. We did this in Chapter 7, Detecting Pedestrians and Traffic Lights, where we used SSD to find a region of interest containing a traffic light, and our neural network was then able to tell its color. But even this would not be very useful to us here, because the regions of interest produced by SSD are rectangles. A network telling us that there is a road inside a box basically as big as the image would not provide much information: is the road straight? Is there a turn? We cannot know. We need more precision.
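To make this limitation concrete, here is a minimal sketch in plain Python. It does not use real SSD output: the 6x8 `road_mask` grid and the helper names `bounding_box`, `box_area`, and `row_centers` are made up for illustration. It compares what a rectangle around the road tells us with what a per-pixel labeling tells us.

```python
# Toy 6x8 "image": 1 marks a road pixel, 0 marks everything else.
# The layout is hypothetical: a road that is wide at the bottom of the
# image and bends to the right towards the top.
road_mask = [
    [0, 0, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
]


def bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) of the smallest rectangle
    that contains every road pixel."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return min(cols), min(rows), max(cols), max(rows)


def box_area(box):
    """Area of an inclusive pixel rectangle."""
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min + 1) * (y_max - y_min + 1)


def row_centers(mask):
    """Average column of the road pixels in each row (top to bottom)."""
    centers = []
    for row in mask:
        cols = [c for c, value in enumerate(row) if value]
        centers.append(sum(cols) / len(cols))
    return centers


box = bounding_box(road_mask)
image_area = len(road_mask) * len(road_mask[0])
road_pixels = sum(sum(row) for row in road_mask)
centers = row_centers(road_mask)

# The rectangle covers the whole image, so it says nothing about the shape.
print("box:", box, "covers", box_area(box), "of", image_area, "pixels")
# The per-pixel labels show how much of the image is road...
print("road pixels:", road_pixels)
# ...and where the road is in each row, which reveals the bend.
print("road center per row (top to bottom):", centers)
```

In this toy case, the rectangle is exactly as big as the image, while the per-pixel labels show that the center of the road moves from column 3.5 at the bottom to column 6.0 at the top: the road turns right. That is the kind of information a rectangular region of interest cannot give us.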
If object detectors such as SSD brought classification to the next level, now we need to reach the level after that, and maybe more. In fact, we want to classify every...