Summary
In this chapter, we introduced the concept of semantic segmentation, an approach that gives our applications increased perceptual understanding of our photos and videos. It works by training a model to assign each pixel to a specific class. One popular architecture for this is U-Net, which achieves high-precision localization by preserving spatial information, by bridging the convolutional layers. We then reviewed the data used for training along with some example outputs of the model, including examples that highlight the limitations of the model.
We then saw how this model could be used by creating an image effects application, where the segmented images were used to clip people from a series of frames and composite them together to create an action shot. But this is just one example of how semantic segmentation can be applied; it's frequently used in domains such as robotics, security surveillance, and quality assurance in factories, to name a few. How else it can be applied is...