In the previous recipe, we focused on localizing an object by predicting a bounding box. However, in some cases, you'll want to know the exact location of an object and a box around the object is not sufficient. We also call this segmentation—putting a mask on an object. To predict the masks of objects, we will use the popular U-net model structure. The U-net model has proven to be state-of-the-art by winning multiple image segmentation competitions. A U-net model is a special type of encoder-decoder network with skip connections, convolutional blocks, and upscaling convolutions.
In the following recipe, we will show you how to segment objects in images. Specifically, we will be segmenting the background. To implement the U-net network architecture, we will use the Keras framework.Â