
Google open sources DeepLab-v3+: A model for Semantic Image Segmentation using TensorFlow

  • 2 min read
  • 13 Mar 2018

DeepLab-v3+, Google’s latest and best-performing semantic image segmentation model, is now open source!

DeepLab is a state-of-the-art deep learning model for semantic image segmentation, with the goal of assigning semantic labels (e.g., person, dog, cat, and so on) to every pixel in the input image. Assigning these semantic labels sets much stricter localization accuracy requirements than other visual entity recognition tasks such as image-level classification or bounding box-level detection. Examples of semantic image segmentation in practice include the synthetic shallow depth-of-field effect shipped in the portrait mode of the Pixel 2 and Pixel 2 XL smartphones, as well as mobile real-time video segmentation.
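To make that distinction concrete, here is a minimal sketch (plain NumPy, with made-up image sizes and class ids) contrasting an image-level classification output with a per-pixel segmentation label map:

```python
import numpy as np

# Hypothetical input image: height x width x RGB channels.
image = np.zeros((512, 512, 3), dtype=np.uint8)

# Image-level classification produces one label for the whole image...
classification_output = 7  # e.g. a made-up class id for "dog"

# ...whereas semantic segmentation assigns a class id to every pixel,
# so the output is a label map with the same spatial size as the input.
segmentation_output = np.zeros(image.shape[:2], dtype=np.int32)   # shape (512, 512)
segmentation_output[100:300, 150:350] = 7  # pixels belonging to the "dog" region

print(classification_output)      # 7
print(segmentation_output.shape)  # (512, 512)
```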

DeepLab-v3+ is implemented in TensorFlow, and the release includes models built on top of a powerful convolutional neural network (CNN) backbone architecture for the most accurate results, intended for server-side deployment.

Source: Google Research blog
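As a rough illustration of what using one of the released checkpoints looks like, the sketch below runs a frozen DeepLab inference graph in TensorFlow 1.x style (matching the 2018 release). The file path and the tensor names (`ImageTensor:0`, `SemanticPredictions:0`) are assumptions based on the demo that ships with the repository, not an official API:

```python
import numpy as np
import tensorflow as tf  # TensorFlow 1.x style
from PIL import Image

# Hypothetical path to a frozen graph exported from the DeepLab repository.
FROZEN_GRAPH_PATH = 'deeplabv3_pascal_trainval/frozen_inference_graph.pb'

graph = tf.Graph()
with graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(FROZEN_GRAPH_PATH, 'rb') as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

with tf.Session(graph=graph) as sess:
    image = np.asarray(Image.open('example.jpg'))  # H x W x 3 uint8 input
    # Feed a batched image and fetch the per-pixel label map.
    seg_map = sess.run(
        'SemanticPredictions:0',
        feed_dict={'ImageTensor:0': np.expand_dims(image, axis=0)})
    # (1, height, width) of class ids; the demo graph may resize internally.
    print(seg_map.shape)
```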

Let’s have a look at some of the highlights of DeepLab-v3+:

  • Google has extended DeepLab-v3 with a simple yet effective decoder module that refines the segmentation results, especially along object boundaries.
  • In this encoder-decoder structure, the resolution of the extracted encoder features can be arbitrarily controlled via atrous convolution to trade off precision against runtime (see the sketch after this list).
  • Google has also shared its TensorFlow model training and evaluation code, along with models already pre-trained on the Pascal VOC 2012 and Cityscapes benchmark semantic segmentation tasks.
  • This version also adopts two network backbones, MobileNetV2 and Xception. MobileNetV2 is a fast network structure designed for mobile devices, while Xception is a powerful network structure intended for server-side deployment.
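The precision/runtime trade-off mentioned above comes from atrous (dilated) convolution: increasing the dilation rate enlarges the receptive field without downsampling the feature map. The following is a minimal, self-contained sketch using standard TensorFlow ops, not code from the DeepLab repository:

```python
import tensorflow as tf

# A dummy encoder feature map: batch x height x width x channels.
features = tf.random.normal([1, 64, 64, 256])

# An ordinary 3x3 convolution and an atrous 3x3 convolution with rate 2.
# Both keep the 64x64 spatial resolution, but the atrous version samples
# its 3x3 taps two pixels apart, covering a 5x5 area per output pixel.
conv = tf.keras.layers.Conv2D(256, kernel_size=3, padding='same')
atrous_conv = tf.keras.layers.Conv2D(256, kernel_size=3, padding='same',
                                     dilation_rate=2)

print(conv(features).shape)         # (1, 64, 64, 256)
print(atrous_conv(features).shape)  # (1, 64, 64, 256) -- same resolution,
                                    # larger receptive field
```

Stacking such layers with growing dilation rates is how the encoder can see more context without shrinking its output, which is the lever DeepLab-v3+ uses to balance accuracy against speed.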

You can read more about this announcement on the Google Research blog.