As discussed in the previous section, once precise masks are obtained, non-overlapping instances can be identified from them by applying suitable post-processing algorithms. This post-processing is usually done with morphological operations, such as mask erosion and dilation.
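To make the idea concrete, here is a minimal sketch of such a post-processing step, assuming NumPy and SciPy are available. The function name `split_instances` and the toy mask are hypothetical and used only for illustration. The mask is eroded so that thin bridges between touching objects disappear, and the remaining cores are labeled as separate instances. A nearest-seed assignment then stands in for the dilation step that would grow each core back to its original extent.

```python
import numpy as np
from scipy import ndimage


def split_instances(class_mask, iterations=1):
    """Split a binary class mask into instance labels (0 = background).

    Hypothetical helper, for illustration only.
    """
    class_mask = np.asarray(class_mask, dtype=bool)
    # Erosion removes thin bridges that connect touching objects.
    cores = ndimage.binary_erosion(class_mask, iterations=iterations)
    # Each connected core becomes one instance seed (4-connectivity by default).
    markers, num_instances = ndimage.label(cores)
    if num_instances == 0:
        # Erosion removed everything: fall back to plain labeling.
        return ndimage.label(class_mask)
    # Grow the seeds back: every pixel takes the label of its nearest seed
    # (this replaces an explicit dilation), then only mask pixels are kept.
    nearest = ndimage.distance_transform_edt(
        markers == 0, return_distances=False, return_indices=True)
    instances = markers[nearest[0], nearest[1]] * class_mask
    return instances, num_instances


# Toy class mask: two 5x5 blobs joined by a one-pixel bridge.
mask = np.zeros((7, 13), dtype=bool)
mask[1:6, 1:6] = True
mask[1:6, 7:12] = True
mask[3, 6] = True

_, num_before = ndimage.label(mask)           # blobs counted without erosion
instances, num_after = split_instances(mask)  # blobs counted after erosion
print(num_before, num_after)
```

On this toy mask, labeling the raw mask merges both blobs into a single component, whereas the eroded version yields two. The number of erosion iterations is a tuning choice: too few leave thick bridges intact, while too many can erase small objects entirely.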
Watershed transforms are another common family of algorithms that further segment the class masks into instances. These algorithms take a one-channel tensor and treat it as a topographic surface, where each value represents an elevation. Using various methods that we will not detail here, they then extract the tops of the ridges, which represent the instance boundaries. Several implementations of these transforms are available, some of which are CNN-based, such as Deep Watershed Transform for Instance Segmentation (Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017), by Min Bai and Raquel Urtasun from the University of Toronto. Inspired by the FCN architecture, their network takes for...