Mesh R-CNN architecture
3D shape detection has attracted considerable research interest. Many models achieve good accuracy, but most of them have focused on synthetic benchmarks and isolated objects:
Figure 10.3: 3D object examples of the ShapeNet dataset
At the same time, 2D object detection and image segmentation have advanced rapidly. Many models and architectures now solve these problems with high accuracy and speed, localizing objects and predicting their bounding boxes and masks. One of them is Mask R-CNN, a model for object detection and instance segmentation. It was state of the art when it was introduced and has many real-life applications.
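To make the two kinds of 2D output concrete, here is a minimal sketch, assuming NumPy and a toy binary mask. The `mask_to_box` helper, the `detection` dictionary, and the label and score values are hypothetical and only illustrate how a per-object mask and a bounding box relate. This is not how Mask R-CNN computes its boxes: the real model predicts boxes first and then a mask inside each box.

```python
import numpy as np

def mask_to_box(mask):
    """Return the tight (x_min, y_min, x_max, y_max) box around a binary mask."""
    ys, xs = np.nonzero(mask)  # row and column indices of object pixels
    if ys.size == 0:
        return None  # empty mask: no object pixels
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Toy 6x8 "image" with one object covering rows 2-4 and columns 3-6.
mask = np.zeros((6, 8), dtype=bool)
mask[2:5, 3:7] = True

# Hypothetical per-instance record: class, confidence, mask, and box.
detection = {
    "label": "chair",   # illustrative class name
    "score": 0.97,      # illustrative confidence
    "mask": mask,
    "box": mask_to_box(mask),
}
print(detection["box"])  # (3, 2, 6, 4)
```

Both outputs live in the image plane: the box is four numbers and the mask is one value per pixel, so neither says anything about the object's shape in depth.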
However, we see the world in 3D. The authors of the Mesh R-CNN paper decided to combine these two approaches into a single solution: a model that detects objects in a real-world image and outputs a 3D mesh for each detected object rather than only a 2D mask. The new model takes...