Mesh R-CNN
This chapter is dedicated to a state-of-the-art model called Mesh R-CNN, which aims to combine two different but important tasks into one end-to-end model. It is a combination of the well-known image segmentation model Mask R-CNN and a new 3D structure prediction model. These two tasks were researched a lot separately.
Mask R-CNN is an object detection and instance segmentation algorithm that got the highest precision scores in benchmark datasets. It belongs to the R-CNN family and is a two-stage end-to-end object detection model.
Mesh R-CNN goes beyond the 2D object detection problem and outputs a 3D mesh of detected objects as well. If we think of the world, people see in 3D, which means the objects are 3D. So, why not have a detection model that outputs objects in 3D as well?
In this chapter, we are going to understand how Mesh R-CNN works. Moreover, we will dive deeper into understanding different elements and techniques used in models such as voxels, meshes...