So far, we have learned how to predict a bounding rectangle on 2D images using algorithms that have the core underlying concept of anchor boxes. We will now learn how the same concept can be extended to predict 3D bounding boxes around objects.
In a self-driving car, tasks such as pedestrian/obstacle detection and route planning cannot happen without knowing the environment. Predicting 3D object locations along with their orientations becomes an important task. Not only is the 2D bounding box around obstacles important, but also knowing the distance from the object, height, width, and orientation of the obstacle are critical to navigating safely in the 3D world.
In this section, we will learn how YOLO is used to predict the 3D orientation and position of cars and pedestrians on a real-world dataset.