Working details of YOLO
You Only Look Once (YOLO) and its variants are among the most prominent object detection algorithms. In this section, we will understand at a high level how YOLO works and which potential limitations of R-CNN-based object detection frameworks it overcomes.
First, let’s understand the possible limitations of R-CNN-based detection algorithms. In Faster R-CNN, we slide over the image using anchor boxes, identify the regions that are likely to contain an object, and then make bounding box corrections. However, the fully connected layer receives only the RoI pooling output of the detected region as input. When a region proposal does not fully encompass the object (that is, the object extends beyond the boundaries of the proposal's bounding box), the network has to guess the real boundaries of the object, because it has seen only the region proposal and not the full image. YOLO comes in handy in such scenarios, as it looks at the whole...
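To make the region proposal limitation concrete, the following minimal sketch measures how much of an object is actually visible inside a proposal. The helper names (`box_area`, `coverage`) and the box coordinates are hypothetical and chosen only for illustration; they are not part of Faster R-CNN or of any library. Boxes are assumed to be in `(x1, y1, x2, y2)` pixel format.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box; degenerate boxes have area 0."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def coverage(proposal, ground_truth):
    """Fraction of the ground-truth box area that lies inside the proposal."""
    # Corners of the intersection rectangle (may be degenerate if no overlap)
    ix1 = max(proposal[0], ground_truth[0])
    iy1 = max(proposal[1], ground_truth[1])
    ix2 = min(proposal[2], ground_truth[2])
    iy2 = min(proposal[3], ground_truth[3])
    inter = box_area((ix1, iy1, ix2, iy2))
    gt_area = box_area(ground_truth)
    return inter / gt_area if gt_area > 0 else 0.0


# Illustrative (made-up) boxes: the object extends beyond the proposal on all sides
ground_truth = (50, 40, 250, 200)
proposal = (100, 60, 220, 180)

visible = coverage(proposal, ground_truth)
print(f"Fraction of the object inside the proposal: {visible:.2f}")
```

With these example boxes, less than half of the object falls inside the proposal, so a head that sees only the pooled features of that region would have to infer the rest of the object's extent from partial evidence.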