One drawback of region proposal-based CNNs is that they do not support real-time object recognition, because selective search takes considerable time to propose regions. As a result, region proposal-based object detection algorithms are of limited use in applications such as self-driving cars, where real-time detection is essential.
To achieve real-time detection, we will build from scratch a model inspired by the You Only Look Once (YOLO) algorithm. The model takes images that contain a person and draws a bounding box around the person in each image.
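To make the single-pass idea concrete, here is a minimal sketch of how a YOLO-style target can encode one bounding box so that a network can predict it directly, without a separate region proposal step. The function name `encode_box`, the 7 x 7 grid, and the five-value layout per cell (objectness, centre offsets, width, height) are assumptions chosen for illustration, not necessarily the exact encoding used by the model built here.

```python
import numpy as np

def encode_box(box, img_w, img_h, grid=7):
    """Encode one pixel-space box (x_min, y_min, x_max, y_max) as a
    grid x grid x 5 target: [objectness, x_offset, y_offset, width, height].
    The layout is an illustrative assumption."""
    x_min, y_min, x_max, y_max = box
    # Box centre and size expressed as fractions of the image dimensions.
    cx = (x_min + x_max) / 2.0 / img_w
    cy = (y_min + y_max) / 2.0 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    # Grid cell that contains the centre (clamped for centres on the far edge).
    col = min(int(cx * grid), grid - 1)
    row = min(int(cy * grid), grid - 1)
    target = np.zeros((grid, grid, 5), dtype=np.float32)
    # Only the cell holding the centre is marked as responsible for the box;
    # the offsets are measured from that cell's top-left corner.
    target[row, col] = [1.0, cx * grid - col, cy * grid - row, w, h]
    return target, (row, col)

# Hypothetical example: a person box in a 400 x 400 image.
target, cell = encode_box((100, 50, 300, 350), img_w=400, img_h=400)
```

Because every image maps to one fixed-size target, a single forward pass can produce both the objectness score and the box coordinates, which is what makes this family of approaches fast enough for real-time use.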