To begin with object detection, we will first look at an overview of image recognition, since detection is one part of it. The following figure gives an overview of object recognition using an image from the Pascal VOC dataset. The input is passed through a model, which then produces information in four different styles.
The model in the preceding figure performs generic image recognition, from which we can predict the following information:
- A class name for the object in the image
- The pixel location of the object's center
- A bounding box surrounding the object
- An instance image in which each pixel is classified into a class. The classes cover the objects as well as the background
When we say object detection, we are usually referring to the first and third types of image recognition. Our goal is to estimate class names as well as the bounding boxes surrounding target objects...
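To make the detection output concrete, here is a minimal sketch of how a single result (a class name plus a bounding box) might be represented. The `Detection` container, the `box_from_mask` helper, the `(x_min, y_min, x_max, y_max)` box convention, and the toy label values are all assumptions made for illustration; they do not come from the Pascal VOC tooling or from any particular detection library. The sketch also shows how a center location and a bounding box relate to a per-pixel class image.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed box convention: (x_min, y_min, x_max, y_max), inclusive pixel coordinates.
Box = Tuple[int, int, int, int]


@dataclass
class Detection:
    """Hypothetical container for one detected object."""
    class_name: str
    box: Box

    def center(self) -> Tuple[float, float]:
        # The center pixel location is the midpoint of the box.
        x_min, y_min, x_max, y_max = self.box
        return ((x_min + x_max) / 2, (y_min + y_max) / 2)


def box_from_mask(mask: List[List[int]], class_id: int) -> Box:
    """Return the tightest box around all pixels labelled class_id."""
    coords = [
        (x, y)
        for y, row in enumerate(mask)
        for x, label in enumerate(row)
        if label == class_id
    ]
    if not coords:
        raise ValueError(f"class {class_id} does not appear in the mask")
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


# Toy per-pixel class image: 0 = background, 1 = "dog" (labels are made up).
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

det = Detection(class_name="dog", box=box_from_mask(mask, class_id=1))
print(det.class_name, det.box, det.center())
```

A real detector would typically return a list of such results per image, often with a confidence score attached to each one; the single-object case is used here only to keep the sketch short.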