1. Object detection
In object detection, the objective is to localize and identify an object in an image. Figure 11.1.1 shows object detection where the target is a Soda can. Localization means that the bounding box of the object must be estimated. A common convention describes a bounding box by the pixel coordinates of its upper left and lower right corners. In Figure 11.1.1, the upper left corner pixel has coordinates (xmin, ymin), while the lower right corner pixel has coordinates (xmax, ymax). The pixel coordinate system has its origin (0, 0) at the upper left corner pixel of the entire image.
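The corner convention described above can be illustrated with a minimal Python sketch. The `BoundingBox` class, its `contains` helper, and the sample coordinates are hypothetical and chosen only for illustration; they are not taken from the figure. The sketch also assumes that width and height are measured as plain coordinate differences, which is one of several conventions in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical box in pixel coordinates, origin (0, 0) at the image's upper left."""
    xmin: int  # x of the upper left corner
    ymin: int  # y of the upper left corner
    xmax: int  # x of the lower right corner
    ymax: int  # y of the lower right corner

    def __post_init__(self):
        # x grows to the right and y grows downward, so the upper left
        # corner must not lie to the right of or below the lower right corner.
        if self.xmin > self.xmax or self.ymin > self.ymax:
            raise ValueError("expected xmin <= xmax and ymin <= ymax")

    @property
    def width(self) -> int:
        # Assumption: width is the coordinate difference (no +1 for inclusive pixels).
        return self.xmax - self.xmin

    @property
    def height(self) -> int:
        return self.ymax - self.ymin

    def contains(self, x: int, y: int) -> bool:
        """Return True if pixel (x, y) lies inside the box or on its border."""
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax


# Made-up coordinates for illustration only.
box = BoundingBox(xmin=120, ymin=80, xmax=260, ymax=300)
print(box.width, box.height)
print(box.contains(0, 0))
```

Storing the two corners rather than a center point with width and height keeps the representation identical to the (xmin, ymin), (xmax, ymax) convention in the text; converting between the two forms is a matter of simple arithmetic.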
While performing localization, detection must also identify the object. Identification is the classic recognition or classification task in computer vision. At a minimum, object detection must determine whether a bounding box belongs to a known object or to the background. An object detection network can be trained to detect one specific object only, like the Soda can...