- What is the difference between a bounding box, an anchor box, and a ground truth box?
A bounding box is the smallest rectangle that encloses an object. An anchor box is a predefined bounding box with a fixed size and aspect ratio. Each position in the image grid usually has several anchor boxes with different aspect ratios: square, vertical rectangle, and horizontal rectangle. The object detection model generates its predictions by refining the size and position of these anchor boxes. A ground truth box is a bounding box that marks a specific object in the training set. A well-trained model generates predictions that are very close to the ground truth boxes.
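To make the relationship between the three kinds of boxes concrete, here is a minimal sketch in plain Python. It builds three anchor boxes at one grid position and compares each with a ground truth box using intersection over union (IoU), a common overlap measure. The helper names `make_anchors` and `iou`, the `(x_min, y_min, x_max, y_max)` box layout, and all coordinates are assumptions chosen for illustration, not part of any particular detection library.

```python
import math


def make_anchors(cx, cy, area, aspect_ratios=(1.0, 0.5, 2.0)):
    """Build anchors centered at (cx, cy), one per aspect ratio.

    Aspect ratio is width / height, and every anchor has the same area.
    Boxes are (x_min, y_min, x_max, y_max). Hypothetical helper.
    """
    anchors = []
    for ratio in aspect_ratios:
        w = math.sqrt(area * ratio)
        h = math.sqrt(area / ratio)
        anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Square, vertical, and horizontal anchors at one grid position.
anchors = make_anchors(cx=50, cy=50, area=1600)

# A made-up ground truth box around a tall, narrow object.
ground_truth = (40, 20, 60, 80)

# The anchor that overlaps the ground truth most is the natural one to refine.
best_anchor = max(anchors, key=lambda a: iou(a, ground_truth))
```

In this example the vertical anchor overlaps the tall ground truth box the most, so the model would only need a small adjustment to its size and position to match the object.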
- What is the role of the feature extractor?
A feature extractor is a CNN that converts an image into a feature volume. The feature volume usually has a smaller spatial size (height and width) than the input image and contains meaningful features that are passed to the remainder of the network to generate predictions.
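The shrinking spatial size can be traced with the standard convolution output-size formula, `(size + 2 * padding - kernel) // stride + 1`. The sketch below applies it to a hypothetical stack of five stride-2 convolutions; the layer list, the channel counts, and the 224×224 input are assumptions for illustration and do not describe any specific architecture.

```python
def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


# Hypothetical extractor: (kernel, stride, padding, out_channels) per layer.
layers = [
    (3, 2, 1, 64),
    (3, 2, 1, 128),
    (3, 2, 1, 256),
    (3, 2, 1, 512),
    (3, 2, 1, 1024),
]

# Assumed RGB input image of 224 x 224 pixels.
height, width, channels = 224, 224, 3

shapes = [(height, width, channels)]
for kernel, stride, padding, out_channels in layers:
    height = conv_output_size(height, kernel, stride, padding)
    width = conv_output_size(width, kernel, stride, padding)
    channels = out_channels
    shapes.append((height, width, channels))

# Shape of the feature volume handed to the rest of the network.
feature_volume_shape = shapes[-1]
```

Each stride-2 layer halves the height and width here, so the assumed 224×224 image ends up as a 7×7 grid of feature vectors, and each grid position can then serve as a location for anchor boxes.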
- Which of...