In this chapter, we learned how three modern object detection algorithms work: Faster R-CNN, YOLO, and SSD. We saw how they overcome the limitation of having two separate models, one for fetching region proposals and the other for predicting the class and bounding box offsets of those proposals. We also implemented Faster R-CNN using PyTorch, YOLO using darknet, and SSD from scratch.
In the next chapter, we will learn about image segmentation, which goes one step beyond object localization by identifying the pixels that correspond to an object.
Later, in Chapter 10, Applications of Object Detection and Segmentation, we will learn about the Detectron2 framework, which helps in not only detecting objects but also segmenting them in a single shot. In Chapter 15, Combining Computer Vision and NLP Techniques, we will learn about DETR, a transformer-based object detection algorithm.