Deep Convolutional Neural Networks (DCNN) have been used in computer vision—for example, image classification, image feature extraction, object detection, and semantic segmentation. Despite such successes of state-of-the-art approaches for object detection from still images, detecting objects in a video is not an easy job.
Considering this drawback, in this chapter, we will develop an end-to-end project that will detect objects from video frames when a video clip plays continuously. We will be utilizing a trained YOLO model for transfer learning and JavaCV techniques on top of Deeplearning4j (DL4J) to do this. In short, the following topics will be covered throughout this end-to-end project:
- Object detection
- Challenges in object detection from videos
- Using YOLO with DL4J
- Frequently asked questions (FAQs)