Extending image-based approaches to videos
Images are used for pose estimation, style transfer, image generation, segmentation, captioning, and so on, and the same applications carry over to videos. Using the temporal information in a video may improve the predictions made from its individual frames, and vice versa. In this section, we will see how to extend these applications to videos.
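As a rough illustration of how temporal information might refine per-frame predictions, the following sketch applies a median filter over neighboring frames to keypoint coordinates. It is not a method from any of the works cited in this chapter. The function name `smooth_keypoints`, the `window` parameter, and the sample coordinates are all hypothetical, and the per-frame predictions are assumed to come from some image-based pose model that is not shown here.

```python
from statistics import median


def smooth_keypoints(frames, window=3):
    """Median-filter each joint's (x, y) over a centered window of frames.

    frames: list of frames, each a list of (x, y) tuples, one per joint.
    window: odd number of frames to consider around each time step.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for t in range(len(frames)):
        # Clip the window at the start and end of the video.
        neighbors = frames[max(0, t - half):min(len(frames), t + half + 1)]
        joints = []
        for j in range(len(frames[t])):
            x = median(f[j][0] for f in neighbors)
            y = median(f[j][1] for f in neighbors)
            joints.append((x, y))
        smoothed.append(joints)
    return smoothed


# Hypothetical per-frame predictions for a single joint.
# The third frame (x = 30.0) is an implausible jump.
per_frame = [
    [(10.0, 20.0)],
    [(11.0, 20.0)],
    [(30.0, 20.0)],
    [(13.0, 20.0)],
    [(14.0, 20.0)],
]
smoothed = smooth_keypoints(per_frame, window=3)
```

Here the jump in the third frame is replaced by a value consistent with its neighbors, which is the kind of correction that a single-image model cannot make on its own. A learned temporal model would normally replace this fixed filter; the filter is used only to keep the sketch short.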
Regressing the human pose
Human pose estimation is an important application of video data and can improve other tasks such as action recognition. First, let's look at the datasets available for pose estimation:
- Poses in the wild dataset: Contains 30 videos annotated with human upper-body joints. The dataset is available at: https://lear.inrialpes.fr/research/posesinthewild/.
- Frames Labeled In Cinema (FLIC): A human pose dataset obtained from 30 movies, available at: https://bensapp.github.io/flic-dataset.html.
Pfister et al. (https://www.cv-foundation.org/openaccess/content_iccv_2015...