Chapter 10: Applying the Power of Deep Learning to Videos
Computer vision is concerned with understanding visual data. That includes videos, which are, at their core, sequences of images. This means we can apply most of what we know about deep learning for image processing to videos and get great results.
In this chapter, we'll start by training a convolutional neural network to detect emotions in human faces, and then we'll learn how to apply it in a real-time context using our webcam.
Then, in the remaining recipes, we'll use advanced architectures hosted on TensorFlow Hub (TFHub) that are specially tailored to tackle interesting video-related problems, such as action recognition, frame generation, and text-to-video retrieval.
Here are the recipes that we will be covering shortly:
- Detecting emotions in real time
- Recognizing actions with TensorFlow Hub
- Generating the middle frames of a video with TensorFlow...