So far, we have seen how to develop deep-learning-based projects on numerals and images. However, applying similar techniques to video clips, for example to recognize human activity, is not straightforward.
In this chapter, we will see how to apply deep learning approaches to a video dataset. We will describe how to process a large collection of video clips and extract features from them. Then we will make the overall pipeline faster and more scalable by distributing the training across multiple devices (CPUs and GPUs) that run in parallel.
We will see a complete example of how to develop a deep learning application that accurately classifies a large collection of video clips, such as the UCF101 dataset, using a combined CNN and LSTM network with Deeplearning4j (DL4J). This overcomes...