Large-scale video processing with neural networks
This paper, https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42455.pdf, explores how CNNs can be used for large-scale video classification. In this use case, the networks have access not only to the appearance information in single, static images, but also to how that appearance evolves over time. Extending and applying CNNs in this setting poses several challenges.
Few, if any, video classification benchmarks match the scale and variety of existing image datasets, because videos are significantly more challenging to collect, annotate, and store. To obtain enough data to train their CNN architectures, the authors collected a new dataset, Sports-1M. It contains 1 million YouTube videos belonging to a taxonomy of 487 sports classes. Sports-1M is also available to the research community to support future work in this area.
In this work...