Understanding advanced CNN architectures
Research in computer vision has been moving forward through both incremental contributions and larger innovative leaps. Challenges organized by researchers and companies, which invite experts to submit new solutions to a predefined task, have played a key role in triggering such contributions. The ImageNet classification contest (ImageNet Large Scale Visual Recognition Challenge, or ILSVRC; see Chapter 1, Computer Vision and Neural Networks) is a perfect example. With its millions of images split into 1,000 fine-grained classes, it still represents a great challenge for daring researchers, even after the significant and symbolic victory of AlexNet in 2012.
In this section, we will present some of the classic deep learning methods that followed AlexNet in tackling ILSVRC, covering the reasons that led to their development and the contributions they made.
VGG, a standard CNN architecture
The first network architecture...