Very deep convolutional networks for large-scale image recognition
In 2014, K. Simonyan and A. Zisserman made an interesting contribution to image recognition with the paper Very Deep Convolutional Networks for Large-Scale Image Recognition [4]. The paper showed that a "significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers." One model in the paper, denoted as D or VGG-16, has 16 weight layers.
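To make the "16 weight layers" figure concrete, the following minimal sketch counts the layers of configuration D that carry trainable weights: 13 convolutional layers plus 3 fully connected layers. The list encoding and the names `VGG16_CONV_CONFIG`, `VGG16_FC_UNITS`, and `count_weight_layers` are illustrative assumptions, not code from the paper.

```python
# Illustrative encoding of configuration D (VGG-16): integers are the output
# channels of a 3x3 convolution, "M" marks a max-pooling layer (no weights).
VGG16_CONV_CONFIG = [
    64, 64, "M",
    128, 128, "M",
    256, 256, 256, "M",
    512, 512, 512, "M",
    512, 512, 512, "M",
]

# The three fully connected layers that follow the convolutional stack.
VGG16_FC_UNITS = [4096, 4096, 1000]


def count_weight_layers(conv_config, fc_units):
    """Count layers with trainable weights: convolutions plus dense layers."""
    conv_layers = sum(1 for item in conv_config if item != "M")
    return conv_layers + len(fc_units)


print(count_weight_layers(VGG16_CONV_CONFIG, VGG16_FC_UNITS))
```

Pooling layers are skipped because they have no parameters, which is why the network is called VGG-16 even though it contains more than 16 layers in total.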
The model was trained with Caffe (http://caffe.berkeleyvision.org/), a C++ deep learning framework, on the ImageNet ILSVRC-2012 (http://image-net.org/challenges/LSVRC/2012/) dataset, which includes images of 1,000 classes and is split into three sets: training (1.3 million images), validation (50,000 images), and testing (100,000 images). The network takes a fixed-size (224×224) input on 3 channels. The model achieves 7.5% top-5 error on ILSVRC-2012-val and 7.4% top-5 error on ILSVRC-2012-test.
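The top-5 error quoted above is the fraction of images whose true class is not among the five classes the model scores highest. The following is a minimal pure-Python sketch of that metric on toy data; the function name `top_k_error` and the sample scores are hypothetical and are not taken from the paper or from Caffe.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is not among the k highest scores.

    scores: list of per-class score lists, one list per sample.
    labels: list of integer class indices, one per sample.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    misses = 0
    for row, label in zip(scores, labels):
        # Indices of the k classes with the highest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Toy example with 6 classes and 2 samples (made-up scores).
toy_scores = [
    [0.10, 0.20, 0.30, 0.25, 0.10, 0.05],  # true class 0 is within the top 5
    [0.30, 0.20, 0.20, 0.15, 0.10, 0.05],  # true class 5 has the lowest score
]
toy_labels = [0, 5]
print(top_k_error(toy_scores, toy_labels, k=5))
```

With 1,000 classes, a 7.5% top-5 error means the correct class appears among the five best guesses for 92.5% of the validation images.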
According to the ImageNet site...