Very deep convolutional networks for large-scale image recognition
In 2014, K. Simonyan and A. Zisserman presented a notable contribution to image recognition in the paper Very Deep Convolutional Networks for Large-Scale Image Recognition [4]. The paper showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. One model in the paper, denoted as D or VGG16, has 16 weight layers (13 convolutional and 3 fully connected). An implementation in Caffe (see http://caffe.berkeleyvision.org/), a deep learning framework written in C++, was used for training the model on the ImageNet ILSVRC-2012 (see http://image-net.org/challenges/LSVRC/2012/) dataset, which includes images of 1,000 classes, and is split into three sets: training (1.3M images), validation (50K images), and testing (100K images). The network takes a fixed-size 224 x 224 input on 3 (RGB) channels. The model achieves 7.5% top-5 error (the fraction of images whose correct label is not among the model's five highest-scoring predictions) on ILSVRC-2012-val and 7.4% top-5 error on ILSVRC-2012-test.
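To make the top-5 error metric concrete, here is a minimal sketch in plain Python. The function name top_k_error and the toy scores are assumptions made for illustration only; they are not taken from the paper or from any library, and the numbers are not ImageNet results.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is NOT among the k
    highest-scoring classes (hypothetical helper for illustration)."""
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Toy data: 3 samples, 6 classes (made-up scores, one row per sample).
toy_scores = [
    [0.10, 0.50, 0.20, 0.05, 0.12, 0.03],
    [0.30, 0.25, 0.20, 0.15, 0.08, 0.02],
    [0.05, 0.10, 0.15, 0.20, 0.22, 0.28],
]
toy_labels = [1, 5, 3]  # true class index for each sample

top5 = top_k_error(toy_scores, toy_labels, k=5)
top1 = top_k_error(toy_scores, toy_labels, k=1)
print(f"top-5 error: {top5:.3f}, top-1 error: {top1:.3f}")
```

In this toy example, only the second sample counts as a top-5 miss, because its true class has the lowest of the six scores. The top-1 error is higher, since it requires the true class to be the single best prediction.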
According to the ImageNet...