Classifying images of handwritten digits (that is, recognizing whether an image contains a 0 or a 1 and so on) is a historical problem in computer vision. The Modified National Institute of Standards and Technology (MNIST) dataset (http://yann.lecun.com/exdb/mnist/), which contains 70,000 grayscale images (28 × 28 pixels) of such digits, has been used as a reference over the years so that people can test their methods for this recognition task (Yann LeCun and Corinna Cortes hold all copyrights for this dataset, which is shown in the following diagram):
For digit classification, what we want is a network that takes one of these images as input and returns an output vector expressing how strongly the network believes the image corresponds to each class. The input vector has 28 × 28 = 784 values, while the output has 10 values (for the 10 different digits, from 0 to 9). In-between...