Understanding classifiers
Deep learning can be used for many different tasks. With images and CNNs, a very common task is classification: given an image, the neural network needs to assign it one of the labels provided during training. Not surprisingly, a network of this type is called a classifier.
To do so, the neural network has one output for each label (for example, the MNIST dataset of handwritten digits has 10 labels, so the network has 10 outputs). Ideally, only the output corresponding to the correct label should be 1, while all the other outputs should be 0.
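This target format is commonly called one-hot encoding. As a minimal sketch in plain Python, the helper below builds such a target for a given label; the function name `one_hot` and the example label 3 are chosen here for illustration and are not taken from any particular library.

```python
def one_hot(label, num_classes):
    """Return a list with 1.0 at position `label` and 0.0 everywhere else."""
    if not 0 <= label < num_classes:
        raise ValueError("label must be in the range [0, num_classes)")
    target = [0.0] * num_classes
    target[label] = 1.0
    return target


# Ideal output for an image of the digit 3 with 10 possible labels (as in MNIST).
target = one_hot(3, 10)
print(target)
```

Deep learning frameworks usually provide an equivalent utility, so in practice you would rarely write this by hand.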
How will a neural network achieve this state? Well, it doesn't. The neural network produces floating-point outputs as a result of its internal multiplications and sums, and it very seldom produces such a clean output. However, we can consider the highest value as the hot one (1), and all the others can be considered cold (0).
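Picking the highest value can be sketched in a few lines of plain Python. The function name `to_hot_cold` and the raw output values are made up for illustration; they do not come from a real network.

```python
def to_hot_cold(outputs):
    """Return the index of the highest output and the matching hot/cold list."""
    # max() returns the first index in case of a tie.
    hot_index = max(range(len(outputs)), key=lambda i: outputs[i])
    hot_cold = [1 if i == hot_index else 0 for i in range(len(outputs))]
    return hot_index, hot_cold


# Hypothetical raw floating-point outputs of a 4-label classifier.
raw_outputs = [0.12, -1.3, 2.7, 0.4]
predicted_label, hot_cold = to_hot_cold(raw_outputs)
print(predicted_label)  # 2
print(hot_cold)         # [0, 0, 1, 0]
```

The index of the hot output is the predicted label, which is why this step is often referred to as taking the argmax of the outputs.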
We usually apply a softmax layer at the end of the neural network, which converts the outputs into probabilities, meaning...