Supervised learning may be the most common paradigm, and it is certainly the easiest to grasp. It applies when we want to teach neural networks a mapping between two modalities (for example, mapping images to their class labels or to their semantic masks). It requires access to a training dataset containing both the images and their ground truth labels (such as the class information per image or the semantic masks).
With such a dataset, the training procedure is straightforward:
- Give the images to the network and collect its results (that is, predicted labels).
- Evaluate the network's loss, that is, how wrong its predictions are compared to the ground truth labels.
- Adjust the network parameters accordingly to reduce this loss.
- Repeat until the network converges, that is, until it cannot improve further on this training data.
Therefore, this strategy deserves the adjective supervised: an entity (us) supervises the training of the network by providing it with the ground truth labels against which its predictions are evaluated at every step.
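The following is a minimal sketch of this loop in Python using PyTorch; the random "images", the small classifier, and the hyperparameters (learning rate, number of iterations) are illustrative assumptions, not a prescribed setup:

```python
import torch
import torch.nn as nn

# Toy supervised dataset (illustrative assumption): 64 random "images"
# flattened to 784 features, each paired with a ground-truth class label (0-9).
images = torch.randn(64, 784)
labels = torch.randint(0, 10, (64,))

# A small classifier mapping inputs to class scores.
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
loss_fn = nn.CrossEntropyLoss()          # measures how wrong the predictions are
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(100):                  # repeat until (approximate) convergence
    predictions = model(images)          # 1. give the images to the network
    loss = loss_fn(predictions, labels)  # 2. compare predictions to ground truth
    optimizer.zero_grad()
    loss.backward()                      # 3. compute gradients of the loss...
    optimizer.step()                     # ...and adjust parameters to reduce it
```

In practice, this loop would iterate over mini-batches drawn from the full training dataset rather than a single fixed tensor, but the four steps listed above remain the same.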