A common approach to k-shot learning is to first train a large model on a related task for which a large dataset is available, and then fine-tune it on the k-shot task of interest. The knowledge extracted from the large dataset is thereby carried over in the model's parameters, which helps it learn new, related tasks from just a few examples. In 2003, Bakker and Heskes introduced a probabilistic model for k-shot learning in which all tasks share a common feature extractor, while each task has its own linear classifier with only a few task-specific parameters.
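To make the idea of a shared feature extractor with small per-task classifiers concrete, here is a minimal sketch in plain Python. It is not the probabilistic model of Bakker and Heskes: the heads are ordinary logistic regressions fitted by gradient descent, and the "pretrained" extractor is a hand-written tanh layer standing in for a network trained on a large dataset. All names (`SHARED_WEIGHTS`, `shared_features`, `fit_linear_head`, `predict`) and the toy data are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical stand-in for a pretrained network: a fixed tanh layer that
# maps a 2-D input to 3 features. In practice these weights would come from
# training on the large related dataset.
SHARED_WEIGHTS = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]


def shared_features(x):
    """Apply the frozen feature extractor shared by all tasks to one input."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in SHARED_WEIGHTS]


def fit_linear_head(inputs, labels, steps=200, lr=0.5):
    """Fit one task's logistic-regression head on top of the frozen features."""
    feats = [shared_features(x) for x in inputs]
    w, b = [0.0] * len(feats[0]), 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for f, y in zip(feats, labels):
            z = sum(wi * fi for wi, fi in zip(w, f)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted probability minus label
            grad_w = [g + err * fi for g, fi in zip(grad_w, f)]
            grad_b += err
        # Only the head's parameters are updated; SHARED_WEIGHTS stays fixed.
        w = [wi - lr * g / len(feats) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(feats)
    return w, b


def predict(x, head):
    """Classify one input with the shared features and a task-specific head."""
    w, b = head
    z = sum(wi * fi for wi, fi in zip(w, shared_features(x))) + b
    return 1 if z > 0 else 0


# Two toy tasks with k = 2 examples per class. Task A depends on the first
# input coordinate, task B on the second; both reuse the same extractor.
task_a_x = [(2.0, 0.5), (1.5, -0.5), (-2.0, -0.5), (-1.5, 0.5)]
task_b_x = [(0.5, 2.0), (-0.5, 1.5), (-0.5, -2.0), (0.5, -1.5)]
shot_labels = [1, 1, 0, 0]

head_a = fit_linear_head(task_a_x, shot_labels)
head_b = fit_linear_head(task_b_x, shot_labels)

print(predict((3.0, 0.0), head_a), predict((-3.0, 0.0), head_a))
print(predict((0.0, 3.0), head_b), predict((0.0, -3.0), head_b))
```

In this sketch each task contributes only four learned numbers (three weights and a bias), while everything else is reused across tasks. That split is what makes learning from a handful of examples plausible when the shared features are good.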
The probabilistic approach to k-shot learning discussed here is very similar to the one introduced by Bakker and Heskes. It addresses image classification by learning a probabilistic model from very little data. The idea is to use a powerful neural network that learns robust features from...