Why does deep learning produce miscalibrated predictions?
The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is an annual competition in which research teams evaluate their algorithms on a shared dataset, aiming to push the boundaries of computer vision. The 2012 edition was a watershed moment for the field, marking a decisive shift towards deep learning as the dominant approach in computer vision (https://www.image-net.org/challenges/LSVRC/2012/).
Before the advent of deep learning, computer vision relied primarily on hand-engineered features and traditional machine learning techniques. Algorithms such as the Scale-Invariant Feature Transform (SIFT), Histogram of Oriented Gradients (HOG), and Speeded-Up Robust Features (SURF) were commonly used to extract features from images. These features were then fed into classifiers such as Support Vector Machines (SVMs) to make predictions. While these methods had their successes, they had significant limitations regarding scalability...