Exploring uncertainty sampling methods
Uncertainty sampling refers to querying the data points whose predictions the model is least certain about. These are the samples the model finds most ambiguous and cannot confidently label on its own. Having these high-uncertainty points labeled fills in the gaps where the model’s knowledge is lacking.
In uncertainty sampling, the active ML system queries the instances for which the current model’s predictions exhibit high uncertainty. The goal is to select data points that lie near the decision boundary between classes, because labeling these ambiguous examples strengthens the model in the areas where its knowledge is weakest.
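As a rough illustration of this querying step, the following sketch scores each unlabeled sample by least confidence (one minus the probability of the most likely class) and picks the highest-scoring samples to send for labeling. It is only a minimal example under stated assumptions: the function names `least_confidence_scores` and `select_queries`, the `n_queries` parameter, and the probability values in `pool_probs` are all hypothetical and chosen for illustration. In practice, the probabilities would come from the current model's predictions on the unlabeled pool.

```python
import numpy as np


def least_confidence_scores(probs):
    """Uncertainty score per sample: 1 minus the top class probability."""
    probs = np.asarray(probs, dtype=float)
    return 1.0 - probs.max(axis=1)


def select_queries(probs, n_queries=2):
    """Return indices of the n_queries most uncertain samples, most uncertain first."""
    scores = least_confidence_scores(probs)
    # Sort by descending score; a stable sort keeps ties in their original order.
    return np.argsort(-scores, kind="stable")[:n_queries]


# Hypothetical predicted class probabilities for five unlabeled samples
# in a binary classification task (each row sums to 1).
pool_probs = np.array([
    [0.98, 0.02],  # confident prediction
    [0.55, 0.45],  # ambiguous
    [0.10, 0.90],  # confident prediction
    [0.51, 0.49],  # most ambiguous, closest to a 50/50 split
    [0.80, 0.20],
])

query_idx = select_queries(pool_probs, n_queries=2)
print(query_idx)  # [3 1]
```

The two selected samples are the ones whose probabilities are closest to an even split, which matches the intuition that the most informative points sit near the decision boundary.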
Uncertainty sampling methods select data points close to the decision boundary because points near this boundary exhibit the highest prediction ambiguity. The decision boundary is the set of inputs at which the model’s predicted class switches from one class to another, so it is where the model is most uncertain about which class applies. Points...