Revisiting bias, variance, and memorization
Ensemble methods can improve the results of regression or classification tasks by combining a group of classifiers or regressors into a single, stronger final model.
Since we are talking about improving performance, we need a way to characterize it. Ensemble methods are designed to reduce either the variance or the bias of a model. Sometimes, we want to reduce both, settling on a balanced point somewhere along the bias-variance trade-off curve.
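As a minimal illustration of the variance-reduction idea, the following sketch simulates many unbiased but noisy "models" (here just noisy estimates of a fixed target value, an assumption made for simplicity) and shows that averaging 25 of them yields a far lower variance than any single one:

```python
import numpy as np

rng = np.random.default_rng(0)
true_value = 3.0  # hypothetical quantity each model tries to predict

def noisy_estimates(n_models):
    # Each "model" is unbiased (error centered at 0) but has high variance.
    return true_value + rng.normal(0.0, 1.0, size=n_models)

# Repeat the experiment many times to measure variance across "fits".
singles = np.array([noisy_estimates(1)[0] for _ in range(2000)])
ensembles = np.array([noisy_estimates(25).mean() for _ in range(2000)])

print(f"single-model variance:   {singles.var():.3f}")
print(f"25-model ensemble var.:  {ensembles.var():.3f}")
```

For independent, unbiased estimators, averaging m of them divides the variance by roughly m while leaving the bias unchanged, which is exactly the mechanism bagging-style ensembles exploit.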
We mentioned the concepts of bias and variance several times in earlier chapters. To help you understand how the idea of ensemble learning originated, I will revisit these concepts from the perspective of data memorization.
Let's say the following schematic visualization represents the relationship between the training dataset and the real-world total dataset. The solid line in the diagram separates the seen world from the unseen part: