The bias-variance trade-off
The generalization error curve in Figure 8.7 has a minimum. In the preceding section, we explained qualitatively why we expect the generalization error to first decrease and then increase as model complexity grows, and why this produces a minimum in the curve. To understand quantitatively why the minimum occurs and what controls its position, we need to dig into the math behind the curve.
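Before turning to the math, it can help to reproduce the qualitative shape of such a curve numerically. The following is a minimal sketch, not the code behind Figure 8.7: it assumes a toy data-generating function (a sine wave with Gaussian noise), uses polynomial degree as the measure of model complexity, and averages the squared error at holdout points over many simulated training sets. The function names, sample sizes, and noise level are all illustrative assumptions.

```python
import numpy as np
from numpy.polynomial import Polynomial


def true_function(x):
    # Assumed ground truth for this toy example only.
    return np.sin(2.0 * np.pi * x)


def mean_holdout_error(degree, rng, n_train=30, n_holdout=200,
                       noise_sd=0.3, n_trials=200):
    """Average holdout mean squared error of a polynomial of the given
    degree, taken over many independently drawn training sets."""
    errors = []
    for _ in range(n_trials):
        # Draw a fresh noisy training set.
        x_train = rng.uniform(0.0, 1.0, n_train)
        y_train = true_function(x_train) + rng.normal(0.0, noise_sd, n_train)

        # Draw fresh noisy holdout points the model has not seen.
        x_hold = rng.uniform(0.0, 1.0, n_holdout)
        y_hold = true_function(x_hold) + rng.normal(0.0, noise_sd, n_holdout)

        # Least-squares polynomial fit; Polynomial.fit rescales x
        # internally, which keeps higher degrees numerically stable.
        model = Polynomial.fit(x_train, y_train, degree)
        errors.append(np.mean((model(x_hold) - y_hold) ** 2))
    return float(np.mean(errors))


rng = np.random.default_rng(0)
degrees = list(range(1, 13))
holdout_errors = [mean_holdout_error(d, rng) for d in degrees]
best_degree = degrees[int(np.argmin(holdout_errors))]

for d, err in zip(degrees, holdout_errors):
    print(f"degree {d:2d}: mean holdout error {err:.4f}")
print("degree with the lowest holdout error:", best_degree)
```

In this toy setting the averaged holdout error should be high for the simplest polynomials, fall as the degree increases, and rise again for the most flexible ones, so the lowest error occurs at an intermediate degree.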
The generalization error curve is made up of two competing contributions, one increasing with model complexity and the other decreasing. It is the competition between these two contributions that leads to the minimum. Those two contributions are, first, the bias in a model’s prediction at a holdout point and, second, the variance in the model’s prediction at the holdout point, with the variance arising from the sensitivity of the model’s prediction to the...