Contrasting variance and bias
Imagine that you have the data points displayed in the following graph. Your task is to fit a line or curve that will allow you to make predictions for new points.
Here is a graph of random points:

Figure 2.3 – Graph of random points
One idea is to use Linear Regression, which finds the line that minimizes the sum of the squared vertical distances between the points and the line, as shown in the following graph:

Figure 2.4 – Minimizing distance using Linear Regression
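To make the idea of minimizing squared distances concrete, here is a minimal sketch in plain Python. The five sample points are hypothetical (they are not the data from the figure), and the helper names fit_line and sum_squared_errors are made up for this illustration. The sketch uses the standard closed-form solution for a line with a single input variable:

```python
# Hypothetical sample points (not the data from the figure).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution for linear regression with one input variable.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_errors(xs, ys, slope, intercept):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


slope, intercept = fit_line(xs, ys)
best_sse = sum_squared_errors(xs, ys, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, SSE={best_sse:.2f}")

# Any other line, such as one with a slightly steeper slope, has a larger error.
nudged_sse = sum_squared_errors(xs, ys, slope + 0.1, intercept)
print(f"SSE after nudging the slope: {nudged_sse:.2f}")
```

In practice, a library such as scikit-learn would handle the fitting, but the quantity being minimized is the same.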
A straight line generally has high bias. In machine learning, bias is a mathematical term for the systematic error that a model makes when it is applied to a real-life problem. The bias of the straight line is high because its predictions are restricted to the line and fail to account for changes in the data.
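One way to see this restriction is to fit a straight line to data that clearly follows a curve. The following sketch assumes hypothetical, noise-free points on the curve y = x^2 (chosen only for illustration) and prints the residual, the actual value minus the line's prediction, at each point:

```python
# Hypothetical noise-free data that follows a curve: y = x^2 for x = 0..10.
xs = [float(x) for x in range(11)]
ys = [x ** 2 for x in xs]

# Least-squares straight line (closed form for one input variable).
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x

# Residual = actual value minus the line's prediction.
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
for x, r in zip(xs, residuals):
    print(f"x={x:4.1f}  residual={r:6.1f}")
```

The residuals are positive at both ends of the range and negative in the middle. Because the data contains no noise, this pattern comes entirely from the line's inability to bend, which is the kind of error that bias describes.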
In many cases, a straight line is not complex enough to make accurate predictions. When this happens, we say that the machine learning model...