Least Squares
Least squares, or least-squares regression, is probably a term you’ve heard before. That is because it is an extremely versatile yet simple technique. Both characteristics stem from the properties of the squared-loss function, so we’ll start by looking at the squared-loss function in a bit more detail.
The squared-loss function
The squared-loss function in Eq. 5 is a function of the difference between the observed value and the model prediction, $y - \hat{y}$, and so we can write the squared loss in a slightly simpler form:
$$L(y - \hat{y}) = (y - \hat{y})^{2} \qquad \text{(Eq. 8)}$$
The form of the function is shown in Figure 4.1:
Figure 4.1: The shape of the squared-loss function
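As a minimal sketch, the squared loss can be evaluated directly as a function of the difference between the observed value and the prediction. The function name `squared_loss` and the sample differences below are illustrative assumptions, not taken from the text.

```python
def squared_loss(difference):
    """Squared loss for a single difference (observed value minus prediction)."""
    return difference ** 2


# Evaluate the loss at a few illustrative differences on either side of zero.
# Equal-sized positive and negative differences are penalised equally, and
# doubling the size of the difference quadruples the loss.
for difference in [-2.0, -1.0, 0.0, 1.0, 2.0]:
    print(difference, squared_loss(difference))
```

The loss is zero only when the prediction matches the observed value exactly, and it is symmetric about that point, which gives the bowl-like shape of the curve.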
For the squared loss, the empirical risk function can be written as follows:
$$R_{\text{emp}} = \frac{1}{N}\sum_{i=1}^{N}\left(y_{i} - \hat{y}_{i}\right)^{2} \qquad \text{(Eq. 9)}$$
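To make the empirical risk concrete, here is a small sketch that assumes the risk is the average of the squared losses over the data points. The function name `empirical_risk` and the toy observations and predictions are hypothetical, chosen only for illustration.

```python
def empirical_risk(y_true, y_pred):
    """Average squared loss over paired observed values and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one data point is required")
    # Sum the squared differences, then divide by the number of data points.
    total = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    return total / len(y_true)


# Toy data (assumed for illustration): three observations and three predictions.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.5, 2.0, 2.0]
print(empirical_risk(y_true, y_pred))
```

Because each difference is squared before averaging, the single prediction that is off by 1.0 contributes four times as much to the risk as the one that is off by 0.5.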
The model prediction, $\hat{y}$, depends upon the model parameters, which we’ll denote by the vector $\boldsymbol{\beta}$, and the vector of feature values, $\mathbf{x}$, for which we are making the prediction. So, we denote our model as $\hat{y}(\mathbf{x} \mid \boldsymbol{\beta})$. The vertical...