Let's consider a small dataset built by adding some uniform noise to the points belonging to a segment bounded between -6 and 6. The original equation is: y = x + 2 + n, where n is a noise term.
In the following figure, there's a plot with a candidate regression function:
As we're working on a plane, the regressor we're looking for is a function of only two parameters:
In order to fit our model, we must find the best parameters and to do that we choose an ordinary least squares approach. The loss function to minimize is:
With an analytic approach, in order to find the global minimum, we must impose:
So (for simplicity, it accepts a vector containing both variables):
import numpy as np
def loss(v):
e = 0.0
for i in range(nb_samples):
e += np.square(v[0] + v[1]*X[i] - Y[i])
return 0.5 * e
And the gradient can be defined...