Connecting regression and estimators
Recall that in Chapter 6, Parametric Estimation, we studied an example where we used Maximum Likelihood Estimation (MLE) to estimate the slope in a 0-intercept linear equation, which was formulated in the following form:

$$y_i = kx_i + \epsilon_i$$
Note on MLE
Please review the MLE-for-modeling-noise example in Chapter 6, Parametric Estimation; it is a rich example worth revisiting.
In the example, we assumed the noise $\epsilon$ followed the normal distribution $N(0, 1)$ and showed that the log-likelihood takes the following form:

$$\ln L(k) = -\frac{n}{2}\ln(2\pi) - \frac{1}{2}\sum_{i=1}^{n}\left(y_i - kx_i\right)^2$$
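To make the formula concrete, here is a minimal Python sketch (our own illustration, not from the book; the synthetic data and variable names are assumptions) that evaluates this log-likelihood on simulated data and checks it against the standard closed-form slope estimate $\hat{k} = \sum x_i y_i / \sum x_i^2$ for the 0-intercept model:

```python
import numpy as np

# Synthetic data for the 0-intercept model y = k*x + eps, eps ~ N(0, 1),
# with a true slope of 2.0 (values chosen for illustration only).
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, size=100)
y = 2.0 * x + rng.normal(0, 1, size=100)

def log_likelihood(k, x, y):
    """Log-likelihood of the 0-intercept model under N(0, 1) noise."""
    n = len(x)
    return -n / 2 * np.log(2 * np.pi) - 0.5 * np.sum((y - k * x) ** 2)

# Standard closed-form MLE for the slope: k_hat = sum(x*y) / sum(x^2).
k_hat = np.sum(x * y) / np.sum(x ** 2)
print(k_hat)  # close to the true slope 2.0
# k_hat maximizes the log-likelihood, so any other slope scores lower:
print(log_likelihood(k_hat, x, y) >= log_likelihood(2.1, x, y))  # True
```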
Note that if we add another parameter, $b$, the derivation is still legitimate:

$$\ln L(k, b) = -\frac{n}{2}\ln(2\pi) - \frac{1}{2}\sum_{i=1}^{n}\left(y_i - kx_i - b\right)^2$$
Taking the derivative with respect to $k$ and $b$, respectively, we obtain the following for $k$:

$$\frac{\partial \ln L}{\partial k} = \sum_{i=1}^{n} x_i\left(y_i - kx_i - b\right) = 0$$
We also obtain the following equation for $b$:

$$\frac{\partial \ln L}{\partial b} = \sum_{i=1}^{n}\left(y_i - kx_i - b\right) = 0$$
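Because both conditions are linear in $k$ and $b$, they form a 2×2 linear system that can be solved directly. The following Python sketch (our own illustration; the data is synthetic and the variable names are assumptions) solves that system and cross-checks the result against NumPy's least-squares fit:

```python
import numpy as np

# The two first-order conditions rearrange into a 2x2 linear system:
#   [sum(x^2)  sum(x)] [k]   [sum(x*y)]
#   [sum(x)    n     ] [b] = [sum(y)  ]
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 2.0 * x + 1.0 + rng.normal(0, 1, size=100)  # true k = 2, b = 1

A = np.array([[np.sum(x ** 2), np.sum(x)],
              [np.sum(x),      len(x)   ]])
rhs = np.array([np.sum(x * y), np.sum(y)])
k_hat, b_hat = np.linalg.solve(A, rhs)

# Sanity check against NumPy's least-squares polynomial fit.
print(k_hat, b_hat)
print(np.polyfit(x, y, deg=1))  # should agree: [slope, intercept]
```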
Note
The log-likelihood function only depends on each data point through the term $(y_i - kx_i - b)^2$, whose sum is exactly the SSE (sum of squared errors). Maximizing the log-likelihood is therefore equivalent to minimizing the squared error.
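We can also verify this equivalence numerically. The short sketch below (our own check, not from the book) minimizes the SSE and the negative log-likelihood separately and confirms both optimizers land on the same $(k, b)$, since the two objectives differ only by a constant and a sign:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(7)
x = rng.uniform(0, 10, size=50)
y = 2.0 * x + 1.0 + rng.normal(0, 1, size=50)

def sse(params):
    k, b = params
    return np.sum((y - k * x - b) ** 2)

def neg_log_likelihood(params):
    # Negative log-likelihood = constant + SSE / 2 under N(0, 1) noise.
    n = len(x)
    return n / 2 * np.log(2 * np.pi) + 0.5 * sse(params)

fit_sse = minimize(sse, x0=[0.0, 0.0])
fit_mle = minimize(neg_log_likelihood, x0=[0.0, 0.0])
print(fit_sse.x)  # (k, b) minimizing the SSE
print(fit_mle.x)  # same (k, b), up to numerical tolerance
```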
Now, we have two unknowns, $k$ and $b$, and exactly two equations to determine them.