Likelihood
To understand the probability distribution that the data follows, we’ll look at an explicit example of how a random component is incorporated into data.
A simple probabilistic model
We’ll start with the simplest way in which we can introduce a random component into our observations of the response (target) variable $y$, namely by adding noise to a deterministic quantity. In fact, we’ll just consider the observations $y_i$ in our dataset to be noise-corrupted versions of a model output $\hat{y}_i$. So, we have this relationship:

$$y_i = \hat{y}_i + \varepsilon_i$$
Eq. 1

Here, $\varepsilon_i$ is the noise value that has been added to the model output $\hat{y}_i$ to get the observation $y_i$ for the $i^{\text{th}}$ datapoint. The value $\varepsilon_i$ is a random variable. Without loss of generality, we can assume its expectation value is zero, so we have $\mathbb{E}(\varepsilon_i) = 0$. We can make this assumption because if the expectation of $\varepsilon_i$ was non-zero, it would mean we have a non-zero deterministic average contribution from $\varepsilon_i$ that we could just absorb into the definition of $\hat{y}_i$
...
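As a rough numerical illustration of the zero-mean argument, the following minimal sketch simulates observations as a deterministic model output plus noise, then shows that a non-zero noise mean can be moved into the model output without changing the observations. Everything specific here is an assumption made for illustration only: the linear form of the model output (`y_hat`), the Gaussian noise, and the values of `mu` and `sigma` do not come from the text.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is reproducible

n = 100_000
x = rng.uniform(0.0, 1.0, size=n)  # hypothetical feature values

# Hypothetical deterministic model output; any deterministic function would do.
y_hat = 2.0 * x + 1.0

# Noise with a deliberately non-zero mean (Gaussian noise is an assumption).
mu, sigma = 0.5, 0.3
eps = rng.normal(loc=mu, scale=sigma, size=n)

# Observations are the model output plus the noise.
y = y_hat + eps

# Absorb the average noise contribution into the model output,
# which leaves behind a noise term whose mean is zero.
y_hat_shifted = y_hat + mu
eps_centered = eps - mu

# The observations are unchanged by this re-labelling.
same_observations = np.allclose(y, y_hat_shifted + eps_centered)

print("sample mean of original noise:", eps.mean())
print("sample mean of centred noise: ", eps_centered.mean())
print("observations unchanged:       ", same_observations)
```

The sample mean of the original noise should come out close to `mu`, while the centred noise should average close to zero. The two decompositions describe exactly the same data, which is why assuming a zero-mean noise term costs nothing.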