8.5 Gaussian process regression
Let’s assume we can model a variable Y as a function f of X plus some Gaussian noise:
$$Y \sim \mathcal{N}(\mu = f(X), \sigma = \epsilon)$$
If f is a linear function of X, then this assumption is essentially the same one we used in Chapter 4 when we discussed simple linear regression. In this chapter, instead, we are going to use a more general expression for f by setting a prior over it. In that way, we will be able to get functions more complex than linear ones. If we use a Gaussian process as this prior, then we can write:
$$f(X) \sim \mathcal{GP}(\mu_X, K(X, X'))$$
Here, $\mathcal{GP}$ represents a Gaussian process with mean function $\mu_X$ and covariance function $K(X, X')$. Even though in practice we always work with finite objects, we use the word function to indicate that, mathematically, the mean and covariance are infinite objects.
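To make the "finite objects" point concrete, here is a minimal NumPy sketch, not the implementation used later in the chapter. It evaluates a mean function and a covariance function on a finite grid of X values, which reduces the Gaussian process to an ordinary multivariate normal that we can sample from. The zero mean, the exponentiated quadratic kernel, the helper name exp_quad_kernel, and the values of length_scale and sigma are all assumptions made for illustration.

```python
import numpy as np


def exp_quad_kernel(x1, x2, length_scale=1.0):
    """Exponentiated quadratic covariance between two 1-D input arrays.

    Illustrative helper (an assumption, not from the text): returns a matrix
    with entries exp(-0.5 * (x1_i - x2_j)**2 / length_scale**2).
    """
    sq_dist = (x1[:, None] - x2[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / length_scale**2)


rng = np.random.default_rng(0)

# A finite set of inputs: the "infinite" mean and covariance functions are
# only ever evaluated at these points.
X = np.linspace(0, 10, 50)
mu = np.zeros_like(X)  # assumed zero mean function
K = exp_quad_kernel(X, X)
K += 1e-8 * np.eye(X.size)  # small jitter for numerical stability

# Each row is one function drawn from the prior, evaluated at X.
f_samples = rng.multivariate_normal(mu, K, size=3)

# Observations: one sampled function plus Gaussian noise with std sigma.
sigma = 0.1  # assumed noise level
Y = f_samples[0] + rng.normal(0, sigma, size=X.size)
```

Each row of f_samples is one plausible function under the prior. A smaller length_scale gives wigglier functions, and a larger one gives smoother functions.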
I mentioned before that working with Gaussians is nice. For instance, if the prior distribution is a GP and the likelihood is a Gaussian distribution, then the posterior is also a GP and we can...