Introducing Gaussian processes
The Gaussian process (GP) can be thought of as an alternative Bayesian approach to regression problems. GPs are also referred to as infinite-dimensional Gaussian distributions. A GP defines a prior over functions that can be converted into a posterior once we have observed a few data points. Although it doesn’t seem possible to define a distribution over functions, it turns out that we only need to define a distribution over a function's values at the observed data points.
Formally, let's say that we observed a function, $f$, at $n$ values $x_1, \ldots, x_n$ as $f(x_1), \ldots, f(x_n)$. The function is a GP if all of the values, $f(x_1), \ldots, f(x_n)$, are jointly Gaussian, with a mean of $\mu(x)$ and a covariance of $\Sigma(x)$ given by $\Sigma_{ij} = k(x_i, x_j)$. Here, the kernel function $k$ defines how two variables are related to each other. We will discuss different kinds of kernels later in this section. The joint Gaussian distribution of many Gaussian variables is also known as a multivariate Gaussian distribution.
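The definition above can be sketched in a few lines of Python: a kernel fills in the covariance matrix entry by entry, and the function values at the chosen inputs are then drawn from the resulting multivariate Gaussian. This is only a minimal illustration using the standard library. The squared-exponential kernel, the zero mean function, the input points, and all function names (`rbf_kernel`, `covariance_matrix`, `cholesky`, `sample_gp_prior`) are assumptions made for the example and are not taken from the text.

```python
import math
import random


def rbf_kernel(x1, x2, length_scale=1.0):
    """Squared-exponential kernel (an assumed choice): nearby inputs get covariance close to 1."""
    return math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def covariance_matrix(xs, kernel):
    """Build the n x n matrix whose (i, j) entry is kernel(xs[i], xs[j])."""
    return [[kernel(xi, xj) for xj in xs] for xi in xs]


def cholesky(a, jitter=1e-9):
    """Return lower-triangular L with L L^T = a + jitter * I.

    The small jitter on the diagonal is added for numerical stability.
    """
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] + jitter - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def sample_gp_prior(xs, mean_fn, kernel, rng):
    """Draw one vector of function values at xs from N(mean, covariance)."""
    L = cholesky(covariance_matrix(xs, kernel))
    z = [rng.gauss(0.0, 1.0) for _ in xs]  # independent standard normals
    # mean + L z has the required mean and covariance
    return [
        mean_fn(x) + sum(L[i][k] * z[k] for k in range(i + 1))
        for i, x in enumerate(xs)
    ]


# Hypothetical inputs: five points at which the function is "observed"
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
sigma = covariance_matrix(xs, rbf_kernel)
sample = sample_gp_prior(xs, lambda x: 0.0, rbf_kernel, random.Random(0))
```

Each call to `sample_gp_prior` with a different random state yields a different plausible set of function values at the same inputs, which is what is meant by a distribution over functions restricted to a finite set of points.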
From the previous temperature example, we can imagine that various functions...