Summary
We began this chapter by learning about non-parametric statistics in a Bayesian setting and how we can represent statistical problems through the use of kernel functions. As an example, we used a kernelized version of linear regression to model non-linear responses. Then we moved on to an alternative way of building and conceptualizing kernel methods using Gaussian processes.
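As a reminder of the idea behind kernelized linear regression, here is a minimal sketch. It is not the Bayesian model from the chapter: to stay short, it fits the coefficients with ordinary least squares instead of sampling a posterior. The function name `gaussian_kernel`, the synthetic sine data, the number of knots, and the bandwidth `w` are all assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kernel(x, knots, w):
    """Gaussian kernel evaluated between each point in x and each knot."""
    return np.exp(-((x[:, None] - knots[None, :]) ** 2) / w ** 2)

# Synthetic non-linear data (assumed for illustration only).
rng = np.random.default_rng(42)
x = np.linspace(0, 10, 100)
y = np.sin(x) + rng.normal(0, 0.1, size=x.size)

# Map the inputs to kernel features centered at a handful of knots.
knots = np.linspace(0, 10, 10)
features = gaussian_kernel(x, knots, w=1.5)  # shape (100, 10)

# The model is still linear in the coefficients gamma, so a linear
# least-squares fit (a stand-in for the Bayesian fit) is enough.
gamma, *_ = np.linalg.lstsq(features, y, rcond=None)
y_hat = features @ gamma

mse = np.mean((y - y_hat) ** 2)
```

The point of the sketch is that the response is non-linear in `x` but linear in the kernel features, which is why a linear model can capture it.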
A Gaussian process is a generalization of the multivariate Gaussian distribution to infinitely many dimensions and is fully specified by a mean function and a covariance function. Since we can conceptually think of functions as infinitely long vectors, we can use Gaussian processes as priors over functions. In practice, we work with multivariate Gaussian distributions with as many dimensions as data points. To define the corresponding covariance function, we used properly parameterized kernels, and by learning those hyper-parameters, we ended up learning about arbitrarily complex and unknown functions...
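The statement that, in practice, a Gaussian process reduces to a multivariate Gaussian with one dimension per data point can be sketched as follows. This is a minimal illustration under stated assumptions, not a full Gaussian process regression: the kernel form, the hyper-parameter names `eta` and `rho`, their values, and the input grid are hypothetical choices.

```python
import numpy as np

def exp_quad_kernel(x1, x2, eta=1.0, rho=1.0):
    """Exponentiated quadratic kernel: eta * exp(-rho * (x1 - x2)**2)."""
    sq_dist = (x1[:, None] - x2[None, :]) ** 2
    return eta * np.exp(-rho * sq_dist)

rng = np.random.default_rng(0)

# A finite grid of inputs: the "infinitely long vector" is only ever
# evaluated at these 50 points.
x = np.linspace(0, 10, 50)

# Mean function (zero here) and covariance function evaluated on the grid.
mean = np.zeros_like(x)
cov = exp_quad_kernel(x, x, eta=1.0, rho=1.0)
cov += 1e-8 * np.eye(len(x))  # small jitter for numerical stability

# Each draw is one function sampled from the prior, seen at the grid points.
prior_draws = rng.multivariate_normal(mean, cov, size=3)
```

Changing `eta` and `rho` changes the amplitude and the wiggliness of the sampled functions, which is why learning those hyper-parameters amounts to learning about the unknown function.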