Dimensionality reduction
In this recipe, you will learn about dimensionality reduction: a set of algorithms that statisticians and data scientists use when data has a large number of dimensions. Reducing the number of dimensions makes computation cheaper and models easier to design. We will use the Principal Component Analysis (PCA) algorithm in this recipe.
Getting ready
To get started with this recipe, you need the MultivariateStats Julia package installed. You can install it by entering Pkg.add("MultivariateStats") in the Julia REPL (on Julia 1.0 and later, run using Pkg first). When you use the package for the first time, it might show a long list of warnings; you can safely ignore them for now. They do not affect the algorithms and techniques used in this chapter.
How to do it...
First, let's simulate about a hundred random observations as a training set for the PCA algorithm. This can be done using the randn() function:
X = randn(100,3) * [0.8 0.7; 0.9 0.5; 0.2 0.6]
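To make the shape of the simulated data concrete, here is a minimal sketch that uses only Julia's standard library. The fixed seed and the names W and Xt are assumptions added for illustration; they are not part of the recipe.

```julia
using Random

Random.seed!(42)                   # assumption: fixed seed so repeated runs give the same numbers

W = [0.8 0.7; 0.9 0.5; 0.2 0.6]    # the 3x2 mixing matrix from the recipe
X = randn(100, 3) * W              # (100x3) * (3x2) gives a 100x2 matrix

println(size(X))                   # (100, 2)

# Hypothetical extra step: a transposed view with one observation per column.
Xt = X'
println(size(Xt))                  # (2, 100)
```

Multiplying by W mixes three independent standard normal columns into two correlated ones, which gives PCA some structure to find. The MultivariateStats documentation describes a convention in which each column of the input is one observation, so a transposed view such as Xt may be what the fitting step expects; check this against the layout you intend before fitting.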
Now, to fit...