Transforming features in a streaming context
Scaling is one way of pre-processing data for machine learning, but many other statistical methods can be used for data preparation. This second part of the chapter takes a deep dive into principal component analysis (PCA), a widely used method for preparing data at the start of a machine learning workflow.
Introducing PCA
PCA is a machine learning method with several applications. When working with highly multivariate data, PCA can be used in an interpretative way, to make sense of and analyze datasets with many variables. This is the use of PCA in data analysis.
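To make the interpretative use more concrete, the following is a minimal batch sketch (not a streaming implementation) of PCA computed with NumPy's singular value decomposition. The dataset is synthetic and purely an assumption for illustration: three variables, of which the second is almost a multiple of the first. The variable names are hypothetical as well.

```python
import numpy as np

# Synthetic data (an assumption for illustration): 200 samples, 3 variables.
# x2 is nearly a multiple of x1, so the two carry almost the same information.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 2.0 * x1 + rng.normal(scale=0.1, size=200)
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])

# PCA via singular value decomposition of the mean-centered data.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Share of the total variance captured by each principal component,
# ordered from the largest to the smallest.
explained_variance_ratio = S**2 / np.sum(S**2)

# Each row of Vt is one principal direction, expressed as weights
# (loadings) on the original variables x1, x2, x3.
components = Vt

print("Explained variance ratio:", np.round(explained_variance_ratio, 3))
print("Loadings of first component:", np.round(components[0], 3))
```

Reading the output this way is what the interpretative use amounts to: a component with a large share of the variance and heavy loadings on both x1 and x2 suggests that those two variables move together, while a component with a negligible share indicates redundancy in the data.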
Another way to use PCA is to prepare data for machine learning. From a high-level point of view, PCA can be seen as an alternative to scaling that reduces the number of variables in your data, making it easier for the model to fit. This is the use of PCA that is most relevant for the current chapter, and this is how it will be...