In this chapter, we took a second stab at unsupervised learning by exploring PCA: what it is and how to apply it in practice. We saw how it can reduce dimensionality and make a dataset easier to understand when we are confronted with numerous highly correlated variables. We then applied it to real anthropometric measurements of US Army soldiers, using the resulting principal components as inputs to a multivariate adaptive regression splines (MARS) model to predict a soldier's weight. Along the way, we explored ways to visualize the data and the model results.
Because PCA is an unsupervised learning technique, it requires some judgment, along with trial and error, to arrive at a solution that is both sound and acceptable to business partners. Nevertheless, it is a powerful tool for extracting latent insights and for supporting supervised learning.
In the next chapter, we will...