Introduction
This chapter presents unsupervised learning techniques that perform dimension reduction. We first discuss what a dimension is, why we want to avoid having too many of them, and the basic idea behind dimension reduction. The chapter then covers two techniques in detail: market basket analysis and Principal Component Analysis (PCA). Market basket analysis is a technique for generating association rules from datasets, and the chapter walks through detailed R code that accomplishes this. PCA, a very common dimension reduction technique, comes from linear algebra; the chapter also gives a detailed walk-through of how to perform PCA in R.
The Idea of Dimension Reduction
The dimensions of a dataset are simply the distinct numbers required to describe each observation in it. For example, consider the position of Pac-Man in the game named after him. Pac-Man is a game that was...