So far, we have focused our attention exclusively on supervised learning problems, where every data point in the dataset had a known label or target value. However, what do we do when there is no known output, or no teacher to supervise the learning algorithm?
This is what unsupervised learning is all about. In unsupervised learning, the learner is shown only the input data and is asked to extract knowledge from this data without further instruction. We have already talked about one of the many forms that unsupervised learning comes in--dimensionality reduction. Another popular domain is cluster analysis, which aims to partition data into distinct groups of similar items.
In this chapter, we want to understand how different clustering algorithms can be used to extract hidden structures in simple, unlabeled datasets....