So far, we have focused our attention exclusively on supervised learning problems, where every data point in the dataset had a known label or target value. However, what do we do when there is no known output or no teacher to supervise the learning algorithm?
This is what unsupervised learning is all about. In unsupervised learning, the learning algorithm is shown only the input data and is asked to extract knowledge from this data without further instruction. We have already talked about one of the many forms that unsupervised learning comes in: dimensionality reduction. Another popular domain is cluster analysis, which aims to partition data into distinct groups of similar items.
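To make the idea of partitioning unlabeled data more concrete, here is a minimal sketch in plain Python. It uses k-means, one common clustering algorithm, chosen here only for illustration. The function name `kmeans`, its parameters, and the toy points are all assumptions made for this example. The initial centroids are passed in by hand to keep the run deterministic; in practice they are usually chosen at random.

```python
def kmeans(points, centroids, n_iter=10):
    """Partition 2-D points into len(centroids) groups (hypothetical helper)."""
    centroids = list(centroids)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        for i, (x, y) in enumerate(points):
            dists = [(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centroids]
            labels[i] = dists.index(min(dists))
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return labels, centroids


# Toy data (made up for illustration): two loose groups, no labels given.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]

# Assumed initial guesses: one point from each visible group.
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Note that the algorithm never sees a target value: the group indices it returns are arbitrary names for the groups it found, not labels with a predefined meaning.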
Some of the problems where clustering techniques can be useful are document analysis, image retrieval, finding spam emails, identifying...