Unsupervised learning
Unsupervised learning is a way of making hidden patterns in data visible:
- Clustering finds groups or hierarchies of similar objects
- Unsupervised anomaly detection finds outliers (weird samples)
- Dimensionality reduction finds which details of the data are the most important
- Factor analysis reveals the latent variables that influence the behavior of the observed variables
- Rule mining finds associations between different entities in the data
As usual, these tasks often overlap, and many practical problems occupy the neutral territory between supervised and unsupervised learning.
We will focus on clustering in this chapter and on rule mining in the next. The others will remain mostly beyond the scope of this book, but in Chapter 10, Natural Language Processing, we will nevertheless briefly discuss autoencoders, which can be used for both dimensionality reduction and anomaly detection.
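To give a first taste of what "finding groups of similar objects" means in practice, here is a minimal sketch of k-means, one common clustering algorithm, written in plain Python. It is an illustration only and is not taken from this book's code: the function name `kmeans`, the toy points, and the hand-picked starting centroids are all assumptions made for this example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_centroids, iterations=10):
    """Naive k-means sketch: returns (centroids, labels).

    Assumption: the caller supplies the starting centroids, which keeps
    this toy example deterministic. Real implementations usually pick
    them randomly or with a smarter seeding scheme.
    """
    centroids = [tuple(c) for c in initial_centroids]
    k = len(centroids)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # leave the centroid in place if its cluster is empty
                centroids[c] = tuple(
                    sum(coords) / len(members) for coords in zip(*members)
                )
    return centroids, labels


# Toy data (made up for illustration): two visibly separate blobs, no labels.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]

centroids, labels = kmeans(points, initial_centroids=[points[0], points[-1]])
print(labels)  # the first three points share one label, the last three the other
```

Note that the algorithm is never told which point belongs to which group; the grouping emerges from the distances alone, which is what makes the task unsupervised.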
Here are some examples of real-world tasks where clustering would be your tool of choice...