Chapter 6. Data Analysis – Clustering
Clustering, also called cluster analysis, is the process of grouping objects so that the objects within a group are more similar to each other than to objects in other groups.
R provides several tools for clustering data, which we will investigate in this chapter:
- K-means, including optimal number of clusters
- Partitioning Around Medoids (PAM)
- Bayesian hierarchical clustering
- Affinity propagation clustering
- Computing a gap statistic to estimate the number of clusters
- Hierarchical clustering
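To make the idea of grouping similar objects concrete, here is a minimal sketch using only base R. The built-in iris data set and the choice of three clusters are assumptions made for illustration; they are not prescribed by the text.

```r
# Minimal sketch: group the iris flowers by their four measurements.
# Assumptions for illustration: the built-in iris data and k = 3.
set.seed(42)                          # k-means starts from random centers

measurements <- iris[, 1:4]           # keep the numeric columns, drop Species

# nstart = 25 runs the algorithm from 25 random starts and keeps the best fit
fit <- kmeans(measurements, centers = 3, nstart = 25)

# Compare the discovered clusters with the known species labels
print(table(cluster = fit$cluster, species = iris$Species))
```

Each row of the printed table is a cluster and each column a species, so you can see how closely the groups found from the measurements alone line up with the labels.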