An alternative to k-means is hierarchical clustering. One advantage of hierarchical clustering is that it allows us to organize the clusters as a hierarchy, which can be visualized as a dendrogram and can make it easier to interpret the results. Another advantage is that we do not need to specify the number of clusters upfront.
Organizing clusters as a hierarchical tree
Understanding hierarchical clustering
There are two approaches to hierarchical clustering:
- In agglomerative hierarchical clustering, we start with each data point as its own cluster, and we iteratively merge the closest pair of clusters until only one cluster remains.
- In divisive hierarchical clustering, we proceed the other way around: we start with one cluster that contains the entire dataset, and we iteratively split it into smaller clusters until each cluster contains only a single data point.
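As a minimal sketch of the agglomerative approach, the following example (using SciPy's `linkage` function and a small made-up dataset) merges the closest clusters step by step and then cuts the resulting hierarchy into two flat clusters:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy data: two well-separated groups of 2D points (hypothetical example).
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]])

# Agglomerative clustering with complete linkage: at each step, merge the
# pair of clusters whose most distant members are closest to each other.
Z = linkage(X, method='complete')

# Cut the hierarchy into (at most) two flat clusters.
labels = fcluster(Z, t=2, criterion='maxclust')
print(labels)
```

The linkage matrix `Z` records the full merge history, so the same hierarchy can also be visualized as a dendrogram via `scipy.cluster.hierarchy.dendrogram(Z)` or cut at a different level without re-running the clustering.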