In this chapter, we discuss hierarchical clustering, a widely used technique that generates a complete hierarchy of clustering configurations. The hierarchy starts either from a single cluster containing the whole dataset (the divisive approach) or from as many clusters as there are samples (the agglomerative approach). The method is particularly helpful when the whole grouping process must be analyzed at once, for example, to understand how smaller clusters are merged into larger ones.
In particular, we will discuss the following topics:
- Hierarchical clustering strategies (divisive and agglomerative)
- Distance metrics and linkage methods
- Dendrograms and their interpretation
- Agglomerative clustering
- Cophenetic correlation as a performance measure
- Connectivity constraints
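
To make the idea of a complete hierarchy concrete before going into the details, the following is a minimal sketch of the agglomerative approach in pure Python. It is an illustration only, not the implementation used later in the chapter. The function name `single_linkage_merges`, the one-dimensional toy data, and the choice of single linkage (the distance between two clusters is the smallest distance between any two of their members) are all assumptions made for this example.

```python
from itertools import combinations


def single_linkage_merges(points):
    """Return every level of a naive agglomerative hierarchy on 1-D points.

    Hypothetical helper for illustration: O(n^3) or worse, fine for toy data only.
    """
    # Agglomerative starting point: one cluster per sample.
    clusters = [[p] for p in points]
    history = [sorted(sorted(c) for c in clusters)]

    while len(clusters) > 1:
        # Pick the two clusters with the smallest single-linkage distance,
        # i.e. the minimum distance between any pair of their members.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: min(
                abs(a - b) for a in clusters[ij[0]] for b in clusters[ij[1]]
            ),
        )
        merged = clusters[i] + clusters[j]
        # Replace the two selected clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        history.append(sorted(sorted(c) for c in clusters))

    return history


# Toy dataset (assumed for the example): two tight pairs and one isolated point.
levels = single_linkage_merges([1.0, 1.2, 5.0, 5.1, 9.0])
for level in levels:
    print(len(level), "clusters:", level)
```

Each printed level is one clustering configuration, from five singleton clusters down to a single cluster containing the whole dataset. Reading the levels in reverse order gives the same hierarchy from the divisive point of view, although a real divisive algorithm would compute its splits differently.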