Introducing the hierarchical clustering algorithm
Hierarchical clustering is another unsupervised machine learning algorithm; it groups observations into a hierarchy of nested clusters. It represents this hierarchy as a tree-like structure called a dendrogram, which shows how the objects in a dataset relate to one another. There are typically two ways to construct the dendrogram: agglomerative clustering and divisive clustering. Agglomerative clustering is the more common option and follows a bottom-up approach: each observation starts in its own cluster, and the most similar clusters are merged step by step. Divisive clustering follows a top-down approach: all observations start in one big cluster, which is then split successively. Figure 10.11 shows an example of a dendrogram with the fusions or divisions made at each successive stage:
Figure 10.11 – Hierarchical clustering dendrogram
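To make the bottom-up idea concrete, here is a minimal sketch in plain Python of the sequence of fusions that a dendrogram records. It rests on several assumptions that are not taken from the text: the observations are one-dimensional numbers, the distance between two clusters is the smallest gap between any pair of their members (single linkage), and the function names single_linkage and agglomerate, as well as the toy data, are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def single_linkage(a, b):
    """Distance between two clusters: the smallest gap between any pair of members."""
    return min(abs(x - y) for x in a for y in b)


def agglomerate(points):
    """Repeatedly merge the two closest clusters and return the merges in order."""
    # Bottom-up start: every observation is its own cluster.
    clusters = [(p,) for p in points]
    merges = []
    while len(clusters) > 1:
        # Find the indices of the pair of clusters with the smallest distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        a, b = clusters[i], clusters[j]
        merges.append((a, b, single_linkage(a, b)))
        # Replace the two merged clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [a + b]
    return merges


# Illustrative toy data (an assumption, not from the chapter).
merges = agglomerate([1.0, 2.0, 6.0, 7.0, 15.0])
for left, right, dist in merges:
    print(left, "+", right, "merged at distance", dist)
```

Each entry in merges corresponds to one fusion in the dendrogram, and the recorded distance is the height at which the two branches join. Reading the list in reverse order gives the sequence of splits that a top-down view of the same tree would show. In practice, a library routine would normally replace this brute-force search, which recomputes every pairwise distance at each step.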
Next, we examine the basic steps of agglomerative clustering. To facilitate understanding, we reuse...