Hierarchical Cluster Analysis (HCA)
Hierarchical cluster analysis (HCA) is best suited to cases where the user does not know a priori how many clusters to build. It is therefore common to use HCA as a precursor to other clustering techniques for which a predetermined number of clusters is recommended. HCA works by merging similar observations into clusters and then continuing to merge the clusters that are closest to one another until all observations belong to a single cluster.
HCA measures similarity as the Euclidean distance between observations and creates each link at the distance that separates the two points it joins.
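The merging procedure described above can be illustrated with a minimal sketch in plain Python. The function names (euclidean, cluster_distance, agglomerate) and the sample points are hypothetical, and the rule used to measure the distance between two clusters (single linkage, that is, the closest pair of members) is an assumption, since the text does not specify one.

```python
import math
from itertools import combinations


def euclidean(p, q):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def cluster_distance(c1, c2, points):
    """Distance between two clusters of point indices.

    Assumption: single linkage, i.e. the distance between the
    closest pair of members, one taken from each cluster.
    """
    return min(euclidean(points[i], points[j]) for i in c1 for j in c2)


def agglomerate(points):
    """Repeatedly merge the two closest clusters until one remains.

    Returns the merges in order as (cluster_a, cluster_b, distance).
    """
    # Every observation starts in its own cluster.
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters that are closest together.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(pair[0], pair[1], points),
        )
        d = cluster_distance(a, b, points)
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)  # the merged cluster replaces both
        merges.append((a, b, d))  # the link is recorded at distance d
    return merges


# Hypothetical sample data: two tight pairs that lie far apart.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 7.0)]
merges = agglomerate(points)
for a, b, d in merges:
    print(a, b, round(d, 3))
```

Each recorded distance is the height at which the corresponding link would be drawn. This brute-force search is meant only to show the idea; a library routine would normally be used on real data.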
With the number of features indicated by n, the Euclidean distance is calculated using the formula:
Figure 4.1: The Euclidean distance
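The figure is not reproduced here. As a reference, the standard form of the Euclidean distance between two observations can be written as follows, where the symbols p and q (the two observations) and the index i are notation assumed for this sketch, and n is the number of features as stated above:

```latex
d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}
```

For example, with n = 2, p = (0, 0), and q = (3, 4), the distance is the square root of 9 + 16, which is 5.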
After the distances between observations and clusters have been calculated, the relationships among all observations are displayed using a dendrogram. Dendrograms are tree-like structures displaying horizontal...