k-means versus Hierarchical Clustering
In the previous chapter, we explored the merits of k-means clustering. Now, it is important to explore where hierarchical clustering fits into the picture. As we mentioned in the Linkage section, there is some potential direct overlap when it comes to grouping data points together using centroids. Universal to all of the approaches we've mentioned so far is the use of a distance function to determine similarity. Due to our in-depth exploration in the previous chapter, we used the Euclidean distance here, but we understand that any distance function can be used to determine similarities.
In practice, here are some quick highlights for choosing one clustering method over another:
- Hierarchical clustering benefits from not needing to pass in an explicit "k" number of clusters a priori. This means that you can find all the potential clusters and decide which clusters make the most sense after the algorithm has completed...