Evaluating the clusters
The objective of good-quality clustering is that data points belonging to separate clusters should be clearly distinguishable. This implies the following:
- Data points that belong to the same cluster should be as similar as possible.
- Data points that belong to separate clusters should be as different as possible.
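The two criteria above can be checked numerically. The following is a minimal sketch, not a standard evaluation method: it assumes Euclidean distance as the measure of similarity, and the toy points and the helper names `mean_within_distance` and `mean_between_distance` are hypothetical, made up purely for illustration.

```python
from itertools import combinations, product
from math import dist
from statistics import mean

# Hypothetical toy data: two small 2-D clusters (made up for illustration).
clusters = {
    "A": [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0)],
    "B": [(8.0, 8.0), (8.0, 9.0), (9.0, 8.0)],
}

def mean_within_distance(points):
    """Average Euclidean distance over all pairs of points in one cluster."""
    return mean(dist(p, q) for p, q in combinations(points, 2))

def mean_between_distance(points_a, points_b):
    """Average Euclidean distance over all cross-cluster pairs of points."""
    return mean(dist(p, q) for p, q in product(points_a, points_b))

# Small values mean points in the same cluster are similar.
within = {name: mean_within_distance(pts) for name, pts in clusters.items()}
# A large value means points in separate clusters are different.
between = mean_between_distance(clusters["A"], clusters["B"])

print("mean within-cluster distance:", within)
print("mean between-cluster distance:", between)
```

For a well-formed clustering, the within-cluster distances should be small relative to the between-cluster distance, which is the case for this toy data.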
Clustering results can be evaluated informally by visualizing the clusters and relying on human intuition, but mathematical methods can also quantify the quality of the clusters. These methods measure the tightness of each cluster (cohesion) and the separation between different clusters, which gives a numerical, and hence more objective, way to assess the quality of clustering. Silhouette analysis is one such technique: it is a metric that quantifies the degree of cohesion and separation in clusters, such as those created by the k-means algorithm. While this technique has been mentioned...