In this section, we will look at ways to judge the quality of a clustering scheme. The two approaches we will discuss are elbow analysis and silhouette analysis.
Evaluating the quality of a clustering scheme is not a well-defined problem. In unsupervised learning, there is no ground truth to compare against, so we cannot say that a clustering scheme does a good job relative to known correct labels. Instead, we need to define an objective that a clustering scheme tries to achieve, such as minimizing the squared distances from cluster members to their centroids or maximizing a likelihood function, and then judge the scheme by how well it meets that objective.
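To make the idea of an objective concrete, here is a minimal sketch of the first one mentioned above: the sum of squared distances from cluster members to their cluster's centroid. The function names (`squared_distance`, `centroid`, `within_cluster_ss`) and the four sample points are my own, chosen only for illustration, and the code uses plain Python rather than any particular clustering library.

```python
from typing import Dict, Hashable, List, Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points: List[Point]) -> List[float]:
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def within_cluster_ss(points: List[Point], labels: List[Hashable]) -> float:
    """Sum of squared distances from each point to its cluster's centroid."""
    # Group the points by their assigned cluster label.
    clusters: Dict[Hashable, List[Point]] = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    # Add up each cluster's squared distances to its own centroid.
    total = 0.0
    for members in clusters.values():
        center = centroid(members)
        total += sum(squared_distance(p, center) for p in members)
    return total


# Illustrative data (hypothetical): two tight pairs of points, far apart.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

# Grouping nearby points together gives a small objective value...
good = within_cluster_ss(points, [0, 0, 1, 1])
# ...while grouping distant points together gives a much larger one.
bad = within_cluster_ss(points, [0, 1, 0, 1])

print(good, bad)
```

Under this objective, the first assignment is the better clustering because its value is lower. Note that the comparison only makes sense between schemes with the same number of clusters; adding clusters can only lower this quantity, which is why choosing the number of clusters needs separate treatment.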
In this section, when I discuss clustering evaluation, I'm concerned primarily with deciding between clustering algorithms and choosing the number of clusters. The elbow method is one way to choose the number of clusters to use in a clustering scheme...