Choosing the Number of Clusters
In the previous chapter, we used a predefined number of clusters, but in the real world, we don’t always know how many clusters to expect. There are several ways to estimate an appropriate number of clusters. In this chapter, we will start with two. First, we will learn about simple visual inspection, which is easy and intuitive but relies heavily on individual judgment. We will then learn about the elbow method with the sum of squared errors, which is partially quantitative but still relies on individual judgment and is more abstract than visual inspection. Later in this chapter, we will also learn about the silhouette score, which removes subjectivity from the judgment but is also quite abstract.
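To make the quantity behind the elbow method concrete, here is a minimal, dependency-free sketch that computes the sum of squared errors (SSE) for several candidate values of k. The toy data set, the helper names (`sq_dist`, `init_centers`, `kmeans_sse`), and the deterministic farthest-point initialization are all assumptions chosen for illustration; they are not taken from this book's own examples, and a clustering library would normally do this work for you.

```python
# Toy data (an assumption for illustration): three tight, well-separated groups.
POINTS = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 0), (10, 1), (11, 0), (11, 1),
    (5, 10), (5, 11), (6, 10), (6, 11),
]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def init_centers(points, k):
    """Deterministic farthest-point initialization (a simplification for this sketch)."""
    centers = [points[0]]
    while len(centers) < k:
        # Pick the point whose nearest existing center is farthest away.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans_sse(points, k, max_iter=100):
    """Run a basic k-means loop and return the final sum of squared errors."""
    centers = init_centers(points, k)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    # SSE: squared distance from every point to its nearest center, summed.
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


# Compute the SSE for k = 1 through 6 and print the values one per line.
sse_by_k = {k: kmeans_sse(POINTS, k) for k in range(1, 7)}
for k, sse in sse_by_k.items():
    print(f"k={k}: SSE={sse:.2f}")
```

On this toy data, the SSE falls steeply as k goes from 1 to 3 and changes very little after that. That bend is the "elbow" you would look for in a plot of SSE against k. Real data rarely produces such a clean bend, which is why the method still depends on judgment.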
As we learn about these different methods, there is one overriding principle you should keep in mind: the quantitative measures only tell you how well that number of clusters...