Determining the optimal number of clusters
One popular method is the Elbow method. The idea is to run the K-means algorithm with different values of K (for example, from 1 cluster all the way to 10) and, for each value of K, calculate the sum of squared deviations (SSD): the sum of the squared distances between each data point and the centroid of its assigned cluster, which measures the variance within the clusters. Then, plot a line chart of the SSD values against K. If the line chart looks like an arm, the elbow on the arm marks the best value of K among those tried. The reasoning behind this approach is that SSD tends to decrease as K increases, and a lower SSD or mean squared deviation (MSD, the SSD divided by the number of data points) indicates tighter clusters. However, SSD keeps shrinking as clusters are added, reaching zero when every point is its own cluster, so the lowest value alone cannot be the goal. The elbow marks the point where increasing K starts to yield diminishing returns in SSD.
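The procedure can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a definitive implementation: the synthetic data set, the helper names (sq_dist, init_centroids, kmeans_ssd), and the farthest-point seeding of the centroids are all assumptions made here to keep the example small and reproducible. In practice, a library implementation of K-means would normally be used instead.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def init_centroids(points, k):
    """Farthest-point seeding (an assumption chosen here for reproducibility)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point that is farthest from its nearest chosen centroid.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids

def kmeans_ssd(points, k, max_iter=100):
    """Run a basic K-means and return the sum of squared deviations (SSD)."""
    centroids = init_centroids(points, k)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:  # no centroid moved, so K-means has converged
            break
        centroids = updated
    # SSD: squared distance from every point to its nearest centroid.
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

# Hypothetical data: 30 points scattered around each of three centers.
rng = random.Random(42)
true_centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
points = [
    (rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
    for cx, cy in true_centers
    for _ in range(30)
]

# Run K-means for K = 1..10 and record SSD and MSD for each K.
ssd_by_k = {k: kmeans_ssd(points, k) for k in range(1, 11)}
msd_by_k = {k: ssd / len(points) for k, ssd in ssd_by_k.items()}

for k in range(1, 11):
    print(f"K={k:2d}  SSD={ssd_by_k[k]:10.2f}  MSD={msd_by_k[k]:8.3f}")
```

Because the synthetic data is drawn around three centers, the SSD should fall sharply up to K = 3 and flatten afterwards, which is where the elbow would appear if these values were plotted as a line chart.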
In the following chart, you can see that the MSD value, when charted over different K...