K-means clustering is but one concrete application of a more general algorithm known as expectation-maximization. In short, the algorithm works as follows:
- Start with some random cluster centers.
- Repeat until convergence:
-
- Expectation step: Assign all data points to their nearest cluster center.
- Maximization step: Update the cluster centers by taking the mean of all the points in the cluster.
Here, the expectation step is so named because it involves updating our expectation of which cluster each point in the dataset belongs to. The maximization step is so named because it involves maximizing a fitness function that defines the location of the cluster centers. In the case of k-means, maximization is performed by taking the arithmetic mean of all the data points in a cluster.
This should become clearer with the following figure:
Expectation...