The main clustering algorithm that OpenCV provides is k-means clustering, which searches for a predetermined number, k, of clusters (or groups) in unlabeled multidimensional data.
It does so by relying on two simple assumptions about what an optimal clustering should look like:
- The center of each cluster, also known as the centroid, is the mean of all the points belonging to that cluster.
- Each data point in a cluster is closer to that cluster's center than to any other cluster center.
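
The two assumptions above can be sketched in a few lines of plain Python. This is only an illustration and not how OpenCV implements k-means (in practice you would call `cv2.kmeans`); the toy points, the starting centers, and the helper names `nearest_center` and `centroid` are all hypothetical and chosen for this sketch.

```python
import math

def nearest_center(point, centers):
    """Return the index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def centroid(points):
    """Return the coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

# Hypothetical toy data: two well-separated groups in 2-D.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5),
          (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]
centers = [(0.0, 0.0), (10.0, 10.0)]  # arbitrary starting guesses

# Second assumption: every point belongs to its closest center.
labels = [nearest_center(p, centers) for p in points]

# First assumption: every center is the mean of the points assigned to it.
centers = [centroid([p for p, label in zip(points, labels) if label == k])
           for k in range(len(centers))]

print(labels)
print(centers)
```

A clustering satisfies both assumptions when repeating these two steps no longer changes the labels or the centers. Note that `centroid` assumes every cluster has at least one point, which holds for this toy data but would need handling in general.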
It's easiest to understand the algorithm by looking at a concrete example.