Now, we are going to illustrate how K-means works. It is one of the best-known clustering algorithms and is widely used across many fields. It is popular in both industry and academia due to its simplicity and efficiency. The purpose of this algorithm is to assign each sample in the dataset to a group. Concretely, the task is to partition N samples into K clusters such that each sample belongs to the cluster whose mean is nearest to it.
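To make the partitioning task described above concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions, not a reference implementation: the function names (`assign`, `update`, `loss`, `kmeans`), the seeded random choice of initial means, and the use of the sum of squared Euclidean distances as the loss are all choices made for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(samples, means):
    """Assignment step: index of the nearest mean for every sample."""
    return [
        min(range(len(means)), key=lambda k: squared_distance(s, means[k]))
        for s in samples
    ]


def update(samples, labels, means):
    """Update step: move each mean to the average of its assigned samples."""
    new_means = []
    for k, old_mean in enumerate(means):
        members = [s for s, label in zip(samples, labels) if label == k]
        if not members:
            # Assumption: an empty cluster simply keeps its previous mean.
            new_means.append(old_mean)
            continue
        new_means.append(tuple(sum(col) / len(members) for col in zip(*members)))
    return new_means


def loss(samples, labels, means):
    """Assumed loss: sum of squared distances from samples to their means."""
    return sum(squared_distance(s, means[label]) for s, label in zip(samples, labels))


def kmeans(samples, k, seed=0, max_iter=100):
    """Alternate the two steps until the assignment stops changing."""
    rng = random.Random(seed)
    # Assumption: initial means are k distinct samples picked at random.
    means = [tuple(s) for s in rng.sample(samples, k)]
    labels = assign(samples, means)
    history = [loss(samples, labels, means)]
    for _ in range(max_iter):
        means = update(samples, labels, means)
        new_labels = assign(samples, means)
        history.append(loss(samples, new_labels, means))
        if new_labels == labels:
            break
        labels = new_labels
    return labels, means, history


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    labels, means, history = kmeans(data, k=2)
    print(labels)
    print(means)
    print(history)
```

On this toy dataset of two well-separated groups, the recorded loss values in `history` never increase from one iteration to the next, and the loop stops once no sample changes cluster. With less clearly separated data, the result can depend on the initial means.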
Finding the optimal clustering is NP-hard in general. K-means instead optimizes the loss function iteratively and typically converges to a local minimum quickly. In other words, it tries to reduce the given loss function step by step. In that sense, K-means is very similar to the supervised learning algorithms we introduced previously. The biggest difference is that the loss function can be calculated without any...