Introduction to k-modes Clustering
All the types of clustering that we have studied so far are based on a distance metric. But what if we get a dataset in which it's not possible to measure the distance between observations in the traditional numeric sense, as is the case with categorical variables? In such cases, we use k-modes clustering.
k-modes clustering is an extension of k-means clustering, dealing with modes instead of means. One of the major applications of k-modes clustering is analyzing categorical data such as survey results.
Steps for k-Modes Clustering
In statistics, the mode is defined as the most frequently occurring value. For k-modes clustering, we calculate the mode of the categorical values to choose the cluster centers. The steps to perform k-modes clustering are as follows:
Choose k random points as the initial cluster centers.
Find the Hamming distance (discussed in Chapter 1, Introduction to Clustering Methods) of each point from each center.
Assign each point to a cluster whose center...
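The steps listed so far can be illustrated with a minimal Python sketch. It is not the chapter's implementation. The records, the helper names hamming_distance and column_modes, and the fixed choice of centers are all hypothetical, made up for this illustration. Assigning each point to the center with the smallest Hamming distance is also an assumption here, made by analogy with k-means.

```python
from collections import Counter


def hamming_distance(a, b):
    """Count the attributes on which two equal-length records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


def column_modes(records):
    """Return the most frequent value of each attribute.

    Tie-breaking is left to Counter and is not specified by the text.
    """
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


# Hypothetical survey responses: (colour, size, answer)
records = [
    ("red", "small", "yes"),
    ("red", "large", "yes"),
    ("blue", "large", "no"),
    ("blue", "small", "no"),
]

# The text picks k centers at random; they are fixed here so that the
# example is repeatable (k = 2).
centers = [records[0], records[2]]

# Hamming distance of every point from every center.
distances = [[hamming_distance(r, c) for c in centers] for r in records]

# Assumption: each point joins the cluster of its closest center.
labels = [row.index(min(row)) for row in distances]

print(distances)
print(labels)
```

Because the Hamming distance only counts mismatching attributes, it needs no numeric encoding of the categories. The column_modes helper shows how the mode of each attribute could be computed for a group of records, for example column_modes(records[:2]).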