It is a variation of generic KMeans.
The steps of the algorithm are:
- Initialize by randomly selecting a point, say then compute the centroid w of M and compute:
The centroid is the center of the cluster. A centroid is a vector containing one number for each variable, where each number is the mean of a variable for the observations in that cluster.
- Divide M =[x1, x2, ... xn] into two, sub-clusters ML and MR, according to the following rule:
- Compute the centroids of ML and MR, wL and wR, as in step 2.
- If wL = cL and wR = cR, stop.
Otherwise, let cL= wL cR = wR , go to step 2.