All of the algorithms that were analyzed in Chapter 4, Hierarchical Clustering in Action, belong to the family of hard clustering methods. This means that a given sample is always assigned to a single cluster. On the other hand, soft clustering is aimed at associating each sample, xi with a vector, generally representing the probability that xi belongs to every cluster:
Alternatively, the output can be interpreted as a membership vector:
Formally, there are no differences between the two versions, but normally, the latter is employed when the algorithm is not explicitly based on a probability distribution. However, for our purposes, we always associate c(xi) with a probability. In this way, the reader is incentivized to think about the data-generating process that has been used to obtain the dataset. A clear example is the interpretation of such vectors as the...