All of the algorithms that were analyzed in Chapter 4, Hierarchical Clustering in Action, belong to the family of hard clustering methods. This means that a given sample is always assigned to a single cluster. On the other hand, soft clustering is aimed at associating each sample, xi with a vector, generally representing the probability that xi belongs to every cluster:
![](https://static.packt-cdn.com/products/9781789348279/graphics/assets/48b059fe-49bc-493e-bcfb-7be6d9ac7576.png)
Alternatively, the output can be interpreted as a membership vector:
![](https://static.packt-cdn.com/products/9781789348279/graphics/assets/98a661ea-5050-4611-a22e-ffac81d0c972.png)
Formally, there are no differences between the two versions, but normally, the latter is employed when the algorithm is not explicitly based on a probability distribution. However, for our purposes, we always associate c(xi) with a probability. In this way, the reader is incentivized to think about the data-generating process that has been used to obtain the dataset. A clear example is the interpretation of such vectors as the...