As we explained in Chapter 1, Getting Started with Unsupervised Learning, the main goal of cluster analysis is to group the elements of a dataset according to a similarity measure or a proximity criterion. In the first part of this chapter, we are going to focus on the former approach, while in the second part and in the next chapter, we will analyze more generic methods that exploit other geometric features of the dataset.
Let's take a data generating process pdata(x) and draw N samples from it to build a dataset X = {x1, x2, ..., xN}.
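To make the idea of drawing samples from a data generating process concrete, here is a minimal sketch in Python. It assumes a purely hypothetical pdata(x): a mixture of three isotropic 2D Gaussians. The names `MEANS`, `WEIGHTS`, `STD`, and `sample_pdata` are illustrative choices and are not taken from the text.

```python
import numpy as np

# Hypothetical data generating process: a mixture of three 2D Gaussians.
# Means, weights and spread are illustrative assumptions, not values from the text.
MEANS = np.array([[-5.0, 0.0], [0.0, 5.0], [5.0, 0.0]])
WEIGHTS = np.array([0.5, 0.3, 0.2])
STD = 1.0


def sample_pdata(n_samples, seed=1000):
    """Draw n_samples points from the assumed mixture.

    Returns the samples X with shape (n_samples, 2) and the index k
    of the region each sample was generated from.
    """
    rng = np.random.default_rng(seed)
    # Choose the generating region of each sample according to the mixture weights
    k = rng.choice(len(WEIGHTS), size=n_samples, p=WEIGHTS)
    # Add isotropic Gaussian noise around the mean of the chosen region
    X = MEANS[k] + STD * rng.standard_normal((n_samples, MEANS.shape[1]))
    return X, k


X, k = sample_pdata(1000)
print(X.shape)
# Empirical fraction of samples generated by each region
print(np.bincount(k, minlength=len(WEIGHTS)) / len(k))
```

In a real problem, pdata(x) is unknown and only X is observed. The index k is returned here solely to make the underlying region structure visible.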
We can assume that the probability space of pdata(x) can be partitioned into (potentially infinitely many) configurations, each containing K regions (for K = 1, 2, ...), so that pdata(x; k) represents the probability of a sample belonging to cluster k. In this way, we are stating that every possible clustering structure already exists when pdata(x...