In this chapter, we'll cover clustering. Clustering is often grouped with unsupervised techniques. These techniques assume that we do not know the outcome variable. This leads to ambiguity in outcomes and objectives in practice, but nevertheless, clustering can be useful. As we'll see, we can use clustering to localize our estimates in a supervised setting. This is perhaps why clustering is so effective; it can handle a wide range of situations, and often the results are, for the lack of a better term, sane.
We'll walk through a wide variety of applications in this chapter, from image processing to regression and outlier detection. Clustering is related to classification of categories. You have a finite set of blobs or categories. Unlike classification, you do not know the categories in advance. Additionally, clustering can often be viewed through a...