Introduction to clustering – what, why, and how?
Now let us discuss the various aspects of clustering in greater detail.
What is clustering?
Clustering means partitioning observations into groups (clusters) such that both of the following hold:
Members of the same cluster are highly similar to one another
Members of two different clusters are significantly distinct from, or dissimilar to, one another
Clustering algorithms work by calculating the similarity or dissimilarity between observations and then using that measure to group the observations into clusters.
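To make the idea of dissimilarity concrete, here is a minimal sketch that computes the pairwise Euclidean distance between a few observations. Euclidean distance is only one possible dissimilarity measure and is chosen here as an assumption; the observation values and the helper names `euclidean_distance` and `pairwise_distances` are hypothetical and used purely for illustration.

```python
import math

# Hypothetical observations (monthly income, monthly expense); values are made up.
observations = {
    "A": (2000.0, 1500.0),
    "B": (2100.0, 1600.0),
    "C": (8000.0, 3000.0),
}


def euclidean_distance(p, q):
    """Dissimilarity between two observations as straight-line distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def pairwise_distances(points):
    """Return the distance for every unordered pair of named observations."""
    names = sorted(points)
    return {
        (m, n): euclidean_distance(points[m], points[n])
        for i, m in enumerate(names)
        for n in names[i + 1:]
    }


distances = pairwise_distances(observations)

# The pair with the smallest distance is the most similar pair.
closest_pair = min(distances, key=distances.get)

for pair, dist in distances.items():
    print(pair, round(dist, 2))
print("Most similar pair:", closest_pair)
```

A small distance indicates that two observations are similar and are candidates for the same cluster; a large distance suggests they belong to different clusters. In practice, variables measured on different scales are often standardized before distances are computed, so that no single variable dominates the result.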
How is clustering used?
Let us look at a plot of Monthly Income against Monthly Expense for a group of 400 people. The plot shows visible clusters: the people within a cluster have very similar earnings and expenses, and these differ from the earnings and expenses of people in the other clusters:
In the preceding plot, the visible clusters of the people can be identified based on their income and expense levels, as follows:
1...