Understanding clustering
Clustering is a technique for dividing data into groups (clusters) that are useful and meaningful. The clusters are formed so that they capture the natural structure of the data: items in the same cluster are meaningfully related to each other. Clustering may also be used only as a preparation or summarization step for other algorithms or for further analysis. Cluster analysis plays a role in many fields, such as biology, pattern recognition, and information retrieval.
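To make the idea of grouping by natural structure concrete, here is a minimal sketch in Python. It uses k-means, one common clustering algorithm chosen purely for illustration; the text above does not prescribe a particular method. The function name kmeans, the sample points, and the starting centroids are all made up for this example.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Illustrative k-means: group points around their nearest centroid."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)

        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for members, old in zip(clusters, centroids):
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(old)

        # Stop early once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return clusters, centroids


# Hypothetical data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

# Assumption: one starting centroid is picked by hand from each region.
clusters, centroids = kmeans(points, centroids=[points[0], points[3]])

for members, center in zip(clusters, centroids):
    print(center, members)
```

With these hand-picked starting centroids, the three points near (1, 1) end up in one cluster and the three near (8.5, 8.5) in the other. Real data rarely separates this cleanly, and the result of k-means depends on the number of clusters requested and on where the centroids start.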
Clustering has applications in different fields:
Information retrieval: Segregating information into particular clusters is an important step in searching and retrieving information from numerous sources or from a large pool of data. Take news aggregation websites as an example. They create clusters of similar types of news, making it easier for users to browse the sections that interest them.
These news types can also have sub-classes, creating a hierarchical view. For example, in the sports news section...