Advanced analysis - undirected methods
Data mining and machine learning techniques are divided into two main classes:
- The directed, or supervised approach: You learn from examples whose outcomes are known, then apply what you learned to unknown examples to predict the selected target variable(s)
- The undirected, or unsupervised approach: You discover new patterns inside the dataset as a whole
The most common undirected techniques are clustering, dimensionality reduction, and affinity grouping (also known as basket analysis or association rules). Clustering, for example, means looking through a large number of initially undifferentiated customers to see whether they fall into natural groupings based on the similarity or dissimilarity of their features. This is a pure example of "undirected data mining": the user has no preordained agenda and hopes that the data mining tool will reveal some meaningful structure. Affinity grouping is a special kind of clustering that identifies events or transactions that occur simultaneously...
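To make the clustering idea concrete, here is a minimal sketch of k-means, one common clustering algorithm, written in plain Python. The text above does not name a specific algorithm, so the choice of k-means is an assumption. The customer data, the two features, and the function name kmeans are all hypothetical and chosen only for illustration.

```python
import math


def kmeans(points, k, iterations=20):
    """Naive k-means sketch: returns (centroids, labels)."""
    # Assumption: seed the centroids with the first k points so the
    # sketch is deterministic; real implementations pick seeds more carefully.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # Centroids stopped moving, so the grouping is stable.
        centroids = new_centroids
    return centroids, labels


# Hypothetical customers: (purchases per year, average basket value).
customers = [(2, 15.0), (40, 120.0), (3, 18.0), (38, 110.0), (1, 12.0), (42, 130.0)]
centroids, labels = kmeans(customers, k=2)
print(labels)     # cluster index assigned to each customer
print(centroids)  # center of each discovered group
```

No target variable is supplied anywhere in this sketch: the two groups (occasional low-spend customers and frequent high-spend customers) emerge from the features alone. In practice the features would usually be scaled to comparable ranges first, because the distance calculation otherwise favors whichever feature has the largest numeric values.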