Unsupervised learning methods
In unsupervised learning, the task is to train a model that represents the input data without access to target values (labels). This section introduces the most prominent unsupervised learning methods and gives you an idea of when they are useful in cybersecurity scenarios.
Typically, the learned model in unsupervised learning provides a compressed representation of the input data that abstracts away the noise and uncovers the latent structure in the dataset. For instance, in clustering, the input data is represented by the clusters that have been uncovered by the clustering model. The parameters of the model govern the mapping between the input data and the cluster IDs.
The following figure illustrates the underlying structure of input data. In this case, the data can be naturally grouped into three clusters:
Figure 5.12 – Example of clusters in data
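To make the idea of model parameters mapping input data to cluster IDs more concrete, here is a minimal sketch in plain Python. It uses k-means as one illustrative clustering method, which is our choice for this example and not something the text above prescribes. The toy points, the function names (assign, initial_centroids, fit_centroids), and the farthest-point initialization are all assumptions made for illustration. The learned centroids play the role of the model parameters.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(point, centroids):
    """Map a point to a cluster ID: the index of its nearest centroid."""
    distances = [squared_distance(point, c) for c in centroids]
    return distances.index(min(distances))


def initial_centroids(points, k):
    """Deterministic farthest-point initialization (a simplifying assumption).

    Start from the first point, then repeatedly add the point that is
    farthest from all centroids chosen so far.
    """
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(farthest)
    return centroids


def fit_centroids(points, k, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    centroids = initial_centroids(points, k)
    for _ in range(iterations):
        # Group the points by the cluster ID of their nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            clusters[assign(point, centroids)].append(point)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            tuple(sum(coords) / len(cluster) for coords in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids


# Hypothetical 2D toy data forming three well-separated groups.
points = [
    (0.0, 0.1), (0.2, -0.1), (-0.1, 0.0), (0.1, 0.2),    # group near (0, 0)
    (5.0, 5.1), (4.9, 4.8), (5.2, 5.0), (5.1, 4.9),      # group near (5, 5)
    (10.0, 0.1), (9.8, -0.2), (10.1, 0.0), (9.9, 0.2),   # group near (10, 0)
]

centroids = fit_centroids(points, k=3)            # the learned parameters
labels = [assign(p, centroids) for p in points]   # compressed representation
new_label = assign((5.05, 4.95), centroids)       # cluster ID of an unseen point
```

Each point is now summarized by a single cluster ID, and points from the same group receive the same ID. No target labels were used at any step, and once the centroids are learned, the same mapping can be applied to previously unseen points.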
In the following subsection, we’ll describe...