Anomaly detection is a much-studied branch of machine learning. The term is largely self-explanatory: it refers to a collection of methods for identifying data points that deviate from what is expected. Imagine a bag of apples. Identifying and picking out the bad apples would be an act of anomaly detection.
Anomaly detection is performed in several ways:
- By identifying data samples that differ markedly from the rest of the dataset, for example by checking each column against its expected minimum-maximum range
- By plotting the data as a line graph and identifying sudden spikes in the graph
- By fitting a Gaussian (normal) curve to the data and marking the points that lie in the extreme tails as outliers (anomalies)
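The three approaches listed above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library, not a production detector: the function names, the sample `readings` (imagined here as hourly login counts), and every threshold are assumptions chosen for the example.

```python
from statistics import mean, stdev


def outside_range(values, low, high):
    """Range check: indices of values outside the expected [low, high] range."""
    return [i for i, v in enumerate(values) if v < low or v > high]


def sudden_spikes(values, max_jump):
    """Spike check: indices where the value changes by more than max_jump
    compared with the previous sample (what a line graph shows as a spike)."""
    return [
        i for i in range(1, len(values))
        if abs(values[i] - values[i - 1]) > max_jump
    ]


def gaussian_outliers(values, z_threshold=3.0):
    """Gaussian check: indices of values whose z-score (distance from the
    mean in standard deviations) exceeds z_threshold."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing lies in the tails
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > z_threshold]


# Hypothetical data: logins per hour, with one suspicious burst at index 6.
readings = [12, 14, 13, 15, 11, 13, 95, 14, 12, 13]

print(outside_range(readings, low=0, high=50))      # [6]
print(sudden_spikes(readings, max_jump=30))         # [6, 7]
# With only 10 samples a single point cannot reach a z-score of 3,
# so a lower (assumed) threshold is used here.
print(gaussian_outliers(readings, z_threshold=2.5)) # [6]
```

Note that the spike check reports both the jump up to the burst and the drop back down, and that the outlier itself inflates the mean and standard deviation used by the Gaussian check, which is why the threshold has to be chosen with the sample size in mind.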
Commonly used methods include support vector machines, Bayesian networks, and k-nearest neighbors. In this section, we will focus on anomaly detection as it relates to security...