Detecting anomalies in data is a recurring theme in machine learning. In Chapter 10, Imbalanced Learning – Not Even 1% Win the Lottery, we learned how to spot these interesting minorities in our data. Back then, the data was labeled, and the classification algorithms from the previous chapters were apt for the problem. Besides such labeled anomaly detection problems, there are cases where the data is unlabeled.
In this chapter, we are going to learn how to identify outliers in our data, even when no labels are provided. We will use three different algorithms and learn about the two branches of unlabeled anomaly detection: outlier detection and novelty detection. Here are the topics that will be covered in this chapter:
- Unlabeled anomaly detection
- Detecting anomalies using basic statistics
- Detecting outliers using EllipticEnvelope
- Outlier and novelty detection...