Detecting and Analyzing Anomalies
The short definition of an anomaly is something that you don’t expect: something strange, out of the ordinary, or simply a deviation from the norm. When reviewing data, you don’t expect to see values outside a specific numeric range. Such values are often called outliers because they lie outside the expected range. However, anomalies occur in all sorts of ways, many of which don’t fall into the category of outliers. For example, the data may simply not meet formatting requirements, or it may appear inconsistently, as with state names that are correct but presented in different ways.
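To make these three kinds of anomaly concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the sample records, the valid age range, the five-digit ZIP code format, and the state alias table are all hypothetical and would be replaced by your own data and rules.

```python
import re

# Hypothetical sample records; field names and values are illustrative only.
records = [
    {"state": "Wisconsin", "zip": "54601", "age": 34},
    {"state": "WI", "zip": "54601", "age": 41},
    {"state": "Wis.", "zip": "5460", "age": 29},    # ZIP code is too short
    {"state": "Ohio", "zip": "43004", "age": 212},  # age is implausible
]

# Assumed rules for this sketch; real projects define their own.
AGE_RANGE = (0, 120)                 # expected numeric range
ZIP_PATTERN = re.compile(r"\d{5}")   # expected format: exactly five digits
STATE_ALIASES = {                    # spellings mapped to one canonical code
    "wisconsin": "WI", "wi": "WI", "wis.": "WI",
    "ohio": "OH", "oh": "OH",
}


def find_anomalies(rows):
    """Return (out-of-range row indexes, badly formatted row indexes,
    canonical states that appear under more than one spelling)."""
    low, high = AGE_RANGE

    # Outliers: values outside the expected numeric range.
    out_of_range = [i for i, r in enumerate(rows)
                    if not low <= r["age"] <= high]

    # Formatting anomalies: values that don't match the required pattern.
    bad_format = [i for i, r in enumerate(rows)
                  if not ZIP_PATTERN.fullmatch(r["zip"])]

    # Inconsistency: correct values presented in different ways.
    spellings = {}
    for r in rows:
        key = r["state"].strip().lower()
        canonical = STATE_ALIASES.get(key, r["state"])
        spellings.setdefault(canonical, set()).add(r["state"])
    inconsistent = {state: sorted(seen)
                    for state, seen in spellings.items() if len(seen) > 1}

    return out_of_range, bad_format, inconsistent


out_of_range, bad_format, inconsistent = find_anomalies(records)
print("Out of range:", out_of_range)
print("Bad format:", bad_format)
print("Inconsistent:", inconsistent)
```

Notice that only the first check finds an outlier in the usual sense. The other two checks flag records whose values are plausible but either malformed or presented inconsistently, which is why a simple range test isn’t enough to catch every anomaly.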
Some people actually enjoy seeking anomalies, finding them amusing or at least interesting. Whatever your view, anomalies occur all the time, and although they may appear harmless, they have the potential to affect your business in various ways. The goal of this chapter is to help you discover what anomalies are with regard to ML, how to determine what...