Chapter 7: Analyzing Outliers for Data Anomalies
In this chapter, we will use the K-means grouping function to find outliers in three of the most used datasets on Kaggle: credit card fraud detection, suspicious logins, and insurance claim amounts.
2D and 3D charts help us spot the outliers that could point to fraud in credit card transactions, security breaches in login attempts, and the special cases that demand more money from insurance companies.
The methodology of this chapter is to first chart the variables in 2D and 3D to get familiar with the data, spot out-of-the-ordinary behavior, and estimate the possible number of groups. Then, we'll use pivot charts as a business intelligence tool to classify the ranges of the groups and identify the groups and variables that lead to outliers.
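To make the idea of grouping and then looking for outliers concrete outside of Excel, here is a minimal Python sketch. It is not the Excel add-in used in this chapter: the sample points are made up for illustration (they do not come from the Kaggle datasets), and the choice of three groups, the deterministic farthest-point starting centroids, and the `min_share` cutoff of 20% are all assumptions. The sketch treats any group that holds a very small share of the points as a set of outlier candidates.

```python
import math
from collections import Counter


def farthest_point_init(points, k):
    """Pick k starting centroids deterministically (an assumption of this
    sketch): the first point, then repeatedly the point farthest from all
    centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        for p in points
    ]


def kmeans(points, k, max_iter=100):
    """Plain K-means: alternate assignment and centroid updates until the
    centroids stop moving or max_iter is reached."""
    centroids = farthest_point_init(points, k)
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            # Keep the old centroid if a group ends up empty.
            updated.append(
                tuple(sum(axis) / len(members) for axis in zip(*members))
                if members
                else centroids[i]
            )
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


def small_group_outliers(points, labels, min_share=0.2):
    """Return the points that sit in groups holding less than min_share
    of all points (a hypothetical cutoff chosen for this example)."""
    sizes = Counter(labels)
    return [
        p for p, lab in zip(points, labels)
        if sizes[lab] / len(points) < min_share
    ]


# Made-up 2D data: two tight groups plus one isolated point.
points = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2),
    (8.0, 8.0), (8.2, 7.9), (7.9, 8.1), (8.1, 8.2),
    (4.5, 15.0),
]

centroids, labels = kmeans(points, k=3)
outliers = small_group_outliers(points, labels)
print(outliers)
```

In practice, the number of groups would come from inspecting the 2D and 3D charts first, and the cutoff for what counts as a small group depends on the dataset.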
With these practical datasets, we will gain experience in applying the Excel K-means add-in to other real data. This...