Summary
In this chapter, we learned that K-means helps us classify large amounts of data that would be very difficult to classify by simple inspection or with business intelligence tools alone.
We discussed applications of segment groups, both in business sales and, in human resources, in researching the diseases that cause days of absence. We learned about K-means terms such as the centroid, which is the average of a group. The optimal group classification is compact, with a small standard deviation within each group. The K-means elbow chart helps indicate the optimal number of groups.
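As a reminder of how the centroid and the elbow chart relate, here is a minimal Python sketch. It is not the K-means function used in this chapter: the data is made up, the names `kmeans_1d` and `wcss` are hypothetical, and the evenly spaced starting centroids are an assumption chosen only to keep the result repeatable. For each number of groups k, it computes the total squared distance of the points to their centroids, which is the quantity an elbow chart plots.

```python
from statistics import mean


def kmeans_1d(values, k, iterations=20):
    """Tiny one-variable K-means sketch; returns (centroids, groups)."""
    ordered = sorted(values)
    # Assumption: start from k evenly spaced sorted values (deterministic).
    if k == 1:
        centroids = [ordered[len(ordered) // 2]]
    else:
        centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    groups = []
    for _ in range(iterations):
        # Assign every value to its nearest centroid.
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(v)
        # Move each centroid to the average of its group (keep it if empty).
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids, groups


def wcss(centroids, groups):
    """Within-group sum of squared distances: small means compact groups."""
    return sum((v - c) ** 2 for c, g in zip(centroids, groups) for v in g)


# Made-up data with three visibly separate groups.
data = [1.0, 1.5, 2.0, 10.0, 10.5, 11.0, 20.0, 20.5, 21.0]

wcss_by_k = {}
for k in range(1, 5):
    centroids, groups = kmeans_1d(data, k)
    wcss_by_k[k] = wcss(centroids, groups)
    print(k, round(wcss_by_k[k], 3))
```

On this made-up data, the total falls sharply up to three groups and barely improves with a fourth, and that bend is the "elbow" that suggests the number of groups.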
The K-means function performs group classification for one or more variables. It is very difficult to visualize the probable classification of four or more variables because the data cannot be plotted in a single chart. We also learned about outliers: points that behave differently from the rest of the groups and could signal fraud or system problems in the near future. In the next chapter, we will learn how to calculate groups...