In this chapter, we studied the difference between supervised and unsupervised learning and looked at the situations in which unsupervised learning is applied. We examined exploratory analysis, an application of unsupervised learning in which clustering approaches are used. We then covered the k-means and hierarchical clustering approaches in detail, with examples of how they are applied.
We also looked at how clustering approaches can be implemented on Apache Spark running on AWS clusters. In our experience, clustering tasks are generally performed on larger datasets, so it is important to take the setup of the compute cluster into account for such tasks. We discussed these nuances in this chapter.
As data scientists, we often analyze data with the sole purpose of extracting value from it. You should consider clustering approaches in these cases...