One of the most widely used unsupervised learning methods is clustering. Clustering aims to uncover structure in unlabeled data. The aim is to group together data instances, such that there is great similarity between instances of the same cluster, and little similarity between instances of different clusters. As with supervised learning methods, clustering can benefit from combining many base learners. In this chapter, we present k-means; a simple and widely used clustering algorithm. Furthermore, we discuss how ensembles can be used to improve the algorithm's performance. Finally, we use OpenEnsembles, a scikit-learn compatible Python library that implements ensemble clustering. The main topics covered in this chapter are as follows:
- How the K-means algorithm works
- Its strengths and weaknesses
- How ensembles can improve its performance
- Utilizing OpenEnsembles...