Clustering
As you've probably realized by now, in data science there is almost always more than one way to attack a problem. At the algorithmic level, we usually have several options, and which one works best depends on the particularities of the data and on the specific problem we're trying to solve. A wealth of choices is usually good news, since some algorithms produce better results than others on a given dataset. Clustering is no exception: a few well-known algorithms are available, but we must understand their strengths and limitations to avoid ending up with irrelevant clusters.
Scikit-learn, the famous Python machine learning library, drives the point home by using a few toy datasets. The datasets produce easily recognizable plots, so a human can identify the clusters at a glance. However, applying unsupervised learning algorithms will lead to strikingly different results, some of them in clear contradiction of what our human pattern recognition abilities...
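To make the point concrete, here is a minimal sketch of such a comparison, assuming scikit-learn is installed. The specific choices are ours for illustration and do not come from the text: the two-moons toy dataset (make_moons), the two algorithms (k-means and DBSCAN), the parameter values, and the helper name compare_on_moons are all assumptions. Each algorithm's labels are scored against the generating labels with the adjusted Rand index, where 1.0 means perfect agreement.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler


def compare_on_moons(n_samples=300, noise=0.05, seed=0):
    """Cluster the two-moons toy dataset with two algorithms and score each.

    Hypothetical helper for illustration; all parameter values are assumptions.
    """
    # Two interleaving half circles: easy for a human to separate by eye.
    X, y_true = make_moons(n_samples=n_samples, noise=noise, random_state=seed)
    # Standardize the features so the distance threshold below is comparable
    # across both axes.
    X = StandardScaler().fit_transform(X)

    # k-means assigns each point to the nearest of two centroids.
    kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=seed).fit_predict(X)
    # DBSCAN grows clusters through densely connected neighborhoods.
    dbscan_labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(X)

    # Agreement between each clustering and the labels used to generate the data.
    return {
        "KMeans": adjusted_rand_score(y_true, kmeans_labels),
        "DBSCAN": adjusted_rand_score(y_true, dbscan_labels),
    }


scores = compare_on_moons()
for name, score in scores.items():
    print(f"{name}: adjusted Rand index = {score:.2f}")
```

With these settings, the density-based algorithm should follow the curved shapes much more closely than k-means, which tends to cut the moons with a straight boundary. Other toy datasets or other parameter values can easily reverse the ranking, which is exactly why the strengths and limitations of each algorithm matter.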