In the previous chapters, we saw several examples of supervised learning, covering both classification and regression. We applied supervised learning techniques to structured, labeled data. However, as we mentioned previously, with the rise of cloud computing, IoT, and social media, the volume of unstructured data is growing at an unprecedented rate. Collectively, more than 80% of the data generated by these sources is unstructured, and most of it is unlabeled.
Two unsupervised learning techniques, clustering analysis and dimensionality reduction, are key tools in data-driven research and industry settings for finding hidden structures in unstructured datasets. Many clustering algorithms have been proposed for this purpose, such as k-means, bisecting k-means, and the Gaussian mixture model. However, these algorithms cannot perform with high...