Before ending this chapter, I want to introduce the reader to a very powerful algorithm called t-Distributed Stochastic Neighbor Embedding (t-SNE), which can be employed to visualize high-dimensional dataset also in 2D plots. In fact, one the hardest problems that every data scientist has to face is to understand the structure of a complex dataset without the support of graphs. This algorithm has been proposed by Van der Maaten and Hinton (in Visualizing High-Dimensional Data Using t-SNE, Van der Maaten L.J.P., Hinton G.E., Journal of Machine Learning Research 9 (Nov), 2008), and can be used to reduce the dimensionality trying to preserve the internal relationships. A complete discussion is beyond the scope of this book (but the reader can check out the aforementioned paper and Mastering Machine Learning Algorithms, Bonaccorso...




















































