Visualizing your Word2Vec model with t-SNE
Before we can plot high-dimensional vectors in 2D, we have to reduce their dimensions. The most popular dimension reduction technique is probably Principal Component Analysis (PCA). However, PCA has limitations, as I have outlined in the article Dimension reduction with Python (https://towardsdatascience.com/dimension-reduction-techniques-with-python-f36ca7009e5c) [3]. In this section, I will give a brief introduction to t-SNE and use it to visualize our model.
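To make the goal concrete, here is a minimal sketch of reducing word vectors to 2D coordinates with t-SNE. It assumes scikit-learn's TSNE class and uses random vectors as a hypothetical stand-in for the vectors of a trained Word2Vec model; the word labels, the vector size of 100, and the perplexity value are illustrative choices, not settings taken from our model.

```python
import numpy as np
from sklearn.manifold import TSNE

# Hypothetical stand-in for Word2Vec output: 50 words, 100 dimensions each.
# With a real model, these rows would be the learned word vectors.
rng = np.random.default_rng(0)
words = [f"word_{i}" for i in range(50)]
vectors = rng.normal(size=(len(words), 100))

# Reduce to 2 dimensions. The perplexity must be smaller than the number
# of samples; 10 is an illustrative value, not a recommendation.
tsne = TSNE(n_components=2, perplexity=10, init="pca", random_state=0)
coords = tsne.fit_transform(vectors)

# Each word now has an (x, y) position that can be drawn on a scatter plot.
for word, (x, y) in list(zip(words, coords))[:5]:
    print(f"{word}: ({x:.2f}, {y:.2f})")
```

Because the input here is random noise, the resulting layout carries no meaning. With real word vectors, words used in similar contexts would be expected to land near each other.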
t-SNE stands for t-distributed Stochastic Neighbor Embedding. It was introduced by Laurens van der Maaten and Geoffrey Hinton [4]. It is a dimensionality reduction technique used for visualizing high-dimensional data in a low-dimensional space. It aims to preserve the local structure of the data, so that points that are close together in the original space stay close together in the plot, while also revealing some of the global structure, such as clusters. Let’s look at the Swiss roll in Figure 7.13. If we simply collapse the roll...