Introduction
So far, we have described a number of methods for reducing the dimensionality of a dataset, whether to clean the data, to reduce its size for computational efficiency, or to extract the most important information it contains. Although we have demonstrated many such methods, in many cases we cannot reduce a high-dimensional dataset to a number of dimensions that can be visualized, that is, two or three, without excessively degrading the quality of the data. Consider the MNIST dataset that we used earlier in this book, a collection of digitized handwritten digits from 0 through 9. Each image is 28 x 28 pixels in size, giving 784 individual dimensions or features. If we were to reduce these 784 dimensions to 2 or 3 for visualization purposes, we would lose almost all of the available information.
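To make the 784 figure concrete, the following minimal sketch flattens a single 28 x 28 image into one feature vector. The image here is a hypothetical placeholder filled with zeros rather than real MNIST data, and the names `image` and `features` are chosen only for illustration.

```python
# Hypothetical stand-in for one MNIST image: a 28 x 28 grid of pixel values.
# The zeros are placeholders, not real MNIST data.
HEIGHT, WIDTH = 28, 28
image = [[0 for _ in range(WIDTH)] for _ in range(HEIGHT)]

# Flatten row by row so that each pixel becomes one feature of the sample.
features = [pixel for row in image for pixel in row]

n_features = len(features)
print(n_features)  # 784
```

Each of the 784 entries is a separate coordinate of the sample, so plotting an image as a point in two or three dimensions means replacing all of them with just 2 or 3 numbers.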
In this chapter, we will discuss SNE and t-SNE as means of visualizing high-dimensional...