Stochastic Neighbor Embedding (SNE)
Stochastic Neighbor Embedding (SNE) is one of a number of different methods that fall within the category of manifold learning, which aims to describe high-dimensional spaces within low-dimensional manifolds or bounded areas. At first thought, this seems like an impossible task; how can we reasonably represent data in two dimensions if we have a dataset with at least 30 features? As we work through the derivation of SNE, it is hoped that you will see how it is possible. Don't worry, we will not be covering the mathematical details of this process in great depth as it is outside of the scope of this chapter. Constructing an SNE can be divided into the following steps:
Convert the distances between datapoints in the high-dimensional space to conditional probabilities. Say we had two points, and , in high-dimensional space, and we wanted to determine the probability () that would be picked as a neighbor of . To define this probability, we use a Gaussian...