PCA is commonly used to reduce the dimensionality of features in structured data. However, because it is an unsupervised method, it is not aware of the target labels or of application-specific metrics. Note also that traditional methods such as PCA often don't work as expected on more complicated data, such as natural language.
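To make the "unsupervised" point concrete, here is a minimal sketch of PCA written with NumPy's SVD. The function name `pca_project` and the random data are hypothetical and used only for illustration; the thing to notice is that the labels `y` never enter the computation.

```python
import numpy as np

def pca_project(X, n_components):
    """Project the rows of X onto the top principal components.

    Illustrative helper (not a library API). It takes only the
    features X, so no label information can influence the result.
    """
    # Center each feature so the SVD captures variance around the mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T

# Synthetic data, assumed purely for this example.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))       # 100 samples, 5 features
y = rng.integers(0, 2, size=100)    # labels exist, but PCA ignores them

Z = pca_project(X, n_components=2)  # reduced to 2 dimensions
```

Whatever values `y` holds, `Z` is the same, which is why the projection may or may not keep the directions that matter for the prediction task.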
In this section, we are going to introduce a technique that embeds high-dimensional data, such as natural language or audio, into a fixed-dimensional space while preserving some of its semantics for downstream processing.