A typical image comprises thousands of pixels, a body of text contains thousands of unique words, and a company may have millions of distinct customers. Given this, all three (users, text, and images) would have to be represented as vectors in a space with thousands, or even millions, of dimensions. The drawback of representing entities in such a high-dimensional space is that we will not be able to calculate the similarity of vectors efficiently.
Representing an image, text, or user in a lower dimension helps us group entities that are very similar. Encoding is an unsupervised learning technique that represents an input in a lower dimension with minimal loss of information, while retaining the information about which inputs are similar to one another.
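To make the idea of a lower-dimensional representation concrete, the following is a minimal sketch and not an autoencoder. It uses PCA (computed with NumPy's SVD) as a linear stand-in for a trained encoder and decoder. The data is synthetic and purely an assumption for illustration: 200 hypothetical "images" of 28 x 28 pixels, generated from 10 hidden patterns plus a little noise. The names `encode`, `decode`, and `cosine_similarity` are illustrative helpers, not functions from any library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 "images" of 28 x 28 pixels, flattened to 784 values each.
# Each image mixes 10 hidden patterns plus small noise, so a compact code exists.
n_images, n_pixels, n_patterns, code_size = 200, 28 * 28, 10, 10
patterns = rng.normal(size=(n_patterns, n_pixels))
weights = rng.normal(size=(n_images, n_patterns))
images = weights @ patterns + 0.01 * rng.normal(size=(n_images, n_pixels))

# PCA via SVD: the top right-singular vectors span the directions of most variance.
# This is a linear stand-in for an encoder, not a trained neural network.
mean = images.mean(axis=0)
_, _, vt = np.linalg.svd(images - mean, full_matrices=False)
components = vt[:code_size]  # shape: (code_size, n_pixels)


def encode(x):
    """Project 784-dimensional inputs onto the code_size-dimensional code."""
    return (x - mean) @ components.T


def decode(z):
    """Map codes back to the original 784-dimensional pixel space."""
    return z @ components + mean


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


codes = encode(images)
reconstructed = decode(codes)

# Fraction of the data lost by squeezing 784 values into code_size values.
relative_error = np.linalg.norm(images - reconstructed) / np.linalg.norm(images)

print("original shape:", images.shape)
print("encoded shape: ", codes.shape)
print(f"relative reconstruction error: {relative_error:.4f}")
print("similarity of first two codes:", cosine_similarity(codes[0], codes[1]))
```

Similarity is computed here on vectors of length 10 rather than 784, which is the efficiency gain described above. The reconstruction error is small only because the synthetic data was built to have low-dimensional structure; real images generally need a nonlinear encoder to compress this well.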
In this chapter, we will learn about the following:
- Encoding an image to a much lower dimension
- Vanilla autoencoder...