In the previous section, you technically saw a sequence-to-vector model, which took a sequence (of numbers representing words) and mapped to a vector (of one dimension corresponding to a movie review). However, to appreciate these models further, we will move back to MNIST as the source of input to build a model that will take one MNIST numeral and map it to a latent vector.
Unsupervised model
Let's work in the autoencoder architecture shown in the following diagram. We have studied autoencoders before and now we will use them again since we learned that they are powerful in finding vectorial representations (latent spaces) that are robust and driven by unsupervised learning:
The goal here is to take an image and find its latent representation, which, in the example of Figure 13.10, would be two dimensions. However, you might be wondering: how can an image be a sequence?
We can interpret an image...