Creating separable encodings of images
In Figure 5.1, you can see an example of images from the CIFAR-10 dataset, along with an example of an early VAE algorithm that can generate fuzzy versions of these images based on a random number input:
Figure 5.1: CIFAR-10 sample (left), VAE (right)2
More recent work on VAE networks has allowed these models to generate much better images, as you will see later in this chapter. To start, let's revisit the problem of generating MNIST digits and how we can extend this approach to more complex data.
Recall from Chapter 1, An Introduction to Generative AI: "Drawing" Data from Models and Chapter 4, Teaching Networks to Generate Digits that the RBM (or DBN) model in essence involves learning the posterior probability distribution for images (x) given some latent "code" (z), represented by the hidden layer(s) of the network, the "marginal likelihood" of x:3
We can see z as being an ...