Summary
Stable Diffusion has pushed past the boundaries of classical AI imagery. Introducing creative freedom (“noise”) through diffusion in a latent space has opened the door to a wide range of generative computer vision possibilities.
We began the chapter by walking through the Stable Diffusion process, first as a thought experiment and then with the Keras implementation. We covered encoding a contextualized text input, introducing a “noisy” (open to creativity) image patch, applying diffusion to this patch while downsampling it to a lower dimension, and then upsampling it to a 512x512 image. The output was astonishing for such compact source code.
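As a reminder of how those stages fit together, here is a minimal toy sketch in NumPy. It is not the Keras implementation: the function names (encode_text, downsample, upsample, toy_denoise_step, toy_pipeline) are hypothetical, the "text encoder" is a seeded random vector, and the "denoiser" simply pulls values toward a context-dependent target instead of running a trained U-Net. Only the 512x512 output size comes from the chapter; the 64x64 starting patch and the other sizes are assumptions chosen for illustration.

```python
import numpy as np


def encode_text(prompt, dim=8):
    # Hypothetical stand-in for a text encoder: a deterministic
    # vector derived from the characters of the prompt.
    seed = sum(ord(c) for c in prompt)
    return np.random.default_rng(seed).standard_normal(dim)


def downsample(x):
    # 2x average pooling: (H, W) -> (H/2, W/2).
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(x, factor):
    # Nearest-neighbor upsampling: each value becomes a factor x factor block.
    return np.kron(x, np.ones((factor, factor)))


def toy_denoise_step(latent, context, strength=0.1):
    # Toy "denoiser" (an assumption, not a trained model): move every
    # value a small step toward a target derived from the text context.
    target = np.tanh(context.mean())
    return latent + strength * (target - latent)


def toy_pipeline(prompt, steps=50, seed=0):
    context = encode_text(prompt)                 # 1. encode the text
    rng = np.random.default_rng(seed)
    latent = rng.standard_normal((64, 64))        # 2. start from a noisy patch
    for _ in range(steps):                        # 3. iterative diffusion steps
        small = downsample(latent)                #    reduce to 32x32
        small = toy_denoise_step(small, context)  #    remove a little noise
        latent = upsample(small, 2)               #    back to 64x64
    return upsample(latent, 8)                    # 4. upsample to 512x512


image = toy_pipeline("a cat wearing a hat")
print(image.shape)
```

The loop mirrors the summary only in shape: noise goes in, each step removes a little of it under the guidance of the text context, and the final upsampling produces the full-size patch.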
We then ran a Stability AI text-to-image notebook that also generated surprising images. Once again, we saw that diffusion takes us to levels we would never have imagined, including divergent association tasks.
Stability AI also provided a text-to-animation API to transform one...