Using Stable Diffusion XL
Stable Diffusion XL (SDXL) is a model from Stability AI. Unlike previous Stable Diffusion models, SDXL is designed as a two-stage model: a base model generates an image, and a second, optional refiner model can then refine it, as shown in Figure 6.1:
Figure 6.1: SDXL, a two-model pipeline
Figure 6.1 shows that, to get the best image quality from SDXL, we use the base model to generate a raw image, output as a 128x128 latent, and then use the refiner model to enhance it.
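The 128x128 figure follows from simple arithmetic. The sketch below assumes SDXL's default 1024x1024 output size, a VAE that downsamples each side by a factor of 8, and 4 latent channels. The helper name latent_shape is hypothetical and used only for illustration:

```python
def latent_shape(height, width, vae_scale_factor=8, latent_channels=4):
    """Return the (channels, height, width) of the latent for a given image size.

    The defaults (scale factor 8, 4 channels) are assumptions that match the
    standard Stable Diffusion / SDXL VAE configuration.
    """
    if height % vae_scale_factor or width % vae_scale_factor:
        raise ValueError("image sides must be divisible by the VAE scale factor")
    return (
        latent_channels,
        height // vae_scale_factor,
        width // vae_scale_factor,
    )


# A 1024x1024 image corresponds to a 128x128 latent with 4 channels.
print(latent_shape(1024, 1024))  # (4, 128, 128)
```

The refiner works on this latent rather than on decoded pixels, which is why the base model can hand its output over without a VAE decode step in between.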
Before trying out the SDXL model, please ensure that you have at least 15 GB of VRAM; otherwise, you may see a CUDA out of memory error right before the refiner model outputs the image. You can also use the optimization methods from Chapter 5 to build a custom pipeline that moves the model out of VRAM whenever possible.
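As a rough illustration of the two-stage flow under a tight VRAM budget, here is a minimal sketch using the diffusers library. It is not the custom pipeline from Chapter 5: it substitutes diffusers' built-in enable_model_cpu_offload() (which requires the accelerate package) as a simpler stand-in. The Hugging Face Hub model IDs, the function name generate_with_refiner, and the prompt are assumptions chosen for illustration:

```python
# Assumed Hugging Face Hub IDs for the SDXL base and refiner checkpoints.
BASE_ID = "stabilityai/stable-diffusion-xl-base-1.0"
REFINER_ID = "stabilityai/stable-diffusion-xl-refiner-1.0"


def generate_with_refiner(prompt, steps=30):
    """Run the base model, then the optional refiner, and return a PIL image."""
    # Imported here so that loading this file does not itself require
    # torch or diffusers to be installed.
    import torch
    from diffusers import (
        StableDiffusionXLImg2ImgPipeline,
        StableDiffusionXLPipeline,
    )

    base = StableDiffusionXLPipeline.from_pretrained(
        BASE_ID, torch_dtype=torch.float16, variant="fp16", use_safetensors=True
    )
    # Reuse the base model's second text encoder and VAE to save memory.
    refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
        REFINER_ID,
        text_encoder_2=base.text_encoder_2,
        vae=base.vae,
        torch_dtype=torch.float16,
        variant="fp16",
        use_safetensors=True,
    )

    # Keep each sub-model on the CPU until it is needed, instead of
    # holding both pipelines in VRAM at once.
    base.enable_model_cpu_offload()
    refiner.enable_model_cpu_offload()

    # Stage 1: the base model produces a latent rather than a decoded image.
    latents = base(
        prompt=prompt, num_inference_steps=steps, output_type="latent"
    ).images
    # Stage 2: the refiner enhances that latent and decodes the final image.
    return refiner(
        prompt=prompt, image=latents, num_inference_steps=steps
    ).images[0]


if __name__ == "__main__":
    image = generate_with_refiner("a photo of an astronaut riding a horse")
    image.save("sdxl_refined.png")
```

Skipping the refiner is just a matter of calling the base pipeline with its default output type and using the returned image directly, since the refiner stage is optional.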
Here are the steps to load up an SDXL model:
-
...