Using the Stable Diffusion model to generate images from text
The diffusers library provides several pre-trained diffusion-based text-to-image generation models. One such model is Stable Diffusion v1.5. In this section, we'll use this model to generate a high-quality image with a few lines of code. All the code for this section is available on GitHub [17].
First, we load the Stable Diffusion model with the following lines of code:
from diffusers import AutoPipelineForText2Image
import torch

# Load the Stable Diffusion v1.5 checkpoint in half precision
pipeline = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, variant="fp16"
)
# Move the pipeline to the GPU for fast inference
pipeline = pipeline.to("cuda")
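Passing torch_dtype=torch.float16 together with variant="fp16" downloads the half-precision copy of the weights, which roughly halves the memory footprint with little visible loss in image quality. If no CUDA GPU is available, one workable fallback (a sketch, assuming you can accept much slower inference) is to load the full-precision weights and run on the CPU:
# CPU fallback sketch: full precision, since half precision is poorly
# supported on CPU; expect generation to take minutes rather than seconds
pipeline = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5"
)
pipeline = pipeline.to("cpu")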
The result is a denoising diffusion text-to-image pipeline. You can access the underlying conditional UNet model, which predicts the noise to remove at each denoising step, with the following line of code:
pipeline.unet
This should produce the following output:
UNet2DConditionModel(
(conv_in): Conv2d(4, 320, kernel_size=(3, 3), stride...
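With the pipeline loaded, generating an image takes a single call. The snippet below is a minimal sketch; the prompt, the fixed seed, and the output filename are illustrative choices rather than anything prescribed by the library:
# Fix the random seed so the result is reproducible (illustrative choice)
generator = torch.Generator("cuda").manual_seed(42)

# The prompt is an example; any descriptive text works
prompt = "a photograph of an astronaut riding a horse on the moon"
image = pipeline(prompt, generator=generator).images[0]

# The pipeline returns PIL images, so the result can be saved directly
image.save("astronaut.png")
The call runs the full denoising loop (50 steps by default for this pipeline) and decodes the final latent into a PIL image.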