SDXL Turbo
Much like Stable Diffusion, SDXL (Stable Diffusion Extra Large) is a model trained to return HD images with dimensions of 1,024 x 1,024. Due to its large size, as well as the number of denoising steps it performs, SDXL takes considerable time to generate an image, because the image is refined progressively over many time steps. How do we reduce the time it takes to generate images while maintaining the consistency of images? SDXL Turbo comes to the rescue here.
Architecture
SDXL Turbo is trained by performing the following steps:
- Sample an image and its corresponding text from the training dataset (the Large-scale Artificial Intelligence Open Network (LAION) dataset, available at https://laion.ai/blog/laion-400-open-dataset/).
- Add noise to the original image (the chosen time step can be a random number between 1 and 1,000).
- Train the student model (the adversarial diffusion model) to generate images that can fool a discriminator.
- Further, train the student model in such a way...
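The noising step in the list above (drawing a random time step between 1 and 1,000 and adding noise to the original image) can be illustrated with a minimal sketch. It assumes a standard DDPM-style linear beta schedule and a tiny list of numbers standing in for an image; the function names, the schedule values, and the sample data are illustrative assumptions, not the exact configuration used to train SDXL Turbo.

```python
import math
import random

NUM_STEPS = 1000  # from the text: the time step is drawn between 1 and 1,000


def alpha_bar_schedule(num_steps=NUM_STEPS, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bars = []
    running = 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(pixels, t, alpha_bars, rng):
    """Return the noised values x_t and the noise used, for a 1-indexed time step t."""
    a_bar = alpha_bars[t - 1]
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    # x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise
    noisy = [
        math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * n
        for x, n in zip(pixels, noise)
    ]
    return noisy, noise


rng = random.Random(0)
alpha_bars = alpha_bar_schedule()

# Hypothetical stand-in for a flattened image scaled to [-1, 1].
image = [0.5, -0.25, 1.0, 0.0]

# Pick a random time step, as in the second training step above.
t = rng.randint(1, NUM_STEPS)
noisy_image, noise = add_noise(image, t, alpha_bars, rng)
```

A small time step leaves the image almost intact, while a time step near 1,000 replaces it almost entirely with Gaussian noise, which is why sampling the step at random exposes the student model to every noise level during training.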