Multimodal design patterns
With multimodal design patterns, we integrate different modalities, such as text, images, and audio. Now that multimodal models are available, the ability to generate, manipulate, and understand images from text or other input modalities has become increasingly important in a wide range of applications, from creative design to scientific visualization.
Many patterns can be built with multimodal models. In this section, we cover some of the most common ones.
Text-to-image
With the text-to-image pattern, you provide text as a prompt to the model. The model then generates an image based on that prompt, as shown in Figure 9.3.
Figure 9.3 – A text-to-image pattern
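To make the flow in the figure concrete, the following is a minimal sketch of the text-to-image pattern in Python. It does not call any real model or service: `ImageRequest`, `TextToImageModel`, `PlaceholderModel`, and `text_to_image` are hypothetical names introduced only for illustration, and the placeholder returns a solid-color PPM image derived from the prompt so that the example runs offline. With an actual model, the `generate` method would be replaced by a call to that model's own API.

```python
import hashlib
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class ImageRequest:
    """Hypothetical request object: the text prompt plus the output size."""
    prompt: str
    width: int = 512
    height: int = 512


class TextToImageModel(Protocol):
    """Anything that turns an ImageRequest into encoded image bytes."""

    def generate(self, request: ImageRequest) -> bytes: ...


class PlaceholderModel:
    """Stand-in for a real model (an assumption for this sketch).

    It returns a binary PPM image filled with one color derived from
    the prompt, which is enough to exercise the prompt-in, image-out flow.
    """

    def generate(self, request: ImageRequest) -> bytes:
        digest = hashlib.sha256(request.prompt.encode("utf-8")).digest()
        pixel = digest[:3]  # one RGB triple, deterministic per prompt
        header = f"P6\n{request.width} {request.height}\n255\n".encode("ascii")
        return header + pixel * (request.width * request.height)


def text_to_image(
    model: TextToImageModel, prompt: str, width: int = 512, height: int = 512
) -> bytes:
    """Send a text prompt to the model and return the generated image bytes."""
    cleaned = prompt.strip()
    if not cleaned:
        raise ValueError("prompt must not be empty")
    return model.generate(ImageRequest(cleaned, width, height))
```

Calling `text_to_image(PlaceholderModel(), "a red bicycle on a beach")` returns bytes that can be written to a `.ppm` file. Keeping the model behind a small interface is one possible design choice: the calling code stays the same when the placeholder is swapped for a real image generation model.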
Parameters
At the core of image generation models is a set of customizable inference parameters and controls that let users steer the model toward the desired image. Let us look at these parameters:
- Negative...