Getting started with the DALL-E 2 API
DALL-E’s text-to-image functionality has improved dramatically in a short time. Transformers are task-agnostic, can perform a wide variety of tasks, and are now multimodal, processing audio, images, and other signals.
We previously went through DALL-E’s architecture in the DALL-E section of Chapter 15, From NLP to Task-Agnostic Transformer Models.
You can try DALL-E 2 online: https://openai.com/product/dall-e-2.
However, this section goes further by using the second-generation DALL-E 2 API so that we can write a program of our own. The API allows us to create, modify, and generate variations of an image.
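The following is a minimal sketch, not the notebook’s exact code, of how these three operations can be called with the OpenAI Python library. The function names follow the pre-1.0 `openai` package and may differ in later library versions; the prompt, file names, and sizes are illustrative placeholders:

```python
import openai

openai.api_key = "YOUR_API_KEY"  # replace with your own OpenAI API key

# 1. Create an image from a text prompt
creation = openai.Image.create(
    prompt="a watercolor painting of a robot reading a book",
    n=1,
    size="1024x1024"
)
print(creation["data"][0]["url"])

# 2. Edit an existing image: the transparent area of the mask tells the model
#    where to apply the prompt (both files are hypothetical placeholders)
edit = openai.Image.create_edit(
    image=open("original.png", "rb"),
    mask=open("mask.png", "rb"),
    prompt="add a small red balloon in the sky",
    n=1,
    size="1024x1024"
)
print(edit["data"][0]["url"])

# 3. Generate a variation of an existing image
variation = openai.Image.create_variation(
    image=open("original.png", "rb"),
    n=1,
    size="1024x1024"
)
print(variation["data"][0]["url"])
```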
Open Getting_Started_with_the_DALL_E_2_API.ipynb.
You can implement the DALL-E 2 API as you see fit for your project. The notebook presents one way of running it: you can run it cell by cell to grasp the functionality provided, or run the whole notebook for a single scenario.
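Before running the cells, the notebook environment needs the OpenAI library installed and an API key set. A minimal setup sketch, assuming a Colab-style notebook and a key stored in a local text file (both assumptions, not necessarily how the notebook itself proceeds), could look like this:

```python
# Install the OpenAI client in the notebook runtime
!pip install openai

import openai

# Read the API key from a file uploaded to the runtime
# (pasting the key directly into the cell also works, but is less safe to share)
with open("api_key.txt") as f:
    openai.api_key = f.read().strip()
```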
This section is divided...