Generating transcripts using the Audio endpoint
In this recipe, we will learn how to use the OpenAI API’s Audio endpoint, which converts audio into text. This enables developers to build voice applications, such as voice agents and conversational speech bots.
Getting ready
This recipe also uses Postman, but the usual set of Headers must be modified so that the HTTP client sends form data instead of JSON. Form data is a way to encode and send data as key-value pairs in an HTTP request, rather than as a JSON-formatted string, and it is often used for uploading files. In addition, we need a sample audio file that we can use as an example to convert speech to text.
After opening a new request in Postman, navigate to the Headers menu and delete the Content-Type application/json entry. With that entry removed, Postman sets the Content-Type of the request automatically, based on what is passed in the request Body.
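To make the difference from JSON concrete, here is a minimal Python sketch of what an HTTP client does when the Body is form data: it encodes each key-value pair as a separate part and derives the Content-Type header, including a boundary string, from the body itself. The helper name encode_multipart, the field names file and model, the value whisper-1, and the file name sample.mp3 are assumptions chosen for illustration and do not come from this recipe. The audio bytes are a placeholder, and nothing is sent over the network.

```python
import uuid


def encode_multipart(fields, files):
    """Encode text fields and files as a multipart/form-data body.

    fields: dict mapping field name -> string value
    files:  dict mapping field name -> (filename, raw bytes, MIME type)
    Returns a (content_type_header, body_bytes) tuple.
    """
    # A random boundary separates the parts; it must not occur in the data.
    boundary = uuid.uuid4().hex
    lines = []

    # Plain key-value pairs: one part per field.
    for name, value in fields.items():
        lines.append(f"--{boundary}".encode())
        lines.append(f'Content-Disposition: form-data; name="{name}"'.encode())
        lines.append(b"")
        lines.append(value.encode("utf-8"))

    # File parts also carry a filename and their own Content-Type.
    for name, (filename, data, mime_type) in files.items():
        lines.append(f"--{boundary}".encode())
        lines.append(
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"'.encode()
        )
        lines.append(f"Content-Type: {mime_type}".encode())
        lines.append(b"")
        lines.append(data)

    # The closing boundary ends with two extra dashes.
    lines.append(f"--{boundary}--".encode())
    lines.append(b"")
    body = b"\r\n".join(lines)

    # The header is derived from the body, which is why a hard-coded
    # application/json entry has to be removed first.
    return f"multipart/form-data; boundary={boundary}", body


# Assumed field names and values; placeholder bytes stand in for real audio.
content_type, body = encode_multipart(
    fields={"model": "whisper-1"},
    files={"file": ("sample.mp3", b"fake-audio-bytes", "audio/mpeg")},
)
print(content_type)
```

The printed header changes on every run because the boundary is random. This is why the value is left for the client to generate rather than typed in by hand.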
Next, we need an audio file. Any short...