
Harnessing ChatGPT and GPT-3

  • 8 min read
  • 16 Oct 2023



This article is an excerpt from the book Natural Language Understanding with Python by Deborah A. Dahl, which combines natural language technology, deep learning, and large language models to create human-like language comprehension in computer systems.

Introduction

In the world of artificial intelligence, ChatGPT stands as a versatile conversational agent, adept at handling generic information interactions. While customization can be a challenge at present, ChatGPT offers a unique avenue for developers and AI enthusiasts alike. Beyond chat-based dialogue, it holds the potential to streamline the often time-consuming process of generating training data for conventional applications. In this article, we delve into the capabilities of ChatGPT and explore the journey of fine-tuning GPT-3 for specific use cases. By the end, you'll be equipped to harness the power of these language models, from data generation to AI customization, in your projects. Let's embark on this exciting AI journey together.

ChatGPT

ChatGPT (https://openai.com/blog/chatgpt/) is a system that can interact with users about generic information in a very capable way. Although at the time of writing it is hard to customize ChatGPT for specific applications, it can be useful for purposes other than customized natural language applications. For example, it can very easily be used to generate training data for a conventional application. If we wanted to develop a banking application using some of the techniques discussed earlier in this book, we would need training data to provide the system with examples of how users might ask the system questions. Typically, this involves a process of collecting actual user input, which could be very time-consuming. ChatGPT could be used to generate training data instead, simply by asking it for examples. For example, given the prompt "give me 10 examples of how someone might ask for their checking balance", ChatGPT responded with the sentences in Figure 11.3:


Figure 11.3 – ChatGPT-generated training data for a banking application

Most of these seem like pretty reasonable queries about a checking account, but some of them don't seem very natural. For that reason, data generated in this way always needs to be reviewed. For example, a developer might decide not to include the second-to-last example in a training set because it sounds stilted, but overall, this technique has the potential to save developers quite a bit of time.
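
Although Figure 11.3 was produced interactively in the ChatGPT web interface, the same idea can be scripted. The following is only a minimal sketch of requesting such examples through the OpenAI API; it assumes the legacy pre-1.0 openai Python package and the gpt-3.5-turbo chat model, neither of which is prescribed by the book:

import openai

openai.api_key = "<your API key>"

# Ask the chat model for candidate training utterances for the banking application
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{
        "role": "user",
        "content": "Give me 10 examples of how someone might ask for their checking balance."
    }]
)
print(response['choices'][0]['message']['content'])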

Applying GPT-3

Another well-known LLM, GPT-3, can also be fine-tuned with application-specific data, which should result in better performance. To do this, you need an OpenAI key because using GPT-3 is a paid service. Both fine-tuning to prepare the model and using the fine-tuned model to process new data at inference time will incur a cost, so it is important to verify that the training process is performing as expected before training with a large dataset and incurring the associated expense.

OpenAI recommends the following steps to fine-tune a GPT-3 model.

1. Sign up for an account at https://openai.com/ and obtain an API key. The API key will be used to track your usage and charge your account accordingly.

2.  Install the OpenAI command-line interface (CLI) with the following command:

pip install --upgrade openai

This command can be used at a terminal prompt on Unix-like systems (some developers have reported problems with Windows or macOS). Alternatively, you can install the OpenAI package from within a Jupyter notebook with the following code:

!pip install --upgrade openai

All of the following examples assume that the code is running in a Jupyter notebook:

3. Set your API key:

import openai

api_key = "<your API key>"
openai.api_key = api_key

4. The next step is to specify the training data that you will use for fine-tuning GPT-3 for your application. This is very similar to the process of training any NLP system; however, GPT-3 has a specific format that must be used for training data. This format uses a syntax called JSONL, where every line is an independent JSON expression. For example, if we want to fine-tune GPT-3 to classify movie reviews, a few data items would look like the following (omitting some of the text for clarity):

{"prompt":"this film is extraordinarily horrendous and i'm not going to waste any more words on it . ","completion":" 
negative"}
{"prompt":"9 : its pathetic attempt at \" improving \" on a shakespeare classic . 8 : its just another piece of teen fluff . 7 : kids in high school are not that witty . … ","completion":" 
negative"}
{"prompt":"claire danes , giovanni ribisi , and omar epps make a likable trio of protagonists , …","completion":" negative"}

Each item consists of a JSON dict with two keys, prompt and completion. prompt is the text to be classified, and completion is the correct classification. All three of these items are negative reviews, so the completions are all marked as negative.
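
If your labeled examples are already in Python, you can write this format out directly with the standard json module. The snippet below is just an illustrative sketch; the reviews list and the movies.jsonl file name are made up for the example:

import json

# Hypothetical labeled examples: (review text, label) pairs
reviews = [
    ("this film is extraordinarily horrendous and i'm not going to waste any more words on it . ", "negative"),
    ("kolya is one of the richest films i've seen in some time . ", "positive"),
]

# Write one JSON object per line (JSONL), as GPT-3 fine-tuning expects.
# The leading space in the completion matches the format shown above.
with open("movies.jsonl", "w") as f:
    for text, label in reviews:
        f.write(json.dumps({"prompt": text, "completion": " " + label}) + "\n")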

It might not always be convenient to get your data into this format if it is already in another format, but OpenAI provides a useful tool for converting other formats into JSONL. It accepts a wide range of input formats, such as CSV, TSV, XLSX, and JSON, with the only requirement for the input being that it contains two columns with prompt and completion headers. Table 11.2 shows a few cells from an Excel spreadsheet with some movie reviews as an example:

prompt | completion
kolya is one of the richest films i’ve seen in some time . zdenek sverak plays a confirmed old bachelor ( who’s likely to remain so ) , who finds his life as a czech cellist increasingly impacted by the five-year old boy that he’s taking care of … | positive
this three hour movie opens up with a view of singer/guitar player/musician/composer frank zappa rehearsing with his fellow band members . all the rest displays a compilation of footage , mostly from the concert at the palladium in new york city , halloween 1979 … | positive
`strange days’ chronicles the last two days of 1999 in los angeles . as the locals gear up for the new millenium , lenny nero ( ralph fiennes ) goes about his business … | positive

Table 11.2 – Movie review data for fine-tuning GPT-3

To convert one of these alternative formats into JSONL, you can use the fine_tunes.prepare_data tool, as shown here, assuming that your data is contained in the movies.csv file:

!openai tools fine_tunes.prepare_data -f ./movies.csv -q

The fine_tunes.prepare_data utility will create a JSONL file of the data and will also provide some diagnostic information that can help improve the data. The most important diagnostic it provides is whether or not the amount of data is sufficient: OpenAI recommends several hundred examples for good performance. Other diagnostics include various types of formatting information, such as the separators between the prompts and the completions.

After the data is correctly formatted, you can upload it to your OpenAI account and save the filename:

file_name = "./movies_prepared.jsonl"
upload_response = openai.File.create(
 file=open(file_name, "rb"),
 purpose='fine-tune'
)
file_id = upload_response.id

The next step is to create and save a fine-tuned model. There are several different OpenAI models that can be used. The one we’re using here, ada, is the fastest and least expensive, and does a good job on many classification tasks:

fine_tune_response = openai.FineTune.create(training_file=file_id, model="ada")
fine_tuned_model = fine_tune_response.fine_tuned_model
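
Note that fine-tuning runs as an asynchronous job on OpenAI's servers, so fine_tuned_model may still be None immediately after the create call. One way to wait for the job to finish is sketched below; the 30-second polling interval is arbitrary, and this again assumes the legacy pre-1.0 openai package:

import time

# Poll the fine-tuning job until it finishes; the fine-tuned model name is
# only populated once the job status reaches "succeeded".
while fine_tuned_model is None:
    time.sleep(30)
    job = openai.FineTune.retrieve(id=fine_tune_response.id)
    if job.status == "failed":
        raise RuntimeError("Fine-tuning job failed")
    fine_tuned_model = job.fine_tuned_model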

Finally, we can test the model with a new prompt:

answer = openai.Completion.create(
    model=fine_tuned_model,
    prompt=" I don't like this movie ",
    max_tokens=10,  # Change the number of tokens for a longer completion
    temperature=0
)
answer['choices'][0]['text']
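
If you want to classify more than one review, it can be convenient to wrap this call in a small helper. The function below is purely illustrative (it is not part of the book's code) and uses the same legacy SDK calls:

def classify_review(text):
    # Send a single review to the fine-tuned model and return the predicted label
    response = openai.Completion.create(
        model=fine_tuned_model,
        prompt=text,
        max_tokens=10,
        temperature=0
    )
    return response['choices'][0]['text'].strip()

print(classify_review(" I don't like this movie "))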

In this example, since we are only using a few fine-tuning utterances, the results will not be very good. You are encouraged to experiment with larger amounts of training data.

Conclusion

In conclusion, ChatGPT and GPT-3 offer invaluable tools for AI enthusiasts and developers alike. From data generation to fine-tuning for specific applications, these models present a world of possibilities. As we've seen, ChatGPT can expedite the process of creating training data, while GPT-3's customization can elevate the performance of your AI applications. As the field of artificial intelligence continues to evolve, these models hold immense promise. So, whether you're looking to streamline your development process or take your AI solutions to the next level, the journey with ChatGPT and GPT-3 is an exciting one filled with untapped potential. Embrace the future of AI with confidence and innovation.

Author Bio

Deborah A. Dahl is the principal at Conversational Technologies, with over 30 years of experience in natural language understanding technology. She has developed numerous natural language processing systems for research, commercial, and government applications, including a system for NASA, and speech and natural language components on Android. She has taught over 20 workshops on natural language processing, consulted on many natural language processing applications for her customers, and written over 75 technical papers. This is Deborah’s fourth book on natural language understanding topics. Deborah has a PhD in linguistics from the University of Minnesota and completed postdoctoral studies in cognitive science at the University of Pennsylvania.