Fine-Tuning GPT 3.5 and 4

Introduction

Fine-tuning with OpenAI is a new feature that may become central to adapting AI language models to specific tasks and contexts. It matters because it lets these models take on tasks beyond their initial capabilities, in a different way than is possible with prompt engineering alone. In this article, we will use traditional fine-tuning, which involves training a model on a specialized dataset. The dataset consists of conversations stored in JSON Lines format, where each conversation is a sequence of chat message dictionaries. Each dictionary holds a role (system, user, or assistant) and the corresponding content of the message. The goal is to adapt the model to better understand and generate human-like conversations. Let's start by looking at the alternatives for adapting a Large Language Model to custom tasks.

Fine-tuning versus Prompt Engineering

There are two distinct approaches for adapting a model to work with custom data: prompt engineering and traditional fine-tuning. While both methods aim to customize LLMs for specific tasks, they differ in their approaches and objectives.

Prompt engineering entails crafting precise input prompts to guide the AI's responses effectively. It involves tailoring the prompts to elicit desired outcomes from the AI. This technique requires developers to experiment with different prompts, instructions, and formats to achieve precise control over the model's behavior. By providing explicit instructions within prompts, developers can elicit specific answers for tasks like code generation or translation. Prompt engineering is particularly valuable when clear guidance is essential, but finding the optimal prompts might require iterative testing.
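
As a rough illustration of prompt engineering, the sketch below assembles a chat prompt from an explicit instruction and a couple of worked examples (often called few-shot prompting). The helper name build_few_shot_messages and the translation task are hypothetical and chosen only for illustration; the role/content message layout is the chat format described later in this article.

```python
def build_few_shot_messages(instruction, examples, query):
    """Assemble a chat prompt: one system instruction, worked examples, then the query."""
    messages = [{"role": "system", "content": instruction}]
    for user_text, assistant_text in examples:
        # Each worked example is a user turn followed by the desired assistant reply
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": query})
    return messages


# Hypothetical translation task used only for illustration
messages = build_few_shot_messages(
    instruction="Translate the user's text from English to French. Reply with the translation only.",
    examples=[("Good morning", "Bonjour"), ("Thank you", "Merci")],
    query="See you tomorrow",
)
for m in messages:
    print(f"{m['role']}: {m['content']}")
```

Changing the instruction or swapping the examples is the kind of iterative testing the paragraph above refers to; the model itself stays untouched.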

On the other hand, fine-tuning focuses on adapting a pre-trained LLM to perform better on a particular task or context. This process involves training the model on custom datasets that align with the desired application. Fine-tuning allows LLMs to develop a deeper understanding of context and language nuances, making them more adaptable to diverse prompts and human-like conversations. While it offers less direct control compared to prompt engineering, fine-tuning improves the model's ability to generate coherent responses across a broader range of scenarios.

In essence, prompt engineering emphasizes precise, explicit instruction within prompts, while traditional fine-tuning trains the model itself to comprehend context and generate conversations more effectively. Both serve as techniques to enhance the AI's conversational abilities.

Looking at the Training Data

Before training a model, we need to understand the data format required by the OpenAI fine-tuning endpoints. The format is JSON Lines: each line is a JSON object with a top-level key, "messages", whose value is an array of dictionaries representing chat messages. Together, these dictionaries form one complete conversation.

The expected structure to train an OpenAI model looks like this:

{"messages": [{"role": "system", "content": "..."}, ...]}
{"messages": [{"role": "system", "content": "..."}, ...]}
{"messages": [{"role": "system", "content": "..."}, ...]}
{"messages": [{"role": "system", "content": "..."}, ...]}
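
To make the format concrete, here is a minimal sketch that builds one conversation in Python and serializes it as a single JSON line. The message contents are made up for illustration and are not taken from the dataset used in this article.

```python
import json

# One hypothetical training conversation (contents are illustrative only)
conversation = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is fine-tuning?"},
        {"role": "assistant", "content": "Fine-tuning continues training a model on task-specific examples."},
    ]
}

# JSON Lines: each conversation is serialized onto exactly one line
line = json.dumps(conversation)
print(line)

# Parsing the line again yields the same structure
restored = json.loads(line)
print(restored == conversation)
```

A training file is then simply many such lines, one conversation per line.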

Each chat message dictionary includes two essential components:

The "role" field: This identifies the source of the message, which can be system, user, or assistant.
The "content" field: This contains the actual textual content of the message.
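
Before uploading a file, it can help to check that every line follows this structure. The sketch below is one possible check under the assumptions stated in this article: it only accepts the three roles named above and expects "content" to be a string. The function name validate_line and the two sample lines are hypothetical.

```python
import json

# Roles named in this article; a tuple keeps the membership test safe for any JSON value
ALLOWED_ROLES = ("system", "user", "assistant")


def validate_line(line):
    """Return a list of problems found in one JSON line (empty if it looks valid)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err}"]
    if not isinstance(record, dict):
        return ["record is not a JSON object"]
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["missing or empty 'messages' list"]
    problems = []
    for i, message in enumerate(messages):
        if not isinstance(message, dict):
            problems.append(f"message {i}: not an object")
            continue
        if message.get("role") not in ALLOWED_ROLES:
            problems.append(f"message {i}: unexpected role {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            problems.append(f"message {i}: 'content' is not a string")
    return problems


# Hypothetical sample lines: one well-formed, one with a bad role and bad content
good = '{"messages": [{"role": "user", "content": "Hi"}, {"role": "assistant", "content": "Hello!"}]}'
bad = '{"messages": [{"role": "robot", "content": 42}]}'
print(validate_line(good))
print(validate_line(bad))
```

Running validate_line over every line of a training file and printing only the non-empty results gives a quick report of malformed conversations.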

In this article, we will use an existing training dataset from the Hugging Face datasets repository that already complies with this structure.

Before we get this data, let's first install the datasets package alongside the openai and langchain modules using pip.

!pip install datasets==2.14.4 openai==0.27.9 langchain==0.0.274

Next, we can download the dataset using the datasets library and write it into a JSON file.

from datasets import load_dataset

data = load_dataset(
   "jamescalam/agent-conversations-retrieval-tool",
   split="train"
)
data.to_json("conversations.jsonl")

To verify the structure of the file, we open it and load it into separate conversations.

import json

with open('conversations.jsonl', 'r') as f:
   conversations = f.readlines()

# Assuming each line is a JSON string, you can iterate through the lines and load each JSON string
parsed_conversations = [json.loads(line) for line in conversations]
len(parsed_conversations)

We get 270 conversations, and if we want, we can inspect the first element of the list.

parsed_conversations[0]
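
Beyond looking at a single element, a short summary over all conversations can confirm that the roles and conversation lengths look as expected. The following is a minimal sketch: the summarize helper is hypothetical, and the two-item sample list merely stands in for parsed_conversations so that the snippet runs on its own.

```python
from collections import Counter


def summarize(conversations):
    """Count messages per role and report the shortest and longest conversation."""
    role_counts = Counter()
    lengths = []
    for conversation in conversations:
        messages = conversation["messages"]
        lengths.append(len(messages))
        role_counts.update(message["role"] for message in messages)
    return {
        "conversations": len(conversations),
        "role_counts": dict(role_counts),
        "min_messages": min(lengths) if lengths else 0,
        "max_messages": max(lengths) if lengths else 0,
    }


# Tiny stand-in for parsed_conversations (contents are illustrative only)
sample = [
    {"messages": [
        {"role": "system", "content": "..."},
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hello"},
    ]},
    {"messages": [
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hello"},
    ]},
]
print(summarize(sample))
```

In the notebook, summarize(parsed_conversations) would produce the same kind of report for the full dataset.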

In the following code snippet, the OpenAI Python library is imported and the API key is configured. The script then uploads the contents of the JSON Lines file conversations.jsonl to OpenAI, setting the purpose of the file to 'fine-tune'. The resulting file ID is saved for later use when creating the fine-tuning job.

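import openai
import os

# Set up environment variables for API keys (LangChain reads this later)
os.environ['OPENAI_API_KEY'] = 'your-key'
# The openai library reads the variable at import time, so set the key explicitly
openai.api_key = os.environ['OPENAI_API_KEY']

# Upload the training file; the context manager closes it afterwards
with open("conversations.jsonl", "rb") as f:
    res = openai.File.create(
        file=f,
        purpose='fine-tune'
    )
# We save the file ID for later
file_id = res["id"]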

Now we can start the fine-tuning job.

res = openai.FineTuningJob.create(
   training_file=file_id,
   model="gpt-3.5-turbo"
)
job_id = res["id"]

In this part of the code, the fine-tuning job is initiated by calling the openai.FineTuningJob.create() function. The training file ID obtained earlier is passed as the training_file parameter, and the model to be fine-tuned is specified as "gpt-3.5-turbo". The resulting job ID is saved for monitoring the fine-tuning progress.
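
The fine-tuning job endpoint also accepts an optional validation_file parameter. This article trains on the full dataset, but if you want to hold some conversations out, a split could be sketched as follows. The helper name split_conversations, the 10% hold-out fraction, and the fixed seed are assumptions made for illustration, and the generated lines only stand in for the contents of conversations.jsonl.

```python
import random


def split_conversations(lines, validation_fraction=0.1, seed=0):
    """Shuffle JSONL lines and hold out a fraction of them for validation."""
    shuffled = list(lines)
    # A seeded generator keeps the split reproducible between runs
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


# Stand-in lines; in the article these would be the lines of conversations.jsonl
lines = [
    f'{{"messages": [{{"role": "user", "content": "example {i}"}}]}}'
    for i in range(20)
]
train_lines, validation_lines = split_conversations(lines)
print(len(train_lines), len(validation_lines))
```

Each part would then be written to its own file and uploaded with openai.File.create as shown earlier, with the second file ID passed as validation_file when creating the job.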

Monitoring Fine-Tuning Progress

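from time import sleep

while True:
    print('*' * 50)
    res = openai.FineTuningJob.retrieve(job_id)
    print(res)
    # Stop early if the job ended without producing a model
    if res["status"] in ("failed", "cancelled"):
        raise RuntimeError(f"Fine-tuning job ended with status: {res['status']}")
    if res["finished_at"] is not None:
        ft_model = res["fine_tuned_model"]
        print('Model trained, id:', ft_model)
        break
    else:
        print("Job still not finished, sleeping")
        sleep(60)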

In this section, the code loops to check the status of the fine-tuning job using the openai.FineTuningJob.retrieve() method. If the job has finished, as indicated by the "finished_at" field in the response, the ID of the fine-tuned model is extracted and printed. Otherwise, the script waits for a minute using sleep(60) before checking the job status again.
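
The same polling idea can be wrapped in a reusable helper with an overall timeout, so that the loop cannot run forever. This is only a sketch: wait_for_job and its parameters are hypothetical names, the final statuses it checks ("succeeded", "failed", "cancelled") are the ones reported by the fine-tuning job object, and a fake job source replaces the API so that the snippet runs without network access.

```python
import time


def wait_for_job(fetch_job, poll_seconds=60, timeout_seconds=3600, sleep=time.sleep):
    """Poll fetch_job() until the job reports a final status or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        job = fetch_job()
        if job["status"] in ("succeeded", "failed", "cancelled"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError("fine-tuning job did not finish in time")
        sleep(poll_seconds)


# Fake job source standing in for the API: two "running" responses, then success
responses = iter([
    {"status": "running", "fine_tuned_model": None},
    {"status": "running", "fine_tuned_model": None},
    {"status": "succeeded", "fine_tuned_model": "ft:example-model"},
])
job = wait_for_job(lambda: next(responses), poll_seconds=0)
print(job["status"], job["fine_tuned_model"])
```

With the real API, the call would look like wait_for_job(lambda: openai.FineTuningJob.retrieve(job_id)), after which job["fine_tuned_model"] holds the model ID if the status is "succeeded".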

Using the Fine-Tuned Model for Chat

from langchain.chat_models import ChatOpenAI
# Only the message classes used below are imported
from langchain.schema import HumanMessage, SystemMessage

chat = ChatOpenAI( 
    temperature=0.5, 
    model_name=ft_model 
) 

messages = [ 
    SystemMessage( 
        content="You are a helpful assistant." 
    ), 
    HumanMessage( 
        content="tell me about Large Language Models" 
    ), 
] 
chat(messages)

In this last part of the code, the fine-tuned model is integrated into a chat using the LangChain library. A ChatOpenAI instance is created with a temperature of 0.5 and the name of the fine-tuned model (ft_model). A conversation is then simulated with a list of messages containing a system message and a human message. The chat interaction is executed by calling the chat object with that list.
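
LangChain is not required to query the fine-tuned model. As a sketch of the alternative, the snippet below builds the equivalent request for the openai library directly. The helper name build_chat_request and the model ID "ft:example-model" are placeholders; the actual API call is left as a comment because it needs a configured API key and a real fine-tuned model.

```python
def build_chat_request(model, user_text,
                       system_text="You are a helpful assistant.",
                       temperature=0.5):
    """Build keyword arguments for a chat completion request."""
    return {
        "model": model,
        "temperature": temperature,
        "messages": [
            {"role": "system", "content": system_text},
            {"role": "user", "content": user_text},
        ],
    }


# "ft:example-model" is a placeholder; in the article this would be ft_model
request = build_chat_request("ft:example-model", "tell me about Large Language Models")
print(request["messages"])

# With the openai 0.27.x library installed and an API key configured,
# the request could then be sent like this (not executed here):
# response = openai.ChatCompletion.create(**request)
# print(response["choices"][0]["message"]["content"])
```

The messages use the same role/content layout as the training data, which keeps inference prompts consistent with what the model saw during fine-tuning.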

The provided code is a step-by-step guide to set up, fine-tune, monitor, and utilize a chatbot model using OpenAI's API and the LangChain library. It showcases the process of creating, training, and interacting with a fine-tuned model for chat applications.

Conclusion

In conclusion, fine-tuning OpenAI chat models such as GPT-3.5 Turbo is a significant step in customizing AI language models for diverse applications. Whether you opt for precise prompt engineering or traditional fine-tuning, both approaches offer distinct strategies to enhance conversational abilities. This step-by-step article demonstrated how to prepare data, initiate fine-tuning, monitor progress, and use the fine-tuned model for chat applications.

As AI evolves, fine-tuning empowers language models with specialized capabilities, driving innovation across various fields. Developers can harness these techniques to excel in tasks ranging from customer support to complex problem-solving. With the power of fine-tuning at your disposal, the possibilities for AI-driven solutions are limitless, promising a brighter future for AI technology.

Author Bio

Alan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young, and Globant, and now holds a data engineer position at Ebiquity Media helping the company to create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later on earned a Master's degree from the faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.