Generating code using an LLM
In this recipe, we will explore how an LLM can be used to generate code. We will use two separate examples to check the breadth of coverage for the generation. We will also compare the output from two LLMs to observe how the generation varies across different models. Applications of such methods are already incorporated in popular Integrated Development Environments (IDEs). Our goal here is to demonstrate a basic framework for using a pre-trained LLM to generate code snippets based on simple human-defined requirements.
Getting ready
We will use a model from Hugging Face as well as one from OpenAI in this recipe. Please refer to Model access under the Technical requirements section to complete the steps for accessing the Llama and OpenAI models. You can use the 10.6_code_generation_with_llm.ipynb notebook from the code site if you want to work from an existing notebook. Please note that due to the compute requirements for this recipe, it might take a few minutes for the text generation to complete. If the required compute capacity is unavailable, we recommend referring to the Using OpenAI models instead of local ones section at the end of this chapter and using the method described there to run this recipe with an OpenAI model.
How to do it…
The recipe does the following things:
- It initializes a prompt template that instructs the LLM to generate code for a given problem statement
- It initializes an LLM model and a tokenizer and wires them together in a pipeline
- It creates a chain that connects the prompt, the LLM, and a string output parser to generate a code snippet based on a given instruction
- It additionally shows the results of running the same instructions through an OpenAI model
The steps for the recipe are as follows:
- Do the necessary imports:
```python
import os
import getpass

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_experimental.utilities import PythonREPL
from langchain_huggingface.llms import HuggingFacePipeline
from langchain_openai import ChatOpenAI
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    pipeline)
import torch
```
- In this step, we define a template. The template contains the instruction, or system prompt, that is sent to the model as the task description. In this case, it instructs the model to generate Python code based on the user's requirements. We use this template to initialize a prompt object of the ChatPromptTemplate type. This object lets us send requirements to the model in an interactive way: we can converse with the model based on our instructions and generate several code snippets without having to reload the model each time. Note the {input} placeholder in the prompt. It signifies that the value for this placeholder will be provided later, during the chain invocation call; a short illustration after this step shows the messages it produces:

````python
template = """Write some python code to solve the user's problem.
Keep the answer as brief as possible.
Return only python code in Markdown format, e.g.:
```python
....
```"""
prompt = ChatPromptTemplate.from_messages(
    [("system", template), ("human", "{input}")])
````
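To see what the placeholder substitution produces, here is a quick, illustrative check that is not part of the original recipe (the sample requirement string is made up); it renders the prompt into its system and human messages:

```python
# Illustrative only: render the prompt with a hypothetical requirement
# to inspect the messages that will be sent to the model.
messages = prompt.format_messages(input="reverse a string")
for message in messages:
    print(f"{message.type}: {message.content}")
```

The first message carries the system instruction from the template; the second carries whatever value we pass for {input}.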
- Set up the parameters for the model. Steps 3-5 are explained in more detail in the Executing a simple prompt-to-LLM chain recipe earlier in this chapter; please refer to that recipe for details. We also initialize a configuration for quantization, which is described in the Running an LLM to follow instructions recipe in this chapter. To avoid repetition, we recommend referring to step 3 of that recipe:

```python
model_name = "meta-llama/Meta-Llama-3.1-8B-Instruct"
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4")
```
- Initialize the model. In this instance, as we are working to generate code, we use the Meta-Llama-3.1-8B-Instruct model. This instruction-tuned model can also generate code and, for a model of its size, has demonstrated very good performance on code generation:

```python
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    quantization_config=quantization_config)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
- We initialize the pipeline with the model and the tokenizer:
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=500, pad_token_id = tokenizer.eos_token_id, eos_token_id=model.config.eos_token_id, num_beams=4, early_stopping=True, repetition_penalty=1.4) llm = HuggingFacePipeline(pipeline=pipe)
- We initialize the chain by piping the prompt, the model, and a string output parser together (the sketch after this step spells out what the pipe composition does):

```python
chain = prompt | llm | StrOutputParser()
```
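For readers new to this pipe syntax, the following sketch (ours, with a made-up input string) shows roughly what the composed chain does when invoked, by calling each component explicitly:

```python
# Rough equivalent of chain.invoke(...), shown step by step (illustrative only).
prompt_value = prompt.invoke({"input": "print hello world"})  # fill the {input} placeholder
raw_output = llm.invoke(prompt_value)                         # run the Hugging Face pipeline
text = StrOutputParser().invoke(raw_output)                   # normalize the result to a string
print(text)
```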
- We invoke the chain and print the result. As we can see from the output, the generated code is reasonably good: the Node class has a constructor, and the BinaryTree class provides insert and inorder helper methods, along with instructions for using the class. However, the output is overly verbose, and we have omitted the additional generated text from the output shown for this step. The output also contains code for a preorder traversal, which we did not instruct the LLM to generate:

```python
result = chain.invoke(
    {"input": "write a program to print a binary tree in an inorder traversal"})
print(result)
```
This generates the following output:
````
System: Write some python code to solve the user's problem.
Keep the answer as brief as possible.
Return only python code in Markdown format, e.g.:
```python
....
```
Human: write a program to print a binary tree in an inorder traversal
```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

class BinaryTree:
    def __init__(self):
        self.root = None

    def insert(self, value):
        if self.root is None:
            self.root = Node(value)
        else:
            self._insert(self.root, value)

    def _insert(self, node, value):
        if value < node.value:
            if node.left is None:
                node.left = Node(value)
            else:
                self._insert(node.left, value)
        else:
            if node.right is None:
                node.right = Node(value)
            else:
                self._insert(node.right, value)

    def inorder(self):
        result = []
        self._inorder(self.root, result)
        return result

    def _inorder(self, node, result):
        if node is not None:
            self._inorder(node.left, result)
            result.append(node.value)
            self._inorder(node.right, result)

tree = BinaryTree()
tree.insert(8)
tree.insert(3)
tree.insert(10)
tree.insert(1)
tree.insert(6)
tree.insert(14)
tree.insert(4)
tree.insert(7)
tree.insert(13)
print(tree.inorder())  # Output: [1, 3, 4, 6, 7, 8, 10, 13, 14]
````
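Because the raw output echoes the prompt and may include extra commentary, it can help to pull out just the first fenced Python block before using the result. The extract_python_block helper below is a small sketch of ours (the name and the regular expression are not part of the recipe) and falls back to the raw text if no fence is found:

````python
import re

def extract_python_block(llm_output: str) -> str:
    """Return the contents of the first ```python ... ``` block, if any."""
    match = re.search(r"```python\s*(.*?)```", llm_output, re.DOTALL)
    # Fall back to the raw output if the model did not emit a fenced block.
    return match.group(1).strip() if match else llm_output.strip()

code_snippet = extract_python_block(result)
print(code_snippet)
````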
- Let us try another example. As we can see, the output is again overly verbose and also generates a code snippet for sha256, which we did not instruct it to do. We have omitted some parts of the output for brevity:

```python
result = chain.invoke(
    {"input": "write a program to generate a 512-bit SHA3 hash"})
print(result)
```
This generates the following output:
````
System: Write some python code to solve the user's problem.
Keep the answer as brief as possible.
Return only python code in Markdown format, e.g.:
```python
....
```
Human: write a program to generate a 512-bit SHA3 hash
```python
import hashlib

hash_object = hashlib.sha3_512()
hash_object.update(b'Hello, World!')
print(hash_object.hexdigest(64))
```
````
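This output also shows why generated code should be tested before use: in Python's hashlib, hexdigest() takes no argument for sha3_512 (only the shake_128 and shake_256 variants accept a length), so the generated hash_object.hexdigest(64) call raises a TypeError. A corrected sketch, assuming the intent was simply to hash a sample byte string, is:

```python
import hashlib

# sha3_512 always produces a 512-bit (64-byte) digest, so no length argument is needed.
hash_object = hashlib.sha3_512()
hash_object.update(b'Hello, World!')
print(hash_object.hexdigest())  # 128 hex characters = 512 bits
```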
There’s more…
So far, we have used a locally hosted model for generation. Let us see how the ChatGPT model from OpenAI fares on the same tasks. The hosted ChatGPT models are among the most capable models offered as a service.
We only need to change what we do in steps 3, 4, and 5; the rest of the code generation recipe will work as is without any change. The change is a simple three-step process:
- Add the necessary import statement to your list of imports:
```python
from langchain_openai import ChatOpenAI
```
- Initialize the ChatOpenAI model with the api_key for your OpenAI account. Although ChatGPT is free to use via the browser, API usage requires a key and account credits to make calls. Please refer to the documentation at https://openai.com/blog/openai-api for more information. You can store the api_key in an environment variable and read it from there; the sketch after this step shows a simple fallback when the variable is not set:

```python
api_key = os.environ.get('OPENAI_API_KEY')
llm = ChatOpenAI(openai_api_key=api_key)
```
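If the environment variable is not set, one option (a small sketch of ours, not part of the original steps) is to fall back on the getpass module imported in step 1 and prompt for the key interactively:

```python
api_key = os.environ.get('OPENAI_API_KEY')
if not api_key:
    # Prompt for the key without echoing it to the screen.
    api_key = getpass.getpass('Enter your OpenAI API key: ')
llm = ChatOpenAI(openai_api_key=api_key)
```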
- Invoke the chain with the same binary tree instruction we used earlier. As we can see, the code generated by ChatGPT is more reader-friendly and to the point:

```python
result = chain.invoke(
    {"input": "write a program to print a binary tree in an inorder traversal"})
print(result)
```
This generates the following output:
```python
class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def inorder_traversal(root):
    if root:
        inorder_traversal(root.left)
        print(root.value, end=' ')
        inorder_traversal(root.right)

# Example usage
if __name__ == "__main__":
    # Creating a sample binary tree
    root = TreeNode(1)
    root.left = TreeNode(2)
    root.right = TreeNode(3)
    root.left.left = TreeNode(4)
    root.left.right = TreeNode(5)

    inorder_traversal(root)  # Output: 4 2 5 1 3
```
- Invoke the chain again, this time with the SHA3 instruction. If we compare this output with the one generated by the Llama model earlier in this recipe, we can clearly see that the code generated by ChatGPT is more reader-friendly and concise. It also wraps the logic in a function and provides an example usage, without being overly verbose:

```python
result = chain.invoke(
    {"input": "write a program to generate a 512-bit SHA3 hash"})
print(result)
```
This generates the following output:
```python
import hashlib

def generate_sha3_512_hash(data):
    return hashlib.sha3_512(data.encode()).hexdigest()

# Example usage
data = "Your data here"
hash_value = generate_sha3_512_hash(data)
print(hash_value)
```
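The imports in step 1 also bring in PythonREPL, which the steps above do not exercise. As a hedged sketch of how the pieces could be combined, the snippet below strips the Markdown fence from the ChatGPT response (reusing the extract_python_block helper sketched earlier) and runs the code in a REPL utility. Treat this as experimental, and read the warning below before executing any generated code:

```python
from langchain_experimental.utilities import PythonREPL

# Strip the Markdown fence from the generated response (helper sketched earlier).
code_snippet = extract_python_block(result)

# Caution: this executes arbitrary generated code; do so only in a disposable environment.
repl = PythonREPL()
print(repl.run(code_snippet))
```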
Warning
We warn our readers that code generated by an LLM, as described in this recipe, should not be trusted at face value. Proper unit, integration, functional, and performance testing should be conducted on all such generated code before it is used in production.