Interactive Email Phishing Training with ChatGPT

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

This article is an excerpt from the book, ChatGPT for Cybersecurity Cookbook, by Clint Bodungen. Master ChatGPT and the OpenAI API, and harness the power of cutting-edge generative AI and large language models to revolutionize the way you perform penetration testing, threat detection, and risk assessment.

Introduction

With the rise of cyber threats, organizations of all sizes are increasingly aware of the importance of training their staff on email phishing, a common and potentially dangerous tactic employed by cybercriminals. In this recipe, we'll be using ChatGPT to create a tool for interactive email phishing training.

This recipe guides you through the process of crafting a specialized prompt to turn ChatGPT into a simulation tool for phishing attack awareness. With this approach, you can use ChatGPT to train users to identify potential phishing emails, thereby increasing their awareness and helping to protect your organization from potential security threats.

What makes this truly powerful is its interactive nature. ChatGPT will present the user with a series of email scenarios. The user will then decide whether the email is a phishing attempt or a legitimate email, and can even ask for more details such as the URL to a link in the email or header information, for example. ChatGPT will provide feedback, ensuring a continuous, engaging, and efficient learning experience.

Additionally, we will also cover how to use Python in conjunction with these prompts to create exportable email simulation scenarios. This feature can be beneficial in situations where you might want to use the generated scenarios outside of ChatGPT, such as in a live course or in a Learning Management System (LMS).

Getting ready

Before diving into this recipe, ensure you have your OpenAI account set up and your API key on hand. If not, you should refer back to Chapter 1 for the necessary setup details. You will also need Python version 3.10.x or later.

Additionally, confirm you have the following Python libraries installed:

1. openai: This library enables you to interact with the OpenAI API. Install it using the command pip install openai.

2. os: This is a built-in Python library, which allows you to interact with the operating system, especially for accessing environment variables.

3. tqdm: This library is utilized for showing progress bars during the policy generation process. Install it with pip install tqdm.

How to do it…

In this section, we will walk you through the process of creating an interactive email phishing training simulation using ChatGPT. The instructions are step-by-step, starting from logging into your OpenAI account and ending with generating phishing training simulations.

1. Access the ChatGPT interface. Log into your OpenAI account and go to the ChatGPT interface at https://chat.openai.com.

2.Initialize the simulation by entering the specialized prompt. The following prompt is carefully designed to instruct ChatGPT to act as a phishing training simulator. Enter the prompt into the text box and press Enter.

"You are a cybersecurity professional and expert in adversarial social engineering tactics, techniques, and procedures, with 25 years of experience. Create an interactive email phishing training simulation (for employees). Provide no other response other than to ask the question, "Is the following email real or a phishing attempt? (You may ask clarification questions such as URL information, header information, etc.)" followed by simulated email, using markdown language formatting. The email you present can represent a legitimate email or a phishing attempt, which can use one or more various techniques. Provide no further generation or response until I answer the question. If I answer correctly, just respond with "Correct" and a short description to further explain the answer, and then restart the process from the beginning. If I answer incorrectly, respond with "Incorrect", then the correct answer, then a short description to further explain the answer. Then repeat the process from the beginning.

Present me with only 3 simulations in total throughout the process and remember my answer to them all. At least one of the simulations should simulate a real email. After the last question has been answered, and after your response, end the assessment and give me my total score, the areas I did well in and where I need to improve."

Tip
Be sure to change the number of simulations ChatGPT provides, to suit your needs.

Now, ChatGPT will generate interactive email phishing scenarios based on your instructions. Respond to each scenario as if you were the employee undergoing the training. After the third scenario and your final response, ChatGPT will calculate and provide your total score, areas of strength, and areas for improvement.

How it works…

At the heart of this recipe lies the specialized prompt. This prompt is constructed to instruct ChatGPT to act as an interactive phishing training tool, delivering a series of email phishing scenarios. The prompt follows certain design principles which are essential to its effectiveness and interaction with the OpenAI models. Here, we'll dissect those principles:

1. Defining the role: The prompt starts by setting up the role of the AI model – a cybersecurity professional and expert in adversarial social engineering tactics, techniques, and procedures, with 25 years of experience. By defining the AI's persona, we direct the model to generate responses using the knowledge and expertise expected from such a role.

2. Detailed instructions and simulation: The instructions given in the prompt are meticulously detailed, and it is this precision that enables ChatGPT to create effective and realistic phishing simulations. The prompt asks the AI model to generate a phishing email scenario, followed by the question, "Is the following email real or a phishing attempt?". Notably, the AI model is given the liberty to provide additional clarifying questions, such as asking about URL information, header information, etc., giving it the freedom to generate more nuanced and complex scenarios.

By asking the model to generate these emails using markdown language formatting, we ensure that the simulated emails have the structure and appearance of genuine emails, enhancing the realism of the simulation. The AI is also instructed to present emails that can represent either legitimate correspondence or a phishing attempt, ensuring a diverse range of scenarios for the user to evaluate.

How can ChatGPT convincingly simulate phishing emails? Well, ChatGPT's strength comes from the wide variety of text it has been trained on, including (but not limited to) countless examples of email correspondences and probably some instances of phishing attempts or discussions around them. From this extensive training, the model has developed a robust understanding of the format, tone, and common phrases used in both legitimate and phishing emails. So, when prompted to simulate a phishing email, it can draw on this knowledge to generate a believable email that mirrors the features of a real-world phishing attempt.

As the model doesn't generate responses until it receives an answer to its question, it guarantees an interactive user experience. Based on the user's response, the model provides relevant feedback ("Correct" or "Incorrect"), the correct answer if the user was wrong, and a brief explanation. This detailed, immediate feedback aids the learning process and helps to embed the knowledge gained from each simulated scenario.

It's worth noting that, although the model has been trained to generate human-like text, it doesn't understand the content in the same way humans do. It doesn't have beliefs, opinions, or access to real-time, world-specific information or personal data unless explicitly provided in the conversation. Its responses are merely predictions based on its training data. The carefully designed prompt and structure are what guide the model to generate useful, contextually appropriate content for this particular task.

3. Feedback mechanism: The prompt instructs the AI to provide feedback based on the user's answer, further explaining the answer. This creates an iterative feedback loop that enhances the learning experience.

4. Keeping track of progress: The prompt instructs the AI to present three simulations in total and to remember the user's answer to all of them. This ensures continuity in the training and enables tracking of the user's progress.

5. Scoring and areas of improvement: After the final simulation and response, the prompt instructs the AI to end the assessment and provide a total score along with areas of strength and areas for improvement. This helps the user understand their proficiency and the areas they need to focus on for improvement.

ChatGPT’s models are trained on a broad range of internet text. However, it's essential to note that it does not know specifics about which documents were part of its training set or have access to any private, confidential, or proprietary information. It generates responses to prompts by recognizing patterns and producing text that statistically aligns with the patterns it observed in its training data.

By structuring our prompt in a way that clearly defines the interactive assessment context and expected behavior, we're able to leverage this pattern recognition to create a highly specialized interactive tool. The ability of the OpenAI models to handle such a complex and interactive use case demonstrates their powerful capability and flexibility.

There’s more…

If you're using a Learning Management System (LMS) or conducting a live class, you might prefer to have a list of scenarios and details rather than an interactive method like ChatGPT. In these settings, it's often more practical to provide learners with specific scenarios to ponder and discuss in a group setting. The list can also be used for assessments or training materials, offering a static reference point that learners can revisit as needed, or as content for a phishing simulation system.

By modifying the script from the previous recipe, you can instruct the ChatGPT model to produce a set of phishing email simulations complete with all necessary details. The resulting text can be saved into a file for easy distribution and usage in your training environment.

Since this script is so similar to the one from the previous recipe, we’ll just cover the modifications instead of steppping through the entire script again.

Let's walk through the necessary modifications:

1. Rename and modify the function: The function generate_question is renamed to generate_email_simulations, and its argument list and body are updated to reflect its new purpose. It will now generate the phishing email simulations instead of cybersecurity awareness questions. This is done by updating the messages that are passed to the OpenAI API within this function.

def generate_email_simulations() -> str: 
    # Define the conversation messages 
    messages = [ 
        {"role": "system", "content": 'You are a cybersecurity professional and expert in adversarial social engineering tactics, techniques, and procedures, with 25 years of experience.'}, 
        {"role": "user", "content": 'Create a list of fictitious emails for an interactive email phishing training. The emails can represent a legitimate email or a phishing attempt, using one or more various techniques. After each email, provide the answer, contextual descriptions, and details for any other relevant information such as the URL for any links in the email, header information. Generate all necessary information in the email and supporting details. Present 3 simulations in total. At least one of the simulations should simulate a real email.'}, 
    ] 
    ...

Note

You can adjust the number of scenarios here to suit your needs. In this example, we've requested 3 scenarios.

2. Remove unnecessary code: The script no longer reads content categories from an input file, as it's not required in your use case.

3. Update variable and function names: All variable and function names referring to "questions" or "assessment" have been renamed to refer to "email simulations" instead, to make the script more understandable in the context of its new purpose.

4. Call the appropriate function: The generate_email_simulations function is called instead of the generate_question function. This function initiates the process of generating the email simulations.

# Generate the email simulations 
email_simulations = generate_email_simulations()

Tip
Like the previous method, more scenarios will require a model that supports a larger context window. However, the gpt-4 model seems to provide better results in terms of accuracy, depth, and consistency with its generations for this recipe.

Here’s how the complete script should look:

import openai 
import os 
import threading 
import time 
from datetime import datetime 
 
# Set up the OpenAI API 
openai.api_key = os.getenv("OPENAI_API_KEY") 
 
current_datetime = datetime.now().strftime('%Y-%m-%d_%H-%M-%S') 
assessment_name = f"Email_Simulations_{current_datetime}.txt" 
 
def generate_email_simulations() -> str: 
    # Define the conversation messages 
    messages = [ 
        {"role": "system", "content": 'You are a cybersecurity professional and expert in adversarial social engineering tactics, techniques, and procedures, with 25 years of experience.'}, 
        {"role": "user", "content": 'Create a list of fictitious emails for an interactive email phishing training. The emails can represent a legitimate email or a phishing attempt, using one or more various techniques. After each email, provide the answer, contextual descriptions, and details for any other relevant information such as the URL for any links in the email, header information. Generate all necessary information in the email and supporting details. Present 3 simulations in total. At least one of the simulations should simulate a real email.'}, 
    ] 
 
    # Call the OpenAI API 
    response = openai.ChatCompletion.create( 
        model="gpt-3.5-turbo", 
        messages=messages, 
        max_tokens=2048, 
        n=1, 
        stop=None, 
        temperature=0.7, 
    ) 
 
    # Return the generated text 
    return response['choices'][0]['message']['content'].strip() 
 
# Function to display elapsed time while waiting for the API call 
def display_elapsed_time(): 
    start_time = time.time() 
    while not api_call_completed: 
        elapsed_time = time.time() - start_time 
        print(f"\rElapsed time: {elapsed_time:.2f} seconds", end="") 
        time.sleep(1) 
 
api_call_completed = False 
elapsed_time_thread = threading.Thread(target=display_elapsed_time) 
elapsed_time_thread.start() 
 
# Generate the report using the OpenAI API 
try: 
    # Generate the email simulations 
    email_simulations = generate_email_simulations() 
except Exception as e: 
    print(f"\nAn error occurred during the API call: {e}") 
    exit() 
 
api_call_completed = True 
elapsed_time_thread.join() 
 
# Save the email simulations into a text file 
try: 
    with open(assessment_name, 'w') as file: 
        file.write(email_simulations) 
    print("\nEmail simulations generated successfully!") 
except Exception as e: 
    print(f"\nAn error occurred during the email simulations generation: {e}")

By running this modified script, the ChatGPT model is directed to generate a series of interactive email phishing training scenarios. The script then collects the generated scenarios, checks them for errors, and writes them to a text file. This gives you a ready-made training resource that you can distribute to your learners or incorporate into your LMS or live training sessions.

Conclusion

In conclusion, leveraging ChatGPT for interactive email phishing training empowers users with immersive, realistic simulations, bolstering cybersecurity awareness and defense. This innovative approach fosters a proactive stance against threats, ensuring organizations stay ahead in the ever-evolving landscape of cyber risks. With adaptable learning and dynamic feedback, ChatGPT transforms education, creating a robust line of defense against potential security breaches.

Author Bio

Clint Bodungen is a cybersecurity professional with 25+ years of experience and the author of Hacking Exposed: Industrial Control Systems. He began his career in the United States Air Force and has since many of the world's largest energy companies and organizations, working for notable cybersecurity companies such as Symantec, Kaspersky Lab, and Booz Allen Hamilton. He has published multiple articles, technical papers, and training courses on cybersecurity and aims to revolutionize cybersecurity education using computer gaming (“gamification”) and AI technology. His flagship product, ThreatGEN® Red vs. Blue, is the world’s first online multiplayer cybersecurity simulation game, designed to teach real-world cybersecurity.