Unlocking Data with Generative AI and RAG

You're reading from Unlocking Data with Generative AI and RAG: Enhance generative AI systems by integrating internal data with large language models using RAG

Product type: Paperback
Published: Sep 2024
Publisher: Packt
ISBN-13: 9781835887905
Length: 346 pages
Edition: 1st Edition
Author: Keith Bourne

Code lab 5.1 – Securing your keys

This code can be found in the CHAPTER5-1_SECURING_YOUR_KEYS.ipynb file in the CHAPTER_05 directory of the GitHub repository.

In Chapter 2, right after the imports, we included a coding step where you added your OpenAI API key. We noted there that it was a very simple demonstration of how the API key gets into the system, and that it is not a secure way to handle an API key. As your RAG application expands, you will typically have multiple API keys to manage. But even a single OpenAI API key is reason enough to put further security measures in place: anyone who obtains the key can run up expensive bills on your OpenAI account, exposing you to financial risk.

We are going to start this code lab with a very common security practice: keeping your sensitive API keys (and any other secrets) in a separate file that is hidden from your versioning system. You list that file in the versioning system's ignore file so that the secrets are never exposed there, while your code can still read them at runtime and execute properly.

This is the code provided previously for accessing your OpenAI API key:

# OpenAI Setup
os.environ['OPENAI_API_KEY'] = 'sk-###################'
openai.api_key = os.environ['OPENAI_API_KEY']

As mentioned, you are going to need to replace sk-################### with your actual OpenAI API key for the rest of your code to work. But wait, this is not a very secure way to do this! Let’s fix that!

First, let’s create the new file you will use to save your secrets. With the dotenv Python package, you can use .env out of the box. However, in some environments, you may run into system restrictions that prevent you from using a file starting with a dot (.). In those cases, you can still use dotenv, but you have to create a file, name it, and then point dotenv to it. For example, if I cannot use .env, I use env.txt, and that is the file where I store the OpenAI API key. Add the .env file you want to use to your environment and add the API key to the .env file like this:

OPENAI_API_KEY="sk-###################"

This is essentially just a text file with that one line in it. It may not seem like much, but handling the key this way keeps it from being spread across your versioning system, where it would be significantly less secure. As I mentioned in Chapter 2, you have to replace the sk-################### part with your actual API key.

If you are using Git for version control, add the name of your secrets file to your .gitignore file so that, when you commit and push to Git, the file with all your secrets in it is not included! In fact, this is a good time to generate a new OpenAI API key and delete the one you were just using, especially if you think it could show up in the history of your code from before the changes we are implementing in this chapter. Delete the old key and start fresh with a new key in your .env file, so that no active key is ever exposed in your Git versioning system.
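Before committing, it can be reassuring to confirm that the secrets file really is listed in .gitignore. The following is a minimal sketch and not part of the book's notebook: the helper names is_listed_in_gitignore and check_repo are hypothetical, and the check only looks for an exact line match, so it does not interpret the wildcard or negation patterns that Git itself honors.

```python
from pathlib import Path


def is_listed_in_gitignore(secret_file: str, gitignore_text: str) -> bool:
    """Return True if secret_file appears as its own line in the .gitignore text.

    Simplification (assumption): exact match only. Glob patterns such as
    "*.env" and negations such as "!env.txt" are not interpreted.
    """
    for raw_line in gitignore_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Accept both "env.txt" and the root-anchored form "/env.txt".
        if line.lstrip("/") == secret_file:
            return True
    return False


def check_repo(secret_file: str = ".env", repo_dir: str = ".") -> bool:
    """Read .gitignore from repo_dir (if present) and look for secret_file."""
    gitignore = Path(repo_dir) / ".gitignore"
    if not gitignore.exists():
        return False
    return is_listed_in_gitignore(secret_file, gitignore.read_text())
```

For a check that follows Git's actual pattern rules, running git check-ignore on the file from the command line is the more reliable option; the sketch above is only a quick sanity check.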

You can use this file for all keys and similar information you want to keep secret. So, for example, you could have multiple keys in your .env file, such as what you see here:

OPENAI_API_KEY="sk-###################"
DATABASE_PW="########"
LANGSMITH="###################"
AZUREOPENAIKEY="sk-###################"

This example shows multiple keys that we want to keep secret and out of the hands of untrusted users. If a security breach still occurs, you can revoke the API key in your OpenAI account, and likewise revoke any other keys stored in the file. But in general, by not allowing these keys to be copied into your versioning system, you significantly reduce the likelihood of a security breach.
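To make the file format concrete, here is a standard-library-only sketch of how lines of the form NAME="value" can be read into a dictionary. It is an illustration only and is not how python-dotenv is implemented; the function name parse_env_text is hypothetical, and it handles only simple assignments, optional surrounding quotes, blank lines, and # comments.

```python
def parse_env_text(text: str) -> dict:
    """Parse simple NAME="value" lines into a dict (illustrative only)."""
    secrets = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Ignore blank lines, comments, and anything without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        secrets[name.strip()] = value
    return secrets
```

When inspecting the result, it is safer to print only the names (for example, sorted(parse_env_text(text))) than the values, so that secrets do not end up in notebook output.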

Next, you will install python-dotenv at the top of your code, like this (this line is new compared to your code from Chapter 2):

%pip install python-dotenv

You always want to restart your kernel after installing new packages, as in the preceding code. You can review how to do this in Chapter 2. In this case, restarting also refreshes your environment so that it can pull in and recognize the .env file. If you make any changes to the .env file, be sure to restart your kernel so that those changes are pulled into your environment. Without restarting the kernel, your system will likely not be able to find the file and will return an empty string for OPENAI_API_KEY, which will cause your LLM calls to fail.
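Because a missing or empty key otherwise only surfaces later as a failed LLM call, it can help to check for it explicitly. The sketch below uses only the standard library; the helper name require_env is hypothetical and does not come from the book's notebook.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Check that your .env (or env.txt) file exists, "
            "contains the key, and that the kernel was restarted after editing it."
        )
    return value
```

Once the secrets file has been loaded, a call such as require_env("OPENAI_API_KEY") turns a silent empty value into an immediate, readable error.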

Next, you will need to import that same library for use in your code:

from dotenv import load_dotenv, find_dotenv

At this point, you have installed and imported the Python package that will allow you to keep secrets out of your code in a more secure way. Next, we want to use the load_dotenv function you just imported to retrieve the secret so that it can be used in the code. We mentioned earlier, though, that in some environments you may not be able to use a file starting with a dot (.). If you are in that situation, you will have set up the env.txt file rather than the .env file. Based on your situation, choose the appropriate approach from the following:

  • If you are using a .env file, use this:
    _ = load_dotenv(find_dotenv())
  • If you are using an env.txt file, use this:
    _ = load_dotenv(dotenv_path='env.txt')

The .env approach is the most common one, so I wanted to make sure you were familiar with it. But in theory, the env.txt approach works everywhere, including environments that restrict files starting with a dot, which makes it more universal. For this reason, I recommend using the env.txt approach so that your code works in more environments. You only need one of these options in your code. Just make sure you have restarted the kernel after adding the .env or env.txt file so that your code can find the file and use it. We will use the env.txt approach from now on in this book, as we like to practice good security measures whenever possible!

But wait. What is that? Over the horizon, a new security threat is approaching: it’s the dreaded red team!
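If the same notebook has to run in environments with different restrictions, the two options above could be wrapped in one helper that uses whichever file is present. This is a sketch under stated assumptions rather than the book's approach: pick_env_file and load_secrets are hypothetical names, and the lookup order (.env first, then env.txt) is an arbitrary choice. Only load_dotenv and its dotenv_path parameter come from python-dotenv.

```python
from pathlib import Path
from typing import Optional, Sequence


def pick_env_file(candidates: Sequence[str] = (".env", "env.txt"),
                  base_dir: str = ".") -> Optional[Path]:
    """Return the first candidate file that exists in base_dir, else None."""
    for name in candidates:
        path = Path(base_dir) / name
        if path.is_file():
            return path
    return None


def load_secrets(base_dir: str = ".") -> bool:
    """Load whichever secrets file is present; return False if none was loaded."""
    # Imported here so pick_env_file stays usable without python-dotenv installed.
    from dotenv import load_dotenv

    path = pick_env_file(base_dir=base_dir)
    if path is None:
        return False
    # load_dotenv returns True when at least one variable was set.
    return load_dotenv(dotenv_path=path)
```

In a notebook, calling load_secrets() once near the top would take the place of choosing between the two load_dotenv lines by hand.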
