Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!
Retrieval Augmented Generation, or RAG, is a method that can expand the breadth or depth of information for Large Language Models (LLM). Retrieving and delivering more data to an LLM will result in applications with more contextual, relevant information. Unlike traditional web or mobile applications designed to retrieve structured data, RAG (Retrieval-Augmented Generation) requires data to be structured and indexed, and stored differently, most commonly in a vector database. The resulting experience should provide more contextual information with the added ability to cite the source of information. This narrower scope of information can result in a higher degree of accuracy and utility for your enterprise. To summarize RAG:
In short, RAG keeps language models up-to-date and relevant, providing answers that are informed by the latest available data. It's a practical way to ensure AI remains accurate and useful for your organization, especially when dealing with current and evolving topics that may not be public knowledge. In this article, we will explore the differences between RAG and Fine tuning and how you can organize your RAG solution using Azure Machine Learning prompt flow.
When working with large language models through chat interfaces like ChatGPT, you will see that its foundational knowledge is point-in-time data. Fine-tuning, on the other hand, is akin to customizing the model with a new layer of knowledge that reflects your specific data, which becomes part of the model's intelligence. As of this article, OpenAI released GPT 4 Turbo which is based on a data set through April 2023. Extending an LLM’s body of knowledge can involve fine-tuning or RAG.
Fine-tuning involves training a foundational model on a dataset specific to your application, effectively customizing the model to perform better for certain tasks or styles. To fine-tune a model, you need to have a dataset that represents the task or style you are aiming for and the computational resources to perform the training.
Once fine-tuned, the model's knowledge is enhanced, and these changes are permanent unless the model is fine-tuned again with additional data. Fine-tuning is ideal for tasks needing deep customization where the information may be specialized but require re-training infrequently. While OpenAI has started to offer fine-tuning for certain models like GPT-3.5, not all foundational models or versions can be fine-tuned due to access restrictions to their parameters and training regimes.
RAG is like adding a live feed of information to the foundational model, enabling it to respond with the latest data without modifying the model itself. It involves augmenting a foundational language model’s response by dynamically integrating information retrieved from an external database, typically a vector database, at the time of the query.
Image Credit: https://medium.com/@minh.hoque/retrieval-augmented-generation-grounding-ai-responses-in-factual-data-b7855c059322
RAG is widely adopted as the best starting point due to the following factors:
With a solid foundation of RAG vs fine-tuning, we will dive into the details of an approach to Retrieval Augmented Generation within Azure. Azure provides multiple solutions for creating and accessing vector indexes per Microsoft’s latest documentation. Azure offers 3 methods currently:
Azure's approach to RAG lets you tailor the model to your business needs and integrate in private or public facing applications. What remains consistent is the ability to prepare and feed your data into the LLM of your choice. Within Azure Machine Learning Prompt Flow, Microsoft includes a number of practical features including a fact-checking layer alongside the existing model to ensure accuracy. Additionally, you can feed supplementary data directly to your large language models as prompts, enriching their responses with up-to-date and relevant information. Azure Machine Learning simplifies the process to augment your AI powered app with the latest data without the time and financial burdens often associated with comprehensive model retraining. A benefit of using these services is the scalability and security and compliance functions that are native to Azure. A standard feature of Azure Machine Learning for ML models or LLMs is a point and click flow or notebook code interface to build your AI pipelines
1. Data Acquisition and Preparation with Azure Services for Immediate LLM Access:
Azure Blob Storage for Data Storage is perfect for staging your data. These files can be anything from text files to PDFs.
2. Vectorization and Indexing of your Data Using AI Studio and Azure AI Search
This is a step that can be completed using one of multiple approaches including both open source and Azure native. Azure AI Studio significantly simplifies the creation and integration of a vector index for Retrieval-Augmented Generation (RAG) applications. Here are the main steps in the process:
This is one of many examples of how Azure AI Studio is democratizing the use of advanced RAG applications by merging and integrating different services in the Azure cloud together.
SOURCE: Microsoft
3. Constructing RAG Pipelines with Azure Machine Learning:
Simplified RAG Pipeline Creation: With your index created, you can integrate it along with AI search as a plug-and-play component into your Prompt flow. With no/low code interface, you can drag and drop components to create your RAG pipeline.
Image Source: Microsoft.com
Customization with Jupyter Notebooks: For those who are comfortable coding in Jupyter notebooks, Azure ML offers the flexibility to utilize Jupyter Notebooks natively. This will provide more control over the RAG pipeline to fit your project's unique needs. Additionally, there are other alternative flows that you can construct using libraries like LangChain as an alternative to using the Azure services.
3. Manage AI Pipeline Operations
Azure Machine Learning provides a foundation designed for iterative and continuous updates. The full lifecycle for model deployment includes test data generation and prompt evaluation. ML and AI operations are needed to understand certain adjustments. For organizations already running Azure ML, prompt flow fits nicely into broader machine learning operations.
from azureml.core import Experiment
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import PythonScriptStep
# Create an Azure Machine Learning experiment
experiment_name = 'rag_experiment'
experiment = Experiment(ws, experiment_name)
# Define a PythonScriptStep for RAG workflow integration
rag_step = PythonScriptStep(name='RAG Step',
script_name='rag_workflow.py',
compute_target='your_compute_target',
source_directory='your_source_directory',
inputs=[rag_dataset.as_named_input('rag_data')],
outputs=[],
arguments=['--input_data', rag_dataset],
allow_reuse=True)
# Create an Azure Machine Learning pipeline with the RAG step
rag_pipeline = Pipeline(workspace=ws, steps=[rag_step])
# Run the pipeline as an experiment
pipeline_run = experiment.submit(rag_pipeline)
pipeline_run.wait_for_completion(show_output=True)
Here is the code snippets to create and manage data using Azure.
from azureml.core import Dataset
# Assuming you have a dataset named 'rag_dataset' in your Azure Machine Learning workspace
rag_dataset = Dataset.get_by_name(ws, 'rag_dataset')
# Split the dataset into training and testing sets
train_data, test_data = rag_dataset.random_split(percentage=0.8, seed=42)
# Convert the datasets to pandas DataFrames for easy manipulation
train_df = train_data.to_pandas_dataframe()
test_df = test_data.to_pandas_dataframe()
It is important to note that the world of AI and LLMs is evolving at a rapid pace where months make a difference. Azure Machine Learning for Retrieval Augmented Generation offers a transformative approach to leveraging Large Language Models and provides a compelling solution for enterprises that already have a competency center. Azure ML machine learning pipelines for data ingestion, robust training, management, and deployment capabilities for RAG is lowering the barrier for dynamic data integration with LLMs like OpenAI. As adoption continues to grow, we will see lots of exciting new use cases and success stories coming from organizations that adopt early and iterate fast. The benefit of Microsoft Azure is a single, managed and supported suite of services some of which already may be deployed within your organization. Azure services to support new AI adoption demands, Retrieval Augmented Generation included!
Ryan Goodman has dedicated 20 years to the business of data and analytics, working as a practitioner, executive, and entrepreneur. He recently founded DataTools Pro after 4 years at Reliant Funding, where he served as the VP of Analytics and BI. There, he implemented a modern data stack, utilized data sciences, integrated cloud analytics, and established a governance structure. Drawing from his experiences as a customer, Ryan is now collaborating with his team to develop rapid deployment industry solutions. These solutions utilize machine learning, LLMs, and modern data platforms to significantly reduce the time to value for data and analytics teams.