Implementing a RAG-enhanced CookBot

This article is an excerpt from the book, Vector Search for Practitioners with Elastic, by Bahaaldine Azarmi and Jeff Vestal. Optimize your search capabilities in Elastic by operationalizing and fine-tuning vector search, and enhance your search relevance while improving overall search performance.

Introduction

Embark on a culinary journey where innovation meets tradition, as we delve into the intricacies of recipe retrieval with the dynamic trio of BM25, ELSER, and RRF. In the ever-evolving landscape of cooking queries, this article unveils the power of classic full-text search, semantic understanding through ELSER, and robust rank fusion with RRF. Witness the synergy of these techniques as they seamlessly integrate into the RAG system, combining the efficiency of an Elasticsearch retriever with the creativity of the GPT-4 generator. Join us to explore how this transformative blend goes beyond keywords, shaping a personalized, flavorful experience for culinary enthusiasts.

You can find Implementing a RAG-enhanced CookBot - Part 1 here.

Building the retriever—RRF with ELSER

In the context of our recipe retrieval task, our goal is to maximize the relevance of the returned recipes based on a user’s query. We will utilize a combination of classic full-text search (via BM25), semantic search (with ELSER), and a robust rank fusion method (RRF). This combination allows us to handle more complex queries and return results that align closely with the user’s intent.

Let’s consider the following query:

GET recipes/_search
{
 "_source": { "includes": [ "name", "ingredient" ] },
 "sub_searches": [
   {
     "query": {
       "bool": {
         "must": { "match": { "ingredient": "carrot beef" } },
         "must_not": { "match": { "ingredient": "onion" } }
       }
     }
   },
   {
     "query": {
       "text_expansion": { "ml.tokens": {
           "model_id": ".elser_model_1",
           "model_text": "I want a recipe from the US west coast with beef"
         }
       }
     }
   }
 ],
 "rank": {
   "rrf": { "window_size": 50, "rank_constant": 20 }
 }
}

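This query includes two types of search. The first uses a classic Elasticsearch Boolean query to find recipes whose ingredients match carrot or beef, excluding those with onion. Because a match query combines its terms with OR by default, a recipe containing either term qualifies, and recipes containing both generally score higher; adding "operator": "and" to the match clause would require both. This traditional approach ensures that the most basic constraints of the user are met.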

The second sub_search employs ELSER to semantically expand the query I want a recipe from the US west coast with beef. ELSER interprets this request based on its understanding of language, enabling the system to match documents that may not contain the exact phrase but are contextually related. This allows the system to factor in the more nuanced preferences of the user.

The query then employs RRF to combine the results of the two sub_searches. The window_size parameter is set to 50, meaning the top 50 results from each sub-search are considered. The rank_constant parameter, set to 20, controls how much lower-ranked documents contribute: RRF works on ranks rather than raw scores, and each document's fused score is the sum, over the sub_searches in which it appears, of 1 / (rank_constant + rank).
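To make the fusion step concrete, here is a minimal Python sketch of reciprocal rank fusion. It is an illustration only, not Elasticsearch's internal implementation: the function name rrf_fuse and the recipe IDs are made up for the example, and ranks are assumed to start at 1.

```python
from collections import defaultdict


def rrf_fuse(rankings, rank_constant=20, window_size=50):
    """Fuse several ranked lists of document IDs with reciprocal rank fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Only the top window_size hits of each sub-search take part.
        for rank, doc_id in enumerate(ranking[:window_size], start=1):
            scores[doc_id] += 1.0 / (rank_constant + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from the BM25 and ELSER sub-searches.
bm25_hits = ["beef_stew", "carrot_soup", "pot_roast"]
elser_hits = ["tri_tip", "beef_stew", "cioppino"]

fused = rrf_fuse([bm25_hits, elser_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

A document that appears in both lists, such as beef_stew here, accumulates two reciprocal-rank terms and rises to the top even though only one sub-search ranked it first.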

Thus, this query exemplifies the effective combination of BM25, ELSER, and RRF. Exploiting the strengths of each allows CookBot to move beyond simple keyword matches and provide more contextually relevant recipes, improving the user experience and increasing the system’s overall utility.

Leveraging the retriever and implementing the generator

Now that we have our Elasticsearch retriever set up and ready to go, let’s proceed with the final part of our RAG system: the generator. In the context of our application, we’ll use the GPT-4 model as the generator. We’ll implement the generator in our recipe_generator.py module and then integrate it into our Streamlit application.

Building the generator

We will start by creating a RecipeGenerator class. This class is initialized with an OpenAI API key (find out how to get an OpenAI key at https://help.openai.com/en/articles/4936850-where-do-i-find-my-secret-api-key), which is used to authenticate our requests to the GPT-4 model:

import openai
import json
from config import OPENAI_API_KEY

class RecipeGenerator:
   def __init__(self, api_key):
       self.api_key = api_key
       openai.api_key = self.api_key

Next, we define the generate method in the RecipeGenerator class. This method takes a recipe as input and sends it as a prompt to the GPT-4 model, asking it to generate a detailed, step-by-step guide:

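def generate(self, recipe):
   # Method of RecipeGenerator: the recipe is serialized to JSON and sent as the user message.
   prompts = [{"role": "user", "content": json.dumps(recipe)}]
   # The instruction is sent with the system role.
   instruction = {
       "role": "system",
       "content": "Take the recipes information and generate a recipe "
                  "with a mouthwatering intro and a step by step guide."
   }
   prompts.append(instruction)

   # openai.ChatCompletion is the interface of openai-python releases before 1.0.
   generated_content = openai.ChatCompletion.create(
       model="gpt-4",
       messages=prompts,
       max_tokens=1000
   )
   return generated_content.choices[0].message.content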

The prompts are formatted as required by the OpenAI API, and the max_tokens parameter is set to 1000 to limit the length of the generated text. The generated recipe is then returned by the function.
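The listing above relies on openai.ChatCompletion, which is only available in openai-python releases before 1.0. As a hedged sketch, the same call could look like this with a client object from release 1.0 or later. The names build_messages and generate_with_client are hypothetical helpers introduced for illustration, and placing the system message first is a common convention rather than something the original code does.

```python
import json

SYSTEM_INSTRUCTION = (
    "Take the recipes information and generate a recipe "
    "with a mouthwatering intro and a step by step guide."
)


def build_messages(recipe):
    """Return the chat messages: system instruction first, then the recipe as JSON."""
    return [
        {"role": "system", "content": SYSTEM_INSTRUCTION},
        {"role": "user", "content": json.dumps(recipe)},
    ]


def generate_with_client(client, recipe, model="gpt-4", max_tokens=1000):
    """Send the recipe through a client exposing chat.completions.create (openai>=1.0)."""
    response = client.chat.completions.create(
        model=model,
        messages=build_messages(recipe),
        max_tokens=max_tokens,
    )
    return response.choices[0].message.content
```

With the newer package installed, the client would typically be created with `from openai import OpenAI` followed by `client = OpenAI(api_key=OPENAI_API_KEY)`. Passing the client in as an argument also makes the function easy to exercise with a stub during testing.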

Integrating the generator into the Streamlit application

With our RecipeGenerator class ready, we can now integrate it into our Streamlit application in main.py. After importing the necessary modules and initializing the RecipeGenerator class, we will set up the user interface with a text input field:

import streamlit as st

from recipe_generator import RecipeGenerator
from config import OPENAI_API_KEY

# Create the generator once, then render the query box.
generator = RecipeGenerator(OPENAI_API_KEY)
input_text = st.text_input(" ", placeholder="Ask me anything about cooking")

When the user enters a query, we will use the Elasticsearch retriever to get a relevant recipe. We then pass this recipe to the generate method of RecipeGenerator, and the resulting text is displayed in the Streamlit application (see a video example at https://www.linkedin.com/posts/bahaaldine_genai-gpt4-elasticsearch-activity-7091802199315394560-TkPY):

if input_text:
   query = {
           "sub_searches": [
               {
                   "query": {
                       "bool": {
                           "must_not": [
                               {
                                   "match": {
                                       "ingredient": "onion"
                                   }
                               }
                           ]
                       }
                   }
               },
               {
                   "query": {
                       "text_expansion": {
                           "ml.tokens": {
                               "model_id": ".elser_model_1",
                               "model_text": input_text
                           }
                       }
                   }
               }
           ],
           "rank": {
               "rrf": {
                   "window_size": 50,
                   "rank_constant": 20
               }
           }
       }
   recipe = elasticsearch_query(query)
   st.write(recipe)
   st.write(generator.generate(recipe))

The generator thus works in tandem with the retriever to provide a detailed, step-by-step recipe based on the user’s query. This completes our implementation of the RAG system in a Streamlit application, bridging the gap between retrieving relevant information and generating coherent, meaningful responses.
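The request body in main.py is written inline, and the elasticsearch_query helper is not shown in this excerpt. As a rough sketch, the body could be built by a small function, and the top recipe pulled out of a standard Elasticsearch search response. Both build_hybrid_query and first_recipe are hypothetical names, and the sketch assumes the response has the usual hits.hits[]._source layout; the actual helper may behave differently.

```python
def build_hybrid_query(user_text, excluded_ingredient="onion",
                       window_size=50, rank_constant=20):
    """Build the two-sub-search RRF request body for a user's question."""
    return {
        "sub_searches": [
            # Full-text constraint: drop recipes containing the excluded ingredient.
            {"query": {"bool": {"must_not": [
                {"match": {"ingredient": excluded_ingredient}}
            ]}}},
            # Semantic sub-search: ELSER expands the user's text into weighted tokens.
            {"query": {"text_expansion": {"ml.tokens": {
                "model_id": ".elser_model_1",
                "model_text": user_text,
            }}}},
        ],
        "rank": {"rrf": {"window_size": window_size,
                         "rank_constant": rank_constant}},
    }


def first_recipe(response):
    """Return the _source of the top hit, or None when nothing matched."""
    hits = response.get("hits", {}).get("hits", [])
    return hits[0]["_source"] if hits else None
```

Returning None for an empty result lets the Streamlit code show a "no recipe found" message instead of sending an empty prompt to the generator.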

Conclusion

In conclusion, combining BM25, ELSER, and RRF gives CookBot a retrieval layer that goes beyond keyword matching: classic full-text search enforces the user's explicit constraints, ELSER captures the intent behind a natural-language request, and RRF merges the two rankings into a single result list. Paired with the GPT-4 generator, the Elasticsearch retriever completes a RAG system that turns a retrieved recipe into a detailed, step-by-step guide, bridging the gap between information retrieval and creative recipe generation.

Author Bio

Bahaaldine Azarmi, Global VP Customer Engineering at Elastic, guides companies as they leverage data architecture, distributed systems, machine learning, and generative AI. He leads the customer engineering team, focusing on cloud consumption, and is passionate about sharing knowledge to build and inspire a community skilled in AI.

Jeff Vestal has a rich background spanning over a decade in financial trading firms and extensive experience with Elasticsearch. He offers a unique blend of operational acumen, engineering skills, and machine learning expertise. As a Principal Customer Enterprise Architect, he excels at crafting innovative solutions, leveraging Elasticsearch's advanced search capabilities, machine learning features, and generative AI integrations, adeptly guiding users to transform complex data challenges into actionable insights.