Evaluating the output with cosine similarity
In this section, we will implement cosine similarity to measure how similar the user input is to the generative AI model’s output. We will also measure the similarity between the augmented user input and the model’s output. Let’s first define a cosine similarity function based on TF-IDF vectors:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def calculate_cosine_similarity(text1, text2):
    # Build a TF-IDF matrix with one row per text, using a shared vocabulary
    vectorizer = TfidfVectorizer()
    tfidf = vectorizer.fit_transform([text1, text2])
    # Compare row 0 (text1) with row 1 (text2); the result is a 1x1 matrix
    similarity = cosine_similarity(tfidf[0:1], tfidf[1:2])
    return similarity[0][0]
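Before applying the function to real prompts, it can help to see how it behaves at the extremes. The following is a minimal sketch, not part of the original notebook: the sample strings are hypothetical, and the function is repeated only so that the snippet runs on its own. Because TF-IDF compares words rather than meaning, two texts score high only when they share vocabulary.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def calculate_cosine_similarity(text1, text2):
    # Same logic as the function defined above, repeated for a standalone run
    vectorizer = TfidfVectorizer()
    tfidf = vectorizer.fit_transform([text1, text2])
    return cosine_similarity(tfidf[0:1], tfidf[1:2])[0][0]

# Hypothetical sample texts (assumptions for illustration only)
base = "define a rag store"

# Identical texts: the two TF-IDF vectors point in the same direction
identical = calculate_cosine_similarity(base, "define a rag store")

# No shared words: the two vectors are orthogonal
disjoint = calculate_cosine_similarity(base, "vector databases index embeddings")

# Partial word overlap ("rag", "store"): the score falls between the extremes
partial = calculate_cosine_similarity(base, "a rag store holds retrievable documents")

print(f"identical: {identical:.3f}")
print(f"disjoint:  {disjoint:.3f}")
print(f"partial:   {partial:.3f}")
```

The identical pair scores approximately 1, the pair with no shared words scores 0, and the partially overlapping pair lands in between. A related answer phrased with different words can therefore receive a low score from this metric.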
Then, let’s calculate a score that measures the similarity between the user prompt and GPT-4’s response:
similarity_score = calculate_cosine_similarity(user_prompt, gpt4_response)
print(f"Cosine Similarity Score: {similarity_score:.3f}")
The score is low, although the output seemed...