Making the best out of Hugging Face Hub using LangChain

Since the launch of ChatGPT in November 2022, everyone has been talking about GPT models and OpenAI. There is no doubt that the Generative Pre-trained Transformer (GPT) architecture developed by OpenAI has demonstrated incredible results, also thanks to the investment in training data (almost 500 billion tokens) and the complexity of the model (175 billion parameters for GPT-3).

Nevertheless, an incredible number of open-source Large Language Models (LLMs) have spread widely in recent months. Below are some examples:

  • Dolly: a 12-billion-parameter LLM developed by Databricks and trained on their ML platform. Source code: https://github.com/databrickslabs/dolly
  • StableLM: a series of LLMs developed by StabilityAI, the company behind the popular image generation model Stable Diffusion. The series encompasses a variety of LLMs, some of which are fine-tuned for specific use cases. Source code: https://github.com/Stability-AI/StableLM
  • Falcon LLM: a 40-billion-parameter LLM developed by the Technology Innovation Institute and trained on a particularly high-quality dataset called RefinedWeb. As of now (June 2023), it ranks first globally on the latest Hugging Face independent evaluation of open-source AI models. Source code: https://huggingface.co/tiiuae
  • GPT-NeoX and GPT-J: open-source reproductions of OpenAI’s GPT series developed by EleutherAI, with 20 and 6 billion parameters respectively. Source code: https://huggingface.co/EleutherAI/gpt-neox-20b and https://huggingface.co/EleutherAI/gpt-j-6b
  • OpenLLaMA: like the previous class of models, this one is an open-source reproduction of Meta AI’s LLaMA, available in 3 and 7 billion parameter versions. Source code: https://github.com/openlm-research/open_llama

If you are interested in digging deeper into those models and their performance, you can refer to the Hugging Face leaderboard here.

Image 1: Hugging Face Leaderboard

Now, LLMs are great, yet to unlock their real power we need them to be embedded within application logic. In other words, we want our LLMs to infuse intelligence into our applications.

For this purpose, we will be using LangChain, a powerful, lightweight SDK that makes it easier to integrate and orchestrate LLMs within applications. LangChain is one of the most popular LLM orchestrators, yet if you want to explore further packages I encourage you to read about Semantic Kernel and Jarvis.

One of the nice things about LangChain is its integration with external tools: these might be OpenAI (and other LLM vendors), data sources, search APIs, and so on. In this article, we are going to explore how LangChain makes it easier to work with open-source LLMs through its integration with the Hugging Face Hub.

Welcome to the realm of open source LLMs

The Hugging Face Hub serves as a comprehensive platform comprising more than 120k models, 20k datasets, and 50k demo apps (Spaces), all of which are openly accessible and shared as open-source projects. It provides an online environment where developers can effortlessly collaborate and collectively develop machine learning solutions.

Thanks to LangChain, it is much easier to start interacting with open-source LLMs. Plus, you can also surround those models with all the components provided by LangChain for prompt design, memory retention, chain management, and so on.
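
For instance, once a model has been initialized (as we will do in the next section), it can be wrapped with LangChain’s conversational memory so that previous exchanges are carried into each new prompt. Here is a minimal sketch, assuming an already-initialized llm object like the one we create below:

from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

# Wrap a Hub-hosted LLM with a simple buffer memory so the chain remembers prior turns
conversation = ConversationChain(llm=llm, memory=ConversationBufferMemory())

conversation.run("Hi, my name is Valentina.")
print(conversation.run("What is my name?"))  # the buffered history lets the model answer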

Let’s see an implementation with Python. To reproduce the code, make sure to have:

  • Python 3.7.1 or higher: you can check your Python version by running python --version in your terminal
  • LangChain installed: you can install it via pip install langchain
  • The huggingface_hub Python package installed: you can install it via pip install huggingface_hub
  • A Hugging Face Hub API key: to get the API key, you can register on the portal here and then generate your secret key.

For this example, I’m going to use the lightest version of Dolly, developed by Databricks and available in three sizes: 3, 7, and 12 billion parameters.

import os
from getpass import getpass

from langchain import HuggingFaceHub

# Read the Hugging Face Hub API key (avoid hard-coding and committing it)
HUGGINGFACEHUB_API_TOKEN = getpass("Hugging Face Hub API token: ")
os.environ["HUGGINGFACEHUB_API_TOKEN"] = HUGGINGFACEHUB_API_TOKEN

# Repository ID of the lightest Dolly model (3 billion parameters)
repo_id = "databricks/dolly-v2-3b"

# Initialize the model through LangChain's Hugging Face Hub integration
llm = HuggingFaceHub(repo_id=repo_id, model_kwargs={"temperature": 0, "max_length": 64})

As you can see from the above code, the only information we need is our Hugging Face Hub API key and the model’s repo ID; LangChain then takes care of initializing our model thanks to its direct integration with the Hugging Face Hub.
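
Before building any chain, we can optionally sanity-check the connection by calling the model directly with a plain string; a minimal sketch using the llm object defined above (the example question is my own):

# Quick sanity check: call the Hub-hosted model directly with a plain string prompt
print(llm("What is the capital of France?"))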

Now that we have initialized our model, it is time to define the structure of the prompt:

from langchain import PromptTemplate, LLMChain
 
template = """Question: {question}
 
Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = LLMChain(prompt=prompt, llm=llm)
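
If you want to inspect the exact text that will be sent to the model, you can render the template yourself; a small sketch using the prompt defined above:

# Render the template with a sample question to see the full prompt the chain will send
print(prompt.format(question="What is the capital of France?"))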

Finally, we can feed our model with a first question:

question = "In the first movie of Harry Potter, what is the name of the three-headed dog?"
print(llm_chain.run(question))
 
Output: The name of the three-headed dog in Harry Potter and the Philosopher Stone is Fuffy.

Even though I tested the lightest version of Dolly, with “only” 3 billion parameters, it came back with pretty accurate results. Of course, for more complex tasks or real-world projects, heavier models might be taken into consideration, like the ones emerging as top performers in the Hugging Face leaderboard mentioned at the beginning of this article.
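
Swapping in one of those heavier models only requires changing the repository ID; below is a hedged sketch pointing the same chain at a Falcon instruction-tuned model (the tiiuae/falcon-7b-instruct repo ID and its availability on the free Inference API are assumptions on my side, so check the Hub before relying on it):

# Point the same LangChain chain at a different (larger) open-source model on the Hub.
# Note: repo ID and Inference API availability are assumptions; verify them on the Hub.
falcon_llm = HuggingFaceHub(
    repo_id="tiiuae/falcon-7b-instruct",
    model_kwargs={"temperature": 0.1, "max_length": 128},
)
falcon_chain = LLMChain(prompt=prompt, llm=falcon_llm)
print(falcon_chain.run(question))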

Conclusion

The realm of open-source LLMs is growing exponentially, and this creates a vibrant environment of experimentation and tuning from which anyone can benefit. Plus, some interesting trends are emerging, like the reduction in the number of model parameters in favour of an increase in the quality of the training dataset. In fact, we saw that the current top performer among open-source models is Falcon LLM, with “only” 40 billion parameters, which gains its strength from its high-quality training dataset. Finally, with the development of orchestration frameworks like LangChain and similar, it is getting easier and easier to leverage open-source LLMs and integrate them into our applications.

Author Bio

Valentina Alto graduated in 2021 in data science. Since 2020, she has been working at Microsoft as an Azure solution specialist, and since 2022, she has been focusing on data and AI workloads within the manufacturing and pharmaceutical industries. She has been working closely with system integrators on customer projects to deploy cloud architecture with a focus on modern data platforms, data mesh frameworks, IoT and real-time analytics, Azure Machine Learning, Azure Cognitive Services (including Azure OpenAI Service), and Power BI for dashboarding. Since commencing her academic journey, she has been writing tech articles on statistics, machine learning, deep learning, and AI in various publications and has authored a book on the fundamentals of machine learning with Python.

Author of the book: Modern Generative AI with ChatGPT and OpenAI Models

Links: Medium | LinkedIn