Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds

Building an LLM-powered App using Snowflake and Streamlit

Save for later
  • 11 min read
  • 30 Jan 2024

article-image

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

Introduction

For years, self-service analytics apps have enabled both information consumers (business users) and information workers (analysts) to meet their need for data assets that aid analysis and problem-solving. These data assets can include ready-made insights and analysis in the form of statistics, visual stories, or formatted data for further discovery. Historically, for an enterprise to embark on creating analytics apps, it required a specialized skillset, technology tools, and a steep learning curve to deliver value.

Three significant trends have shifted how we view analytics apps today:

●  No-code and low-code data acquisition, along with cloud data/warehouse platforms, have helped democratize the data platform.

●  Data platforms like Snowflake are designed to bring analytics computing into a single platform where data no longer needs to be copied and moved.

●  The democratization of machine learning and the widespread availability of powerful generative AI models have changed the entire user experience and expectations for information discovery and natural language exploration.

The result of these trends has accelerated technology cycles and the rate of innovation in unprecedented ways. Prudent technology and business leaders are strained with more requests and fewer resources to use data to build information-focused businesses.

Currently, we have AI app and analytics waves breaking at the same time with different use cases in mind but the same objective. For this article, we wanted to explore the basics of building a simple analytics app inside of Snowflake, allowing an OpenAI interface to execute code without ever accessing any of the resulting data.

Modern Data Cloud and Analytics Technology Tools

Let us explore the process and benefits of building an LLM-powered application using a cloud-based data warehousing platform like Snowflake and an open-source Python library for creating web applications like Streamlit.

 building-an-llm-powered-app-using-snowflake-and-streamlit-img-0

Ref: https://www.snowflake.com/blog/building-python-data-apps-streamlit/

Understanding Snowflake Data Warehousing 

Snowflake is a leading cloud data platform offering secure and scalable solutions for processing and storing data. The architecture of Snowflake allows easy integration with programming languages. It eventually works on data-intensive applications. To work with Snowflake, one must create a Snowflake account to set up the database for data storage.

LLM Powered Inputs and Translation

Every large language model, including GPT-4, is capable of understanding and generating human-like texts based on prompts and inputs it receives. These models are trained on vast datasets, enabling them to comprehend large and complex language patterns and generate contextually relevant responses. An incredible aspect of large language models, particularly GPT-4, is their ability to effectively translate natural language into code, including SQL and Python.

Large language models are not designed for computational procedures like statistics and analytics, but with the right prompting and, most importantly, context, you can streamline many common tasks.

Integration of Snowflake with Python and Streamlit 

Snowpark

In data analysis and machine learning (ML), Python is the most versatile programming language. Snowflake offers a Python connector that enables seamless communication between Snowflake databases and Python scripts. In this article, we are not using Snowpark.

Storyboarding our App

The difference between a good app and a great app lies in the value you create for your user. The secret to building a great app is empowering users to solve problems that would otherwise be painful or impossible due to a lack of skills. The app we are building here demonstrates how to fit technology components together.

Minimum Viable Product Storyboard:

●  End user: Analytics app developer

●  Intent: Demonstrate core tech components

●  Outcome: Have

●  Value: Quickly understand a functional code example without having to research

We will build a native Streamlit app inside of Snowflake:

●  The app will feature a chat interface powered by ChatGPT.

●  The chat history will be written on a Snowflake table.

●  The GPT model will read the results of a simple query, interpret the results, and summarize them in plain English.

Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at $19.99/month. Cancel anytime
building-an-llm-powered-app-using-snowflake-and-streamlit-img-1

Bringing Technology Components Together

For this article, we decided to build a simple end-to-end demonstration of how a native Snowflake app built with Python and Streamlit can utilize a chatbot interface that uses ChatGPT-4 to generate SQL code that can be executed natively in Snowflake with the context of the schema.

Snowflake Integration of ChatGPT Large Language Model API

To receive responses with the help of a large language model, leverage the OpenAI Documentation and Playground. Obtain the OpenAI GPT Key, and then use the following code to interact with a large language model.

-- Step 1 - Create a Secret for open ai key .
CREATE OR REPLACE SECRET open_ai_api_key
TYPE = GENERIC_STRING
SECRET_STRING = '<OPEN_AI_KEY>';

-- Step 2 - Create a Network rule on Snowflake
CREATE OR REPLACE NETWORK RULE openai_network_rule
MODE = EGRESS
TYPE = HOST_PORT
VALUE_LIST = ('api.openai.com');

-- Step 3 Create a EXTERNAL ACCESS INTEGRATION in Snowflake
CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION external_access_int
ALLOWED_NETWORK_RULES = (openai_network_rule)
ALLOWED_AUTHENTICATION_SECRETS = (open_ai_api_key)
ENABLED = true;

-- Step 4 Create a UDF using openai packages . Here we are using "gpt-3.5-turbo" Model
CREATE OR REPLACE FUNCTION CHATGPTv1(query varchar)
RETURNS STRING
LANGUAGE PYTHON
RUNTIME_VERSION = 3.9
HANDLER = 'runner'
EXTERNAL_ACCESS_INTEGRATIONS = (external_access_int)
SECRETS = ('openai_key' = open_ai_api_key)
PACKAGES = ('openai')
AS
$$
import _snowflake
import openai
def runner(QUERY):
    openai.api_key = _snowflake.get_generic_secret_string('openai_key')
    messages = [{"role": "user", "content": QUERY}]
    model="gpt-3.5-turbo"
    response = openai.ChatCompletion.create(model=model,messages=messages,temperature=0,)
    return response.choices[0].message["content"]
$$;
-- Test your UDF
SELECT CHATGPTv1('Hi')

Creation of Streamlit User Experience Interface

To create the Streamlit user experience the following code was utilized to build a very basic functional prototype with GPT3.5 Turbo.

1. Installation:

pip install Streamlit

2. Creation:

from snowflake.snowpark.context import get_active_session
st.set_page_config(layout="wide")
st.title("OPEN AI IN SIS - GPT-3.5-turbo(MODEL)")
st.write("##")
st.write("##")
# Get the current credentials
session = get_active_session()
if 'request_response' not in st.session_state:
    st.session_state['request_response'] = {}
if st.session_state['request_response']:
    for itr in st.session_state['request_response'].keys():
        request_col , request_col1 = st.columns(2)
        response_col1 , response_col = st.columns(2)
        with request_col:
            st.write(f":bust_in_silhouette:  :blue[{itr}]")
        st.write("##")
        with response_col:
            st.write(f":speech_balloon:  :red[{st.session_state['request_response'][itr][0]}]")
col1 ,col2 = st.columns(2)
with col1:
    search_text= st.text_input("Send a message")
    search_button = st.button("Send")
if search_text and search_button:
    search_result = session.sql(f"SELECT CHATGPTv1('{search_text}')").collect()
    if search_result:
        st.session_state['request_response'][search_text] = [search_result[0][0]]
        st.experimental_rerun()

3. Run:

Streamlit run app.py

Moving from MVP to Real-World Application

Real-world analytics apps are designed with a narrow scope, outcome, and value in mind. Let's expand on the same technology components and formulate a real-world use case that will be more impactful to an enterprise. When evaluating real-world business cases to apply Streamlit and OpenAI, focus on use cases that deliver value frequently, to many (or important) people in your organization, and are tied to high-impact business processes.

Data Tape Co-pilot Tool:

●  End user: Financial Analysts, Business Analysts, Data Analysts.

●  Intent: Deliver a data tape with the ability to constrain data to business needs and provide a basic summary.

●  Outcome: End users can download the data tape and receive a plain English summary of key stats (record count, distinct key, constraints in the query contained in the WHERE clause).

●  Value: Provide natural language access to a single, widely used data tape with a clear, plain English explanation of the dataset.

Streamlit Analytics Improves User Adoption and Success with Snowflake 

With a better understanding of Streamlit as a driver for the adoption of Snowflake and the increasing adoption of data assets, let's dig deeper into Streamlit as the conduit for adoption. While Snowflake may be a known entity within your enterprise, few business-facing professionals will ever know they are interfacing with Snowflake, and that is okay. Without more technology tools and platforms, Streamlit opens the doors to Snowflake but most importantly eliminates other tools, platforms, and an additional layer of services to manage. Instead, you can leverage the skills already on hand within most data and analytics teams. Here are some additional features that make Streamlit quite compelling:

●  Simplicity and Ease of Use: Streamlit provides an intuitive API that allows developers to create interactive UI elements with minimal code. Its straightforward syntax enables both beginners and experienced developers to quickly prototype and deploy applications without a steep learning curve.

●  Rapid Prototyping: Streamlit excels at rapid prototyping, enabling developers to iterate quickly on their ideas. With its live reloading feature, developers can see changes in real time as they modify the code. This development speed is crucial for experimenting with different UI layouts and functionalities.

●  Data Exploration and Visualization: Streamlit integrates seamlessly with popular data science libraries . Some of these are Pandas, Matplotlib, and Plotly. This integration allows developers to create dynamic and interactive charts, graphs, and dashboards with minimal effort. Data scientists and analysts can effectively showcase their findings, making it an excellent choice for data exploration and visualization tasks.

●  Customization and Theming: While Streamlit provides a simple interface, it also offers customization options for developers who want to create visually appealing applications. Developers can customize the appearance of their apps, including layout, colors, and themes, to match their brand or specific design preferences.

●  Seamless Integration with Machine Learning and AI Models: Streamlit makes integrating machine learning models, natural language processing tools, and other AI technologies into applications easy. Developers can create interactive interfaces for AI-powered applications, enabling users to interact with complex algorithms and models without understanding the underlying complexities.

●  Sharing and Deployment: Streamlit apps can be easily shared and deployed on various platforms. Whether it's sharing within a team, showcasing a prototype to stakeholders, or deploying a full-fledged application for public use, Streamlit simplifies the process. Streamlit sharing, Streamlit's deployment platform, allows developers to deploy apps with minimal configuration, making them accessible to a broader audience.

●  Active Community and Documentation: Streamlit has a vibrant and active community of developers. The availability of numerous examples, tutorials, and community-contributed components enhances the development experience. Streamlit's comprehensive documentation provides detailed guidance on various aspects of building interactive applications, making it easier for developers to find solutions to their queries.

●  Flexibility and Extensibility: While Streamlit is easy for beginners, it also offers flexibility and extensibility for advanced users. Developers can create custom components and integrate JavaScript functionality when needed, allowing them to extend Streamlit's capabilities based on their requirements.

Conclusion

The integration of Snowflake and Streamlit offers a powerful combination for building analytics and data delivery apps. A single, blended data warehousing solution with intuitive application development can democratize data access, enabling users across an organization to transform complex datasets into palatable, prepared information assets. Though the Snowflake modern data cloud app store is in its infancy, you can jump in today and seize a great opportunity to build powerful data apps. While this article explained a simple GPT API interface, the recent introduction of GPT Assistants API expands the possibilities for even more intelligent, contextual agents running securely running right where you work. I look forward to expanding on this basic prototype to a more intelligent co-pilot experience soon.

Author Bio

Ryan Goodman has dedicated 20 years to the business of data and analytics, working as a practitioner, executive, and entrepreneur. He recently founded DataTools Pro after 4 years at Reliant Funding, where he served as the VP of Analytics and BI. There, he implemented a modern data stack, utilized data sciences, integrated cloud analytics, and established a governance structure. Drawing from his experiences as a customer, Ryan is now collaborating with his team to develop rapid deployment industry solutions. These solutions utilize machine learning, LLMs, and modern data platforms to significantly reduce the time to value for data and analytics teams.