Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon

LLM-powered Chatbots for Financial Queries

Save for later
  • 27 min read
  • 12 Sep 2023

article-image

Introduction

In the ever-evolving realm of digital finance, the convergence of user-centric design and cutting-edge AI technologies is pushing the boundaries of innovation. But with the influx of data and queries, how can we better serve users and provide instantaneous, accurate insights? Within this context, Large Language Models (LLMs) have emerged as a revolutionary tool, providing businesses and developers with powerful capabilities. This hands-on article will walk you through the process of leveraging LLMs to create a chatbot that can query real-time financial data extracted from the NYSE to address users' queries in real time about the current market state. We will dive into the world of LLMs, explore their potential, and understand how they seamlessly integrate with databases using LangChain. Furthermore, we'll fetch real-time data using the finance package offering the chatbot the ability to answer questions using current data.

In this comprehensive tutorial, you'll gain proficiency in diverse aspects of modern software development. You'll first delve into the realm of database interactions, mastering the setup and manipulation of a MySQL database to store essential financial ticker data. Unveil the intricate synergy between Large Language Models (LLMs) and SQL through the innovative LangChain tool, which empowers you to bridge natural language understanding and database operations seamlessly. Moving forward, you'll explore the dynamic fusion of Streamlit and LLMs as you demystify the mechanics behind crafting a user-friendly front. Witness the transformation of your interface using OpenAI's Davinci model, enhancing user engagement with its profound knowledge. As your journey progresses, you'll embrace the realm of containerization, ensuring your application's agility and scalability by harnessing the power of Docker. Grasp the nuances of constructing a potent Dockerfile and orchestrating dependencies, solidifying your grasp on holistic software development practices.

By the end of this guide, readers will be equipped with the knowledge to design, implement, and deploy an intelligent, finance-focused chatbot. This isn't just about blending frontend and backend technologies; it's about crafting a digital assistant ready to revolutionize the way users interact with financial data. Let's dive in!

The Power of Containerization with Docker Compose

Navigating the intricacies of modern software deployment is simplified through the strategic implementation of Docker Compose. This orchestration tool plays a pivotal role in harmonizing multiple components within a local environment, ensuring they collaborate seamlessly.

Docker allows to deploy multiple components seamlessly in a local environment. In our journey, we will use docker-compose to harmonize various components, including MySQL for data storage, a Python script as a data fetcher for financial insights, and a Streamlit-based web application that bridges the gap between the user and the chatbot.

Our deployment landscape consists of several interconnected components, each contributing to the finesse of our intelligent chatbot. The cornerstone of this orchestration is the docker-compose.yml file, a blueprint that encapsulates the deployment architecture, coordinating the services to deliver a holistic user experience.

With Docker Compose, we can efficiently and consistently deploy multiple interconnected services. Let's dive into the structure of our docker-compose.yml:

version: '3'
services:
  db:
    image: mysql:8.0
    environment:
      - MYSQL_ROOT_PASSWORD=root_password
      - MYSQL_DATABASE=tickers_db
      - MYSQL_USER=my_user   # Replace with your desired username
      - MYSQL_PASSWORD=my_pass  # Replace with your desired password
    volumes:
      - ./db/setup.sql:/docker-entrypoint-initdb.d/setup.sql
    ports:
      - "3306:3306"  # Maps port 3306 in the container to port 3306 on the host
 
  ticker-fetcher:
    image: ticker/python
    build:
      context: ./ticker_fetcher
    depends_on:
      - db
    environment:
      - DB_USER=my_user   # Must match the MYSQL_USER from above
      - DB_PASSWORD=my_pass   # Must match the MYSQL_PASSWORD from above
      - DB_NAME=tickers_db
 
  app:
    build:
      context: ./app
    ports:
      - 8501:8501
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    depends_on:
      - ticker-fetcher

Contained within the composition are three distinctive services:

  • db: A MySQL database container, configured with environmental variables for establishing a secure and efficient connection. This container is the bedrock upon which our financial data repository, named tickers_db, is built. A volume is attached to import a setup SQL script, enabling rapid setup.
  • ticker-fetcher: This service houses the heart of our real-time data acquisition system. Crafted around a custom Python image, it plays the crucial role of fetching the latest stock information from Yahoo Finance. It relies on the db service to persistently store the fetched data, ensuring that our chatbot's insights are real-time data.
  • app: The crown jewel of our user interface is the Streamlit application, which bridges the gap between users and the chatbot. This container grants users access to OpenAI's LLM model. It harmonizes with the ticker-fetcher service to ensure that the data presented to users is not only insightful but also dynamic.

Docker Compose's brilliance lies in its capacity to encapsulate these services within isolated, reproducible containers. While Docker inherently fosters isolation, Docker Compose takes it a step further by ensuring that each service plays its designated role in perfect sync.

The docker-compose.yml configuration file serves as the conductor's baton, ensuring each service plays its part with precision and finesse. As you journey deeper into the deployment architecture, you'll uncover the intricate mechanisms powering the ticker-fetcher container, ensuring a continuous flow of fresh financial data. Through the lens of Docker Compose, the union of user-centric design, cutting-edge AI, and streamlined deployment becomes not just a vision, but a tangible reality poised to transform the way we interact with financial data.

Enabling Real-Time Financial Data Acquisition

At the core of our innovative architecture lies the pivotal component dedicated to real-time financial data acquisition. This essential module operates as the engine that drives our chatbot's ability to deliver up-to-the-minute insights from the ever-fluctuating financial landscape.

Crafted as a dedicated Docker container, this module is powered by a Python script that through the yfinance package, retrieves the latest stock information directly from Yahoo Finance. The result is a continuous stream of the freshest financial intelligence, ensuring that our chatbot remains armed with the most current and accurate market data.

Our Python script, fetcher.py, looks as follows:

import os
import time
import yfinance as yf
import mysql.connector
import pandas_market_calendars as mcal
import pandas as pd
import traceback
 
DB_USER = os.environ.get('DB_USER')
DB_PASSWORD = os.environ.get('DB_PASSWORD')
DB_NAME = 'tickers_db'
DB_HOST = 'db'
DB_PORT = 3306
 
def connect_to_db():
    return mysql.connector.connect(
        host=os.getenv("DB_HOST", "db"),
        port=os.getenv("DB_PORT", 3306),
        user=os.getenv("DB_USER"),
        password=os.getenv("DB_PASSWORD"),
        database=os.getenv("DB_NAME"),
    )
 
def wait_for_db():
    while True:
        try:
            conn = connect_to_db()
            conn.close()
            return
        except mysql.connector.Error:
            print("Unable to connect to the database. Retrying in 5 seconds...")
            time.sleep(5)
 
def is_market_open():
    # Get the NYSE calendar
    nyse = mcal.get_calendar('NYSE')
 
    # Get the current timestamp and make it timezone-naive
    now = pd.Timestamp.now(tz='UTC').tz_localize(None)
    print("Now its:",now)
 
    # Get the market open and close times for today
    market_schedule = nyse.schedule(start_date=now, end_date=now)
 
    # If the market isn't open at all today (e.g., a weekend or holiday)
    if market_schedule.empty:
        print('market is empty')
        return False
 
    # Today's schedule
    print("Today's schedule")
 
    # Check if the current time is within the trading hours
    market_open = market_schedule.iloc[0]['market_open'].tz_localize(None)
    market_close = market_schedule.iloc[0]['market_close'].tz_localize(None)
    print("market_open",market_open)
    print("market_close",market_close)
 
    market_open_now = market_open <= now <= market_close
    print("Is market open now:",market_open_now)
    return market_open_now
 
def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]
 
if __name__ == "__main__":
 
    wait_for_db()
    print("-"*50)
    tickers = ["AAPL", "GOOGL"]  # Add or modify the tickers you want
   
    print("Perform backfill once")
    # historical_backfill(tickers)
    data = yf.download(tickers, period="5d", interval="1m", group_by="ticker", timeout=10) # added timeout
    print("Data fetched from yfinance.")
    print("Head")
    print(data.head().to_string())
    print("Tail")
    print(data.head().to_string())
    print("-"*50)
    print("Inserting data")
    ticker_data = []
    for ticker in tickers:
        for idx, row in data[ticker].iterrows():
            ticker_data.append({
                'ticker': ticker,
                'open': row['Open'],
                'high': row['High'],
                'low': row['Low'],
                'close': row['Close'],
                'volume': row['Volume'],
                'datetime': idx.strftime('%Y-%m-%d %H:%M:%S')
            })
    # Insert data in bulk
    batch_size=200
    conn = connect_to_db()
    cursor = conn.cursor()
 
    # Create a placeholder SQL query
    query = """INSERT INTO ticker_history (ticker, open, high, low, close, volume, datetime)
               VALUES (%s, %s, %s, %s, %s, %s, %s)"""
 
    # Convert the data into a list of tuples
    data_tuples = []
    for record in ticker_data:
        for key, value in record.items():
            if pd.isna(value):
                record[key] = None
        data_tuples.append((record['ticker'], record['open'], record['high'], record['low'],
                            record['close'], record['volume'], record['datetime']))
 
    # Insert records in chunks/batches
    for chunk in chunks(data_tuples, batch_size):
        cursor.executemany(query, chunk)
        print(f"Inserted batch of {len(chunk)} records")
 
    conn.commit()
    cursor.close()
    conn.close()
    print("-"*50)
    # Wait until starting to insert live values
    time.sleep(60)
    while True:
        if is_market_open():
            print("Market is open. Fetching data.")
 
            print("Fetching data from yfinance...")
            data = yf.download(tickers, period="1d", interval="1m", group_by="ticker", timeout=10) # added timeout
            print("Data fetched from yfinance.")
            print(data.head().to_string())
           
            ticker_data = []
 
            for ticker in tickers:
                latest_data = data[ticker].iloc[-1]
                ticker_data.append({
                    'ticker': ticker,
                    'open': latest_data['Open'],
                    'high': latest_data['High'],
                    'low': latest_data['Low'],
                    'close': latest_data['Close'],
                    'volume': latest_data['Volume'],
                    'datetime': latest_data.name.strftime('%Y-%m-%d %H:%M:%S')
                })
 
                # Insert the data
                conn = connect_to_db()
                cursor = conn.cursor()
                print("Inserting data")
                total_tickers = len(ticker_data)
                for record in ticker_data:
                    for key, value in record.items():
                        if pd.isna(value):
                            record[key] = "NULL"
                    query = f"""INSERT INTO ticker_history (ticker, open, high, low, close, volume, datetime)
                                VALUES (
                                    '{record['ticker']}',{record['open']},{record['high']},{record['low']},{record['close']},{record['volume']},'{record['datetime']}')"""
                    print(query)
                    cursor.execute(query)
                print("Data inserted")
                conn.commit()
                cursor.close()
                conn.close()
            print("Inserted data, waiting for the next batch in one minute.")
            print("-"*50)
            time.sleep(60)
        else:
            print("Market is closed. Waiting...")
            print("-"*50)
            time.sleep(60)  # Wait for 60 seconds before checking again

Within its code, the script seamlessly navigates through a series of well-defined stages:

  1. Database Connectivity: The script initiates by establishing a secure connection to our MySQL database. With the aid of the connect_to_db() function, a connection is created while the wait_for_db() mechanism guarantees the script's execution occurs only once the database service is fully primed.
  2. Market Schedule Evaluation: Vital to the script's operation is the is_market_open() function, which determines the market's operational status. By leveraging the pandas_market_calendars package, this function ascertains whether the New York Stock Exchange (NYSE) is currently active.
  3. Data Retrieval and Integration: During its maiden voyage, fetcher.py fetches historical stock data from the past five days for a specified list of tickers—typically major entities such as AAPL and GOOGL. This data is meticulously processed and subsequently integrated into the tickers_db database. During subsequent cycles, while the market is live, the script periodically procures real-time data at one-minute intervals.
  4. Batched Data Injection: Handling substantial volumes of stock data necessitates an efficient approach. To address this, the script ingeniously partitions the data into manageable chunks and employs batched SQL INSERT statements to populate our database. This technique ensures optimal performance and streamlined data insertion.

Now let’s discuss the Dockerfile that defines this container. The Dockerfile is the blueprint for building the ticker-fetcher container. It dictates how the Python environment will be set up inside the container.

FROM python:3
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "-u", "fetcher.py"]
  • Base Image: We start with a basic Python 3 image.
  • Working Directory: The working directory inside the container is set to /app.
  • Dependencies Installation: After copying the requirements.txt file into our container, the RUN command installs all necessary Python packages.
  • Starting Point: The script's entry point, fetcher.py, is set as the command to run when the container starts.

A list of Python packages needed to run our fetcher script:

mysql-connector-python
yfinance
pandas-market-calendars
  • mysql-connector-python: Enables the script to connect to our MySQL database.
  • yfinance: Fetches stock data from Yahoo Finance.
  • pandas-market-calendars: Determines the NYSE's operational schedule.

As a symphony of technology, the ticker-fetcher container epitomizes precision and reliability, acting as the conduit that channels real-time financial data into our comprehensive architecture.

llm-powered-chatbots-for-financial-queries-img-0

Through this foundational component, the chatbot's prowess in delivering instantaneous, accurate insights comes to life, representing a significant stride toward revolutionizing the interaction between users and financial data.

With a continuously updating financial database at our disposal, the next logical step is to harness the potential of Large Language Models. The subsequent section will explore how we integrate LLMs using LangChain, allowing our chatbot to transform raw stock data into insightful conversations.

Leveraging Large Language Models with SQL using LangChain

The beauty of modern chatbot systems lies in the synergy between the vast knowledge reservoirs of Large Language Models (LLMs) and real-time, structured data from databases. LangChain is a bridge that efficiently connects these two worlds, enabling seamless interactions between LLMs and databases such as SQL.

The marriage between LLMs and SQL databases opens a world of possibilities. With LangChain as the bridge, LLMs can effectively query databases, offering dynamic responses based on stored data. This section delves into the core of our setup, the utils.py file. Here, we knit together the MySQL database with our Streamlit application, defining the agent that stands at the forefront of database interactions.

LangChain is a library designed to facilitate the union of LLMs with structured databases. It provides utilities and agents that can direct LLMs to perform database operations via natural language prompts. Instead of a user having to craft SQL queries manually, they can simply ask a question in plain English, which the LLM interprets and translates into the appropriate SQL query.

Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at R$50/month. Cancel anytime

Below, we present the code if utils.py, that brings our LLM-database interaction to life:

from langchain import PromptTemplate, FewShotPromptTemplate
from langchain.prompts.example_selector import LengthBasedExampleSelector
from langchain.llms import OpenAI
from langchain.sql_database import SQLDatabase
from langchain.agents import create_sql_agent
from langchain.agents.agent_toolkits import SQLDatabaseToolkit
from langchain.agents.agent_types import AgentType

# Database credentials
DB_USER = 'my_user'
DB_PASSWORD = 'my_pass'
DB_NAME = 'tickers_db'
DB_HOST = 'db'
DB_PORT = 3306
mysql_uri = f"mysql+mysqlconnector://{DB_USER}:{DB_PASSWORD}@{DB_HOST}:{DB_PORT}/{DB_NAME}"

# Initialize the SQLDatabase and SQLDatabaseToolkit
db = SQLDatabase.from_uri(mysql_uri)
toolkit = SQLDatabaseToolkit(db=db, llm=OpenAI(temperature=0))

# Create SQL agent
agent_executor = create_sql_agent(
    llm=OpenAI(temperature=0),
    toolkit=toolkit,
    verbose=True,
    agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
)

# Modified the generate_response function to now use the SQL agent
def query_db(prompt):
    return agent_executor.run(prompt)

Here's a breakdown of the key components:

  1. Database Credentials: These credentials are required to connect our application to the tickers_db database. The MySQL URI string represents this connection.
  2. SQLDatabase Initialization: Using the SQLDatabase.from_uri method, LangChain initializes a connection to our database. The SQLDatabaseToolkit provides a set of tools that help our LLM interact with the SQL database.
  3. Creating the SQL Agent: The SQL agent, in our case a ZERO_SHOT_REACT_DESCRIPTION type, is the main executor that takes a natural language prompt and translates it into SQL. It then fetches the data and returns it in a comprehensible manner. The agent uses the OpenAI model (an instance of LLM) and the aforementioned toolkit to accomplish this.
  4. The query_db function: This is the interface to our SQL agent. Upon receiving a prompt, it triggers the SQL agent to run and then returns the response.

The architecture is now in place: on one end, we have a constant stream of financial data being fetched and stored in tickers_db. On the other, we have an LLM ready to interpret and answer user queries. The user might ask, "What was the closing price of AAPL yesterday?" and our system will seamlessly fetch this from the database and provide a well-crafted response, all thanks to LangChain.

llm-powered-chatbots-for-financial-queries-img-1

In the forthcoming section, we'll discuss how we present this powerful capability through an intuitive interface using Streamlit. This will enable end-users, irrespective of their technical proficiency, to harness the combined might of LLMs and structured databases, all with a simple chat interface.

Building an Interactive Chatbot with Streamlit, OpenAI, and LangChain

In the age of user-centric design, chatbots represent an ideal solution for user engagement. But while typical chatbots can handle queries, imagine a chatbot powered by both Streamlit for its front end and a Large Language Model (LLM) for its backend intelligence. This powerful union allows us to provide dynamic, intelligent responses, leveraging our stored financial data to answer user queries.

The grand finale of our setup is the Streamlit application. This intuitive web interface allows users to converse with our chatbot in natural language, making database queries feel like casual chats. Behind the scenes, it leverages the power of the SQL agent, tapping into the real-time financial data stored in our database, and presenting users with instant, accurate insights.

Let's break down our chatbot's core functionality, designed using Streamlit:

import streamlit as st
from streamlit_chat import message
from streamlit_extras.colored_header import colored_header
from streamlit_extras.add_vertical_space import add_vertical_space
from utils import *

# Now the Streamlit app

# Sidebar contents
with st.sidebar:
    st.title('Financial QnA Engine')
    st.markdown('''
    ## About
    This app is an LLM-powered chatbot built using:
    - Streamlit
    - Open AI Davinci LLM Model
    - LangChain
    - Finance

    ''')
    add_vertical_space(5)
    st.write('Running in Docker!')

# Generate empty lists for generated and past.
## generated stores AI generated responses
if 'generated' not in st.session_state:
    st.session_state['generated'] = ["Hi, how can I help today?"]
## past stores User's questions
if 'past' not in st.session_state:
    st.session_state['past'] = ['Hi!']

# Layout of input/response containers
input_container = st.container()
colored_header(label='', description='', color_name='blue-30')
response_container = st.container()

# User input
## Function for taking user provided prompt as input
def get_text():
    input_text = st.text_input("You: ", "", key="input")
    return input_text

## Applying the user input box
with input_container:
    user_input = get_text()

# Response output
## Function for taking user prompt as input followed by producing AI generated responses
def generate_response(prompt):
    response = query_db(prompt)
    return response

## Conditional display of AI generated responses as a function of user provided prompts
with response_container:
    if user_input:
        response = generate_response(user_input)
        st.session_state.past.append(user_input)
        st.session_state.generated.append(response)
       
    if st.session_state['generated']:
        for i in range(len(st.session_state['generated'])):
            message(st.session_state['past'][i], is_user=True, key=str(i) + '_user',avatar_style='identicon',seed=123)
            message(st.session_state["generated"][i], key=str(i),avatar_style='icons',seed=123)

The key features of the application are in general:

  1. Sidebar Contents: The sidebar provides information about the chatbot, highlighting the technologies used to power it. With the aid of streamlit-extras, we've added vertical spacing for visual appeal.
  2. User Interaction Management: Our chatbot uses Streamlit's session_state to remember previous user interactions. The 'past' list stores user queries, and 'generated' stores the LLM-generated responses.
  3. Layout: With Streamlit's container feature, the layout is neatly divided into areas for user input and the AI response.
  4. User Input: Users interact with our chatbot using a simple text input box, where they type in their query.
  5. AI Response: Using the generate_response function, the chatbot processes user input, fetching data from the database using LangChain and LLM to generate an appropriate response. These responses, along with past interactions, are then dynamically displayed in the chat interface using the message function from streamlit_chat.

Now in order to ensure portability and ease of deployment, our application is containerized using Docker. Below is the Dockerfile that aids in this process:

FROM python:3
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

CMD ["streamlit", "run", "streamlit_app.py"]

This Dockerfile:

  • Starts with a base Python 3 image.
  • Sets the working directory in the container to /app.
  • Copies the requirements.txt into the container and installs the necessary dependencies.
  • Finally, it copies the rest of the application and sets the default command to run the Streamlit application.

Our application depends on several Python libraries which are specified in requirements.txt:

streamlit
streamlit-chat
streamlit-extras
mysql-connector-python
openai==0.27.8
langchain==0.0.225

These include:

  • streamlit: For the main frontend application.
  • streamlit-chat & streamlit-extras: For enhancing the chat interface and adding utility functions like colored headers and vertical spacing.
  • mysql-connector-python: To interact with our MySQL database.
  • openai: For accessing OpenAI's Davinci model.
  • langchain: To bridge the gap between LLMs and our SQL database.

Our chatbot application combines now a user-friendly frontend interface with the immense knowledge and adaptability of LLMs.

llm-powered-chatbots-for-financial-queries-img-2

We can verify that the values provided by the chatbot actually reflect the data in the database:

llm-powered-chatbots-for-financial-queries-img-3

In the next section, we'll dive deeper into deployment strategies, ensuring users worldwide can benefit from our financial Q&A engine.

Conclusion

As we wrap up this comprehensive guide, we have used from raw financial data to a fully-fledged, AI-powered chatbot, we've traversed a myriad of technologies, frameworks, and paradigms to create something truly exceptional.

We initiated our expedition by setting up a MySQL database brimming with financial data. But the real magic began when we introduced LangChain to the mix, establishing a bridge between human-like language understanding and the structured world of databases. This amalgamation ensured that our application could pull relevant financial insights on the fly, using natural language queries. Streamlit stood as the centerpiece of our user interaction. With its dynamic and intuitive design capabilities, we crafted an interface where users could communicate effortlessly. Marrying this with the vast knowledge of OpenAI's Davinci LLM, our chatbot could comprehend, reason, and respond, making financial inquiries a breeze. To ensure that our application was both robust and portable, we harnessed the power of Docker. This ensured that irrespective of the environment, our chatbot was always ready to assist, without the hassles of dependency management.

We've showcased just the tip of the iceberg. The bigger picture is captivating, revealing countless potential applications. Deploying such tools in firms could help clients get real-time insights into their portfolios. Integrating such systems with personal finance apps can help individuals make informed decisions about investments, savings, or even daily expenses. The true potential unfolds when you, the reader, take these foundational blocks and experiment, innovate, and iterate. The amalgamation of LLMs, databases, and interactive interfaces like Streamlit opens a realm of possibilities limited only by imagination.

As we conclude, remember that in the world of technology, learning is a continuous journey. What we've built today is a stepping stone. Challenge yourself, experiment with new data sets, refine the user experience, or even integrate more advanced features. The horizon is vast, and the opportunities, endless. Embrace this newfound knowledge and craft the next big thing in financial technology. After all, every revolution starts with a single step, and today, you've taken many. Happy coding!

Author Bio

Alan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young, and Globant, and now holds a data engineer position at Ebiquity Media helping the company to create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later on earned a Master's degree from the faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.

LinkedIn