Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon

How-To Tutorials - LLM

80 Articles
article-image-supercharge-your-business-applications-with-azure-openai
Aroh Shukla
10 Oct 2023
8 min read
Save for later

Supercharge Your Business Applications with Azure OpenAI

Aroh Shukla
10 Oct 2023
8 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!IntroductionThe rapid advancement of technology, particularly in the domain of extensive language models like ChatGPT, is making waves across industries. These models leverage vast data resources and cloud computing power, gaining popularity not just among tech enthusiasts but also mainstream consumers. As a result, there's a growing demand for such experiences in daily tools, both from employees and customers who increasingly expect AI integration. Moreover, this technology promises transformative impacts. This article explores how organizations can harness Azure OpenAI with low-code platforms to leverage these advancements, opening doors to innovative applications and solutions.Introduction to the Power PlatformPower Platform is a Microsoft Low Code platform that spans Microsoft 365, Azure, Dynamics 365, and standalone apps.  a) Power Apps: Rapid low-code app development for businesses with a simple interface, scalable data platform, and cross-device compatibility.b) Power Automate: Automates workflows between apps, from simple tasks to enterprise-grade processes, accessible to users of all technical levels.c) Power BI: Delivers data insights through visualizations, scaling across organizations with built-in governance and security.d) Power Virtual Agents: Creates chatbots with a no-code interface, streamlining integration with other systems through Power Automate.e) Power Pages: Enterprise-grade, low-code SaaS for creating and hosting external-facing websites, offering customization and user-friendly design.Introduction to Azure OpenAIThe collaboration between Microsoft's Azure cloud platform and OpenAI, known as Azure OpenAI Open, presents an exciting opportunity for developers seeking to harness the capabilities of cutting-edge AI models and services. This collaboration facilitates the creation of innovative applications across a spectrum of domains, ranging from natural language processing to AI-powered solutions, all seamlessly integrated within the Azure ecosystem.The rapid pace of technological advancement, characterized by the proliferation of extensive language models like ChatGPT, has significantly altered the landscape of AI. These models, by tapping into vast data resources and leveraging the substantial computing power available in today's cloud infrastructure, have gained widespread popularity. Notably, this technological revolution has transcended the boundaries of tech enthusiasts and has become an integral part of mainstream consumer experiences.As a result, organizations can anticipate a growing demand for AI-driven functionalities within their daily tools and services. Employees seek to enhance their productivity with these advanced capabilities, while customers increasingly expect seamless integration of AI to improve the quality and efficiency of the services they receive.Beyond meeting these rising expectations, the collaboration between Azure and OpenAI offers the potential for transformative impacts. This partnership enables developers to create applications that can deliver tangible and meaningful outcomes, revolutionizing the way businesses operate and interact with their audiences. Azure OpenAI Prerequisites These prerequisites enable you to leverage Azure OpenAI's capabilities for your projects.1. Azure Account: Sign up for an Azure account free or paid: https://azure.microsoft.com/2. Azure Subscription: Acquire an active Azure subscription. 3. Azure OpenAI  Request Form: Follow this form https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xUOFA5Qk1UWDRBMjg0WFhPMkIzTzhKQ1dWNyQlQCN0PWcu . 4. API Endpoint: Know your API endpoint for integration. 5. Azure SDKs: Familiarize with Azure SDKs for seamless development: https://docs.microsoft.com/azure/developer/python/azure-sdk-overview  Step to Supercharge your business applications with Azure OpenAIStep 1: Azure OpenAI Instance and keys 1.      Create a new Azure OpenAI Instance. Select region that is available for Azure OpenAI and name the instance accordingly. 2.      Once Azure OpenAI instanced is provisioned, select he Explore button 3.      Under Management, select Deployment and select Create new deployment  4.      Select gpt-35-turbo and give a deployment name a meaningful name.  5.     Under playground select deployment and select View code.   6.      In this sample code, copy the endpoint and key in a notepad file that you will use in the next steps. Step 2: Create a Power Apps Canvas app1.      Create a Power Apps Canvas App with Tablet format.2.      Add textboxes for questions, a button that will trigger Power Automate, and gallery for output from Power Automate via Azure OpenAI service: 3.      Connect Flow by Selecting the Power Automate icon, selecting Add flow and selecting Create new flow   4.      Select a blank flow Step 3: Power Automate1.      In Power Automate, name the flow, next step search for HTTP action. Take note HTTP action is a Premium connector.   2.      Next, configure the HTTP action with this step.a.      Method: POSTb.      URL: You copied this at Step 1.6c.      Endpoint:  You copied this at Step 1.6d.      Body: Follow the Microsoft Azure OpenAI documentation https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/chatgpt?pivots=programming-language-chat-completionse.      Select Add dynamic content, double click on Ask in PowerApps and new parameter HTTP_Body has been createdf.       Use Power Apps Parameter. 3.      In the Next Step, search for Compose, use Body as a parameter, and save the flow. 4.      Run the flow  5.      Copy the outputs of the Compose action to Notepad. You will use it in the next steps later.  6.      In the Next Step, search for Parse JSON, in Add dynamic content locate Body, and drag to Content  7.      Select Generate from sample  8.      Paste the content that you did in previous Step 3.5.  9.      In the next step, search for Response action, in Add dynamic content select Choices  5.      Save the flow.  Step 4: Write up the Canvas app with Power Automate1.      Set variables at button control  2.      Use Gallery Control to display output from Power Automate via Azure OpenAI. Step 5: Test the App1.      Ask a question in the Power Apps user interface as shown below 2.      After a few seconds, we get a response that comes from Azure OpenAI. ConclusionYou've successfully configured a comprehensive integration involving three key services to enhance your application's capabilities:1. Power Apps: This service serves as the user-facing visual interface, providing an intuitive and user-friendly experience for end-users. With Power Apps, you can design and create interactive applications tailored to your specific needs, empowering users to interact with your system effortlessly.2. Power Automate: For establishing seamless connectivity between your application and Azure OpenAI, Power Automate comes into play. It acts as the bridge that enables data and process flow between different services. With Power Automate, you can automate workflows, manage data, and trigger actions, ensuring a smooth interaction between your application and Azure OpenAI's services.3. Azure OpenAI: Leveraging the capabilities of Azure OpenAI, particularly the ChatGPT 3.5 Turbo service, opens up a world of advanced natural language processing and AI capabilities. This service enables your application to understand and respond to user inputs, making it more intelligent and interactive. Whether it's for chatbots, text generation, or language understanding, Azure OpenAI's powerful tools can significantly enhance your application's functionality.By integrating these three services seamlessly, you've created a robust ecosystem for your application. Users can interact with a visually appealing and user-friendly interface powered by Power Apps, while Power Automate ensures that data and processes flow smoothly behind the scenes. Azure OpenAI, with its advanced language capabilities, adds a layer of intelligence to your application, making it more responsive and capable of understanding and generating human-like text.This integration not only improves user experiences but also opens up possibilities for more advanced and dynamic applications that can understand, process, and respond to natural language inputs effectively.Source CodeYou can rebuild the entire solution at my GitHub Repo at  https://github.com/aarohbits/AzureOpenAIWithPowerPlatform/blob/main/01%20AzureOpenAIPowerPlatform/Readme.md Author BioAroh Shukla is a Microsoft Most Valuable Professional (MVP) Alumni and a Microsoft Certified Trainer (MCT) with expertise in Power Platform and Azure. He assists customers from various industries in their digital transformation endeavors, crafting tailored solutions using advanced Microsoft technologies. He is not only dedicated to serving students and professionals on their Microsoft technology journeys but also excels as a community leader with strong interpersonal skills, active listening, and a genuine commitment to community progress.He possesses a deep understanding of the Microsoft cloud platform and remains up-to-date with the latest innovations. His exceptional communication abilities enable him to effectively convey complex technical concepts with clarity and conciseness, making him a valuable resource in the tech community.
Read more
  • 0
  • 0
  • 105

article-image-question-answering-in-langchain
Mostafa Ibrahim
10 Oct 2023
8 min read
Save for later

Question Answering in LangChain

Mostafa Ibrahim
10 Oct 2023
8 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!IntroductionImagine seamlessly processing vast amounts of data, posing any question, and receiving eloquently crafted answers in return. While Large Language Models like ChatGPT excel with general data, they falter when it comes to your private information—data you'd rather not broadcast to the world. Enter LangChain: it empowers us to harness any NLP model, refining it with our exclusive data.In this article, we'll explore LangChain, a framework designed for building applications with language models. We'll guide you through training a model, specifically the OpenAI Chat GPT, using your selected private data. While we provide a structured tutorial, feel free to adapt the steps based on your dataset and model preferences, as variations are expected and encouraged. Additionally, we'll offer various feature alternatives that you can incorporate throughout the tutorial.The Need for Privacy in Question AnsweringUndoubtedly, the confidentiality of personalized data is of absolute importance. While companies amass vast amounts of data daily, which offers them invaluable insights, it's crucial to safeguard this information. Disclosing such proprietary information to external entities could jeopardize the company's competitive edge and overall business integrity.How Does the Fine-Tuning LangChain Process WorkStep 1: Identifying The Appropriate Data SourceBefore selecting the right dataset, it's essential to ask a few preliminary questions. What specific topic are you aiming to inquire about? Is the data set sufficient? And such.Step 2: Integrating The Data With LangChainBased on the dataset's file format you have, you'll need to adopt different methods to effectively import the data into LangChain.Step 3: Splitting The Data Into ChunksTo ensure efficient data processing, it's crucial to divide the dataset into smaller segments, often referred to as chunks.Step 4: Transforming The Data Into EmbeddingsEmbedding is a technique where words or phrases from the vocabulary are mapped to vectors of real numbers. The idea behind embeddings is to capture the semantic meaning and relationships of words in a lower-dimensional space than the original representation.Step 5: Asking Queries To Our ModelFinally, after training our model on the updated documentation, we can directly query it for any information we require.Full LangChain ProcessDataSet UsedLangChain's versatility stems from its ability to process varied datasets. For our demonstration, we utilize the "Giskard Documentation", a comprehensive guide on the Giskard framework.Giskard is an open-source testing framework for Machine Learning models, spanning various Python model types. It automatically detects vulnerabilities in ML models, generates domain-specific tests, and integrates open-source QA best practices.Having said that, LangChain can seamlessly integrate with a myriad of other data sources, be they textual, tabular, or even multimedia, expanding its use-case horizons.Setting Up and Using LangChain for Private Question AnsweringStep 1: Installing The Necessary LibrariesAllows the first step of building any machine learning model, we will have to set up our environment, making sure!pip install langchain !pip install openai !pip install pypdf !pip install tiktoken !pip install faiss-gpuStep 2: Importing Necessary Librariesfrom langchain.llms import OpenAI from langchain.chat_models import ChatOpenAI from langchain.document_loaders import PyPDFLoader from langchain.vectorstores import FAISS from langchain.embeddings.openai import OpenAIEmbeddings from langchain.chains import LLMChain from langchain.prompts import PromptTemplate from langchain.chains.qa_with_sources import load_qa_with_sources_chain import openai import osStep 3: Importing OpenAI API Keyos.environ['OPENAI_API_KEY'] = "Insert your OpenAI key here"Step 4: Loading Our Data SetLandChain offers the capability to load data in various formats. In this article, we'll focus on loading data in PDF format but will also touch upon other popular formats such as CSV and File Directory. For details on other file formats, please refer to the LangChain Documentation.Loading PDF DataWe've compiled the Giskard AI tool's documentation into a PDF and subsequently partitioned the data.loader = PyPDFLoader("/kaggle/input/giskard-documentation/Giskard Documentation.pdf")pages = loader.load_and_split()Below are the code snippets if you prefer to work with either CSV or File Directory file formats.Loading CSV Datafrom langchain.document_loaders.csv_loader import CSVLoader loader = CSVLoader(“Insert the path to your CSV dataset here”) data = loader.load()Loading File Directory Datafrom langchain.document_loaders import DirectoryLoader loader = DirectoryLoader('../', glob="**/*.md") docs = loader.load()Step 5: Indexing The DatasetWe will be creating an index using FAISS (Facebook AI Similarity Search), which is a library developed by Facebook AI for efficiently searching similarities in large datasets, especially used with vectors from machine learning models.We will be converting those documents into vector embeddings using OpenAIEmbeddings(). This indexed data can then be used for efficient similarity searches later on.faiss_index = FAISS.from_documents(pages, OpenAIEmbeddings())Here are some alternative indexing options you might consider.Indexing using Pineconeimport osimport pineconefrom langchain.schema import Documentfrom langchain.embeddings.openai import OpenAIEmbeddingsfrom langchain.vectorstores import Pinecone pinecone.init(    api_key=os.environ["PINECONE_API_KEY"], environment=os.environ["PINECONE_ENV"]) embeddings = OpenAIEmbeddings() pinecone.create_index("langchain-self-retriever-demo", dimension=1536)Indexing using Chromaimport os import getpass from langchain.schema import Document from langchain.embeddings.openai import OpenAIEmbeddings from langchain.vectorstores import Chroma os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:") embeddings = OpenAIEmbeddings() vectorstore = Chroma.from_documents(docs, embeddings)Step 6: Asking The Model Some QuestionsThere are multiple methods by which we can retrieve our data from our model.Similarity SearchIn the context of large language models (LLMs) and natural language processing, similarity search is often about finding sentences, paragraphs, or documents that are semantically similar to a given sentence or piece of text.query = "What is Giskard?" docs = faiss_index.similarity_search(query) print(docs[0].page_content)Similarity Search Output: Why Giskard?Giskard is an open-source testing framework dedicated to ML models, covering any Python model, from tabular to LLMs.Testing Machine Learning applications can be tedious. Since ML models depend on data, testing scenarios depend on the domain specificities and are often infinite. Where to start testing? Which tests to implement? What issues to cover? How to implement the tests?At Giskard, we believe that Machine Learning needs its own testing framework. Created by ML engineers for ML engineers, Giskard enables you to:Scan your model to find dozens of hidden vulnerabilities: The Giskard scan automatically detects vulnerability issues such as performance bias, data leakage, unrobustness, spurious correlation, overconfidence, underconfidence, unethical issue, etc. Instantaneously generate domain-specific tests: Giskard automatically generates relevant tests based on the vulnerabilities detected by the scan. You can easily customize the tests depending on your use case by defining domain-specific data slicers and transformers as fixtures of your test suites.Leverage the Quality Assurance best practices of the open-source community: The Giskard catalog enables you to easily contribute and load data slicing & transformation functions such as AI-based detectors (toxicity, hate, etc.), generators (typos, paraphraser, etc.), or evaluators. Inspired by the Hugging Face philosophy, the aim of Giskard is to become.LLM Chainsmodel = OpenAI(model_name="gpt-3.5-turbo") my_chain = load_qa_with_sources_chain(model, chain_type="refine") query = "What is Giskard?" documents = faiss_index.similarity_search(query) result = my_chain({"input_documents": pages, "question": query})LLM Chain Output:Based on the additional context provided, Giskard is a Python package or library that provides tools for wrapping machine learning models, testing, debugging, and inspection. It supports models from various machine learning libraries such as HuggingFace, PyTorch, TensorFlow, or Scikit-learn. Giskard can handle classification, regression, and text generation tasks using tabular or text data.One notable feature of Giskard is the ability to upload models to the Giskard server. Uploading models to the server allows users to compare their models with others using a test suite, gather feedback from colleagues, debug models effectively in case of test failures, and develop new tests incorporating additional domain knowledge. This feature enables collaborative model evaluation and improvement. It is worth highlighting that the provided context mentions additional ML libraries, including Langchain, API REST, and LightGBM, but their specific integration with Giskard is not clearly defined.Sources:Giskard Documentation.pdfAPI Reference (for Dataset methods)Kaggle: /kaggle/input/giskard-documentation/Giskard Documentation.pdfConclusionLangChain effectively bridges the gap between advanced language models and the need for data privacy. Throughout this article, we have highlighted its capability to train models on private data, ensuring both insightful results and data security. One thing is for sure though, as AI continues to grow, tools like LangChain will be essential for balancing innovation with user trust.Author BioMostafa Ibrahim is a dedicated software engineer based in London, where he works in the dynamic field of Fintech. His professional journey is driven by a passion for cutting-edge technologies, particularly in the realms of machine learning and bioinformatics. When he's not immersed in coding or data analysis, Mostafa loves to travel.Medium
Read more
  • 0
  • 0
  • 167

article-image-build-an-ai-based-personal-financial-advisor-with-langchain
Louis Owen
09 Oct 2023
11 min read
Save for later

Build an AI-based Personal Financial Advisor with LangChain

Louis Owen
09 Oct 2023
11 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!IntroductionManaging personal finances is a cumbersome process. Receipts, bank statements, credit card bills, and expense records accumulate quickly. Despite our best intentions, many of us often struggle to maintain a clear overview of our financial health. It's not uncommon to overlook expenses or procrastinate on updating our budgets. Inevitably, this leads to financial surprises and missed opportunities for financial growth.Even when we diligently track our expenses, the data can be overwhelming. Analyzing every transaction to identify patterns, pinpoint areas for improvement, and set financial goals is no easy feat. It's challenging to answer questions like, "Am I spending too much on entertainment?" or "Is my investment portfolio well-balanced?" or "Should I cut back on dining out?" or "Do I need to limit socializing with friends?" and "Is it time to increase my investments?"Imagine having a personal assistant capable of automating these financial tasks, effortlessly transforming your transaction history into valuable insights. What if, at the end of each month, you received comprehensive financial analyses and actionable recommendations? Thanks to the rapid advancements in generative AI, this dream has become a reality, made possible by the incredible capabilities of LLM. No more endless hours of spreadsheet tinkering or agonizing over budgeting apps.In this article, we'll delve into the development of a personal financial advisor powered by LangChain. This virtual assistant will not only automate the tracking of your finances but also provide tailored recommendations based on your unique spending patterns and financial goals.Building an AI-powered personal financial advisor is an exciting endeavor. Here's an overview of how the personal financial advisor operates:Data Input: Users upload their personal transaction history, which includes details of income, expenses, investments, and savings.Data Processing: LangChain with LLM in the backend will process the data, categorize expenses, identify trends, and compare your financial activity with your goals and benchmarks.Financial Analysis: The advisor generates a detailed financial analysis report, highlighting key insights such as spending habits, saving potential, and investment performance.Actionable Recommendations: The advisor will also provide actionable recommendations for the user. It can suggest adjustments to your budget, recommend investment strategies, and even propose long-term financial plans.The benefits of having an AI-powered personal financial advisor are numerous:Time-Saving: No more tedious data entry and manual budget tracking. The advisor handles it all, giving you more time for what matters most in your life.Personalized Insights: The advisor tailors recommendations based on your unique financial situation, ensuring they align with your goals and aspirations.Financial Confidence: With regular updates and guidance, you gain a better understanding of your financial health and feel more in control of your money.Long-Term Planning: The advisor’s ability to provide insights into long-term financial planning ensures you're well-prepared for your future.Without wasting any more time, let’s take a deep breath, make yourselves comfortable, and be ready to learn how to build your AI-based personal financial advisor with LangChain!What is LangChain?LangChain is developed to harness the incredible potential of LLM, LangChain enables the creation of applications that are not only context-aware but also capable of reasoning, all while maintaining a user-friendly and modular approach.LangChain is more than just a framework; it's a paradigm shift in the world of language model-driven applications. Here's a closer look at what makes LangChain a transformative force:Context-Aware Applications: LangChain empowers applications to be context-aware. This means that these applications can connect to various sources of context, such as prompt instructions, few-shot examples, or existing content, to enhance the depth and relevance of their responses. Whether you're seeking answers or recommendations, LangChain ensures that responses are firmly grounded in context.Reasoning Abilities: One of LangChain's standout features is its ability to reason effectively. It leverages language models to not only understand context but also make informed decisions. These decisions can range from determining how to answer a given question based on the provided context to deciding what actions to take next. LangChain doesn't just provide answers; it understands the "why" behind them.Why LangChain?The power of LangChain lies in its value propositions, which make it an indispensable tool for developers and businesses looking to harness the potential of language models:Modular Components: LangChain offers a comprehensive set of abstractions for working with language models. These abstractions are not only powerful but also modular, allowing developers to work with them seamlessly, whether they're using the entire LangChain framework or not. This modularity simplifies the development process and promotes code reuse.Off-the-Shelf Chains: LangChain provides pre-built, structured assemblies of components known as "off-the-shelf chains." These chains are designed for specific high-level tasks, making it incredibly easy for developers to kickstart their projects. Whether you're a seasoned AI developer or a newcomer, these pre-configured chains save time and effort.Customization and Scalability: While off-the-shelf chains are fantastic for quick starts, LangChain doesn't restrict you. The framework allows for extensive customization, enabling developers to tailor existing chains to their unique requirements or even create entirely new ones. This flexibility ensures that LangChain can accommodate a wide range of applications, from simple chatbots to complex AI systems.LangChain isn't just a run-of-the-mill framework; it's a versatile toolkit designed to empower developers to create sophisticated language model-powered applications. At the heart of LangChain is a set of interconnected modules, each serving a unique purpose. These modules are the building blocks that make LangChain a powerhouse for AI application development.Model I/O: At the core of LangChain's capabilities is its ability to interact seamlessly with language models. This module facilitates communication with these models, enabling developers to leverage their natural language processing prowess effortlessly.Retrieval: LangChain recognizes that real-world applications require access to relevant data. The Retrieval module allows developers to integrate application-specific data sources into their projects, enhancing the context and richness of responses.Chains: Building upon the previous modules, Chains bring structure and order to the development process. Developers can create sequences of calls, orchestrating interactions with language models and data sources to achieve specific tasks or goals.Agents: Let chains choose which tools to use given high-level directives. Agents take the concept of automation to a new level. They allow chains to make intelligent decisions about which tools to employ based on high-level directives. This level of autonomy streamlines complex processes and enhances application efficiency.Memory: Memory is vital for continuity in applications. This module enables LangChain applications to remember and retain their state between runs of a chain, ensuring a seamless user experience and efficient data handling.Callbacks: Transparency and monitoring are critical aspects of application development. Callbacks provide a mechanism to log and stream intermediate steps of any chain, offering insights into the inner workings of the application and facilitating debugging.Building the Personal Financial AdvisorLet’s start building our personal financial advisor with LangChain! For the sake of simplicity, let’s consider only three data sources: monthly credit card statements, bank account statements, and cash expense logs. The following is an example of the data format for each of the sources. ## Monthly Credit Card Statement Date: 2023-09-01 Description: Grocery Store Amount: $150.00 Balance: $2,850.00 Date: 2023-09-03 Description: Restaurant Dinner Amount: $50.00 Balance: $2,800.00 Date: 2023-09-10 Description: Gas Station Amount: $40.00 Balance: $2,760.00 Date: 2023-09-15 Description: Utility Bill Payment Amount: $100.00 Balance: $2,660.00 Date: 2023-09-20 Description: Salary Deposit Amount: $3,000.00 Balance: $5,660.00 Date: 2023-09-25 Description: Online Shopping Amount: $200.00 Balance: $5,460.00 Date: 2023-09-30 Description: Investment Portfolio Contribution Amount: $500.00 Balance: $4,960.00 ## Bank Account Statement Date: 2023-08-01 Description: Rent Payment Amount: $1,200.00 Balance: $2,800.00 Date: 2023-08-05 Description: Grocery Store Amount: $200.00 Balance: $2,600.00 Date: 2023-08-12 Description: Internet and Cable Bill Amount: $80.00 Balance: $2,520.00 Date: 2023-08-15 Description: Freelance Gig Income Amount: $700.00 Balance: $3,220.00 Date: 2023-08-21 Description: Dinner with Friends Amount: $80.00 Balance: $3,140.00 Date: 2023-08-25 Description: Savings Account Transfer Amount: $300.00 Balance: $3,440.00 Date: 2023-08-29 Description: Online Shopping Amount: $150.00 Balance: $3,290.00 ## Cash Expense Log Date: 2023-07-03 Description: Coffee Shop Amount: $5.00 Balance: $95.00 Date: 2023-07-10 Description: Movie Tickets Amount: $20.00 Balance: $75.00 Date: 2023-07-18 Description: Gym Membership Amount: $50.00 Balance: $25.00 Date: 2023-07-22 Description: Taxi Fare Amount: $30.00 Balance: -$5.00 (Negative balance indicates a debt) Date: 2023-07-28 Description: Bookstore Amount: $40.00 Balance: -$45.00 Date: 2023-07-30 Description: Cash Withdrawal Amount: $100.00 Balance: -$145.00To create our personal financial advisor, we’ll use the chat model interface provided by LangChain. There are several important components to build a chatbot with LangChain:`chat model`: Chat models are essential for creating conversational chatbots. These models are designed to generate human-like responses in a conversation. You can choose between chat models and LLMs (Large Language Models) depending on the tone and style you want for your chatbot. Chat models are well-suited for natural, interactive conversations.`prompt template`: Prompt templates help you construct prompts for your chatbot. They allow you to combine default messages, user input, chat history, and additional context to create meaningful and dynamic conversations. Using prompt templates makes it easier to generate responses that flow naturally in a conversation.`memory`: Memory in a chatbot context refers to the ability of the bot to remember information from previous parts of the conversation. This can be crucial for maintaining context and providing relevant responses. Memory types can vary depending on your use case, and they can include short-term and long-term memory.`retriever` (optional): Retrievers are components that help chatbots access domain-specific knowledge or retrieve information from external sources. If your chatbot needs to provide detailed, domain-specific information, a retriever can be a valuable addition to your system.First, we need to set the API key for our LLM. We’ll use OpenAI in this example.import os os.environ["OPENAI_API_KEY"] = “your openai key”Then, we can simply load the necessary chat modules from LangChain. from langchain.schema import ( AIMessage,    HumanMessage,    SystemMessage ) from langchain.chat_models import ChatOpenAIThe ChatOpenAI is the main class that connects with the OpenAI LLM. We can pass `HumanMessage` and `SystemMessage` to this class and it will return the response from the LLM in the type of `AIMessage`.chat = ChatOpenAI(model_name=”gpt-3.5-turbo”) messages = [SystemMessage(content=prompt),                    HumanMessage(content=data)] chat(messages)Let’s see the following example where we pass the prompt along with the data and the LLM returns the response via the ChatOpenAI object. Boom! We just got our first analysis and recommendation from our personal financial advisor. This is a very simple example of how to create our personal financial advisor. Of course, there’s still a lot of room for improvement. For example, currently, we need to pass manually the relevant data sources as the HumanMessage. However, as mentioned before, LangChain provides a built-in class to perform retrieval. This means that we can just create another script to automatically dump all of the relevant data into some sort of document or even database, and then LangChain can directly read the data directly from there. Hence, we can get automated reports every month without needing to manually input the relevant data.ConclusionCongratulations on keeping up to this point! Throughout this article, you have learned what is LangChain, what it is capable of, and how to build a personal financial advisor with LangChain. Hope the best for your experiment in creating your personal financial advisor and see you in the next article!Author BioLouis Owen is a data scientist/AI engineer from Indonesia who is always hungry for new knowledge. Throughout his career journey, he has worked in various fields of industry, including NGOs, e-commerce, conversational AI, OTA, Smart City, and FinTech. Outside of work, he loves to spend his time helping data science enthusiasts to become data scientists, either through his articles or through mentoring sessions. He also loves to spend his spare time doing his hobbies: watching movies and conducting side projects. Currently, Louis is an NLP Research Engineer at Yellow.ai, the world’s leading CX automation platform. Check out Louis’ website to learn more about him! Lastly, if you have any queries or any topics to be discussed, please reach out to Louis via LinkedIn.
Read more
  • 0
  • 0
  • 2126

article-image-kickstarting-your-journey-with-azure-openai
Shankar Narayanan
06 Oct 2023
8 min read
Save for later

Kickstarting your journey with Azure OpenAI

Shankar Narayanan
06 Oct 2023
8 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!IntroductionArtificial intelligence is more than just a buzzword now. The transformative force of AI is driving innovation, reshaping industries, and opening up new world possibilities.At the forefront of this revolution stands Azure Open AI. It is a collaboration between Open AI and Microsoft's Azure cloud platform. For the business, developers, and AI enthusiasts, it is an exciting opportunity to harness the incredible potential of AI.Let us know how we can kick-start our journey with Azure Open AI.Understanding Azure Open AI Before diving into the technical details, it is imperative to understand the broader context of artificial intelligence and why Azure OpenAI is a game changer.From recommendation systems on any streaming platform like Netflix to voice-activated virtual assistants like Alexa or Siri, AI is everywhere around us.Businesses use artificial intelligence to enhance customer experiences, optimize operations, and gain a competitive edge. Here, Azure Open AI stands as a testament to the growing importance of artificial intelligence. The collaboration between Microsoft and Open AI brings the cloud computing prowess of Microsoft with the state-of-the-art AI model of Open AI. It results in the creation of AI-driven services and applications with remarkable ease.Core components of Azure Open AI One must understand the core components of Azure OpenAI.●  Azure cloud platformIt is a cloud computing platform of Microsoft. Azure offers various services, including databases, virtual machines, and AI services. One can achieve reliability, scalability, and security with Azure while making it ideal for AI deployment and development.●  Open AI's GPT modelsOpen AI's GPT models reside at the heart of Open AI models. The GPT3 is the language model that can understand context, generate human-like text, translate languages, write codes, or even answer questions. Such excellent capabilities open up many possibilities for businesses and developers.After getting an idea of Azure Open AI, here is how you can start your journey with Azure Open AI.Integrating Open AI API in Azure Open AI projects.To begin with the Azure Open AI journey, one has to access an Azure account. You can always sign up for a free account if you don't have one. It comes with generous free credits to help you get started.You can set up the Azure OpenAI environment only by following these steps as soon as you have your Azure account.●  Create a new projectYou can start creating new projects as you log into your Azure portal. This project would be the workspace for all your Azure Open AI endeavors. Make it a point to choose the configuration and services that perfectly align with the AI project requirements.Yes, Azure offers a wide range of services. Therefore, one can tailor their project to meet clients' specific needs.●  Access Open AI APIThe seamless integration with the powerfulAPI of Open AI is one of the standout features of Azure Open AI. To access, you must obtain an API key for the OpenAi platform. One has to go through these steps:Visit the Open AI platform website.Create an account if you don't have one, or simply sign-inAfter logging in, navigate to the API section to follow the instructions to get the API key.Ensure you keep your API key secure and safe since it grants access to Open AI language models. However, based on usage, it can incur charges.●  Integrate Open AI APIWith the help of the API key, one can integrate Open AI API with the Azure project.Here is a Python code demonstrating how to generate text using Open AI's GPT-3 model.import openai openai.api_key = 'your_openai_api_key' response = openai.Completion.create( engine="text-davinci-003", prompt="Once upon a time,", max_tokens=150 ) print(response.choices[0].text.strip())The code is one of the primary examples. But it helps to showcase the power and the simplicity of integrating Azure Open AI into the project. One can utilize this capability to answer questions, generate content, and more depending on the specific needs.Best practices and tips for successAzureOpenAI offers the most powerful platform for AI development. However, success in projects related to artificial intelligence requires adherence to best practices and thoughtful approaches.Let us explore some tips that can help us navigate our journey effectively.● Data qualityEvery AI model relies on high-quality data. Hence, one must ensure that the input data is well structured, clean, and represents the problem one tries to solve. Such quality of data directly impacts the reliability and accuracy of AI applications.●  Experiment and iterateThe development process is iterative. Make sure you experiment with tweak parameters and different prompts, and try out new ideas. Each iteration brings valuable insights while moving you closer to your desired outcome.● Optimise and regularly monitorThe artificial intelligence model based on machine learning certainly benefits from continuous optimization and monitoring. As you regularly evaluate your model's performance, you will be able to identify the areas that require improvement and fine-tune your approach accordingly. Such refinement ensures that your AI application stays adequate and relevant over time.● Stay updated with trends.Artificial intelligence is dynamic, with new research findings and innovative practices emerging regularly. One must stay updated with the latest trends and research papers and regularly use the best energy practices. Continuous learning would help one to stay ahead in the ever-evolving world of artificial intelligence technology.Real-life applications of Azure Open AIHere are some real-world applications of Azure OpenAI.●  AI-powered chatbotsOne can utilize Azure OpenAI to create intelligent chatbots that streamline support services and enhance customer interaction. Such chatbots provide a seamless user experience while understanding natural language.Integrating Open AI into a chatbot for natural language processing through codes:# Import necessary libraries for Azure Bot Service and OpenAI from flask import Flask, request from botbuilder.schema import Activity import openai # Initialize your OpenAI API key openai.api_key = 'your_openai_api_key' # Initialize the Flask app app = Flask(__name__) # Define a route for the chatbot @app.route("/api/messages", methods=["POST"]) def messages():    # Parse the incoming activity from the user    incoming_activity = request.get_json()    # Get user's message    user_message = incoming_activity["text"]    # Use OpenAI API to generate a response    response = openai.Completion.create(        engine="text-davinci-003",        prompt=f"User: {user_message}\nBot:",        max_tokens=150    )    # Extract the bot's response from OpenAI API response    bot_response = response.choices[0].text.strip()    # Create an activity for the bot's response    bot_activity = Activity(        type="message",        text=bot_response    )    # Return the bot's response activity    return bot_activity.serialize() # Run the Flask app if __name__ == "__main__":    app.run(port=3978)●  Content generationEvery content creator can harness this for generating creative content, drafting articles, or even brainstorming ideas. The ability of the AI model helps in understanding the context, ensuring that the generated content matches the desired tone and theme.Here is how one can integrate Azure OpenAI for content generation.# Function to generate blog content using OpenAI API def generate_blog_content(prompt):    response = openai.Completion.create(        engine="text-davinci-003",        prompt=prompt,        max_tokens=500    )    return response.choices[0].text.strip() # User's prompt for generating blog content user_prompt = "Top technology trends in 2023:" # Generate blog content using Azure OpenAI generated_content = generate_blog_content(user_prompt) # Print the generated content print("Generated Blog Content:") print(generated_content)The 'generate_blog_content' function takes the user prompts as input while generating blog content related to the provided prompt.●  Language translation servicesAzureOpenAI can be employed for building extensive translation services, helping translate text from one language to another. It opens the door for global communication and understanding. ConclusionAzure Open AI empowers businesses and developers to leverage the power of artificial intelligence in the most unprecedented ways. Whether generating creative content or building intelligent chatbots, Azure Open AI provides the necessary resources and tools to bring your ideas to life.Experiment, Learn, and Innovate with Azure AI.Author BioShankar Narayanan (aka Shanky) has worked on numerous different cloud and emerging technologies like Azure, AWS, Google Cloud, IoT, Industry 4.0, and DevOps to name a few. He has led the architecture design and implementation for many Enterprise customers and helped enable them to break the barrier and take the first step towards a long and successful cloud journey. He was one of the early adopters of Microsoft Azure and Snowflake Data Cloud. Shanky likes to contribute back to the community. He contributes to open source is a frequently sought-after speaker and has delivered numerous talks on Microsoft Technologies and Snowflake. He is recognized as a Data Superhero by Snowflake and SAP Community Topic leader by SAP.
Read more
  • 0
  • 0
  • 110

article-image-build-a-language-converter-using-google-bard
Aryan Irani
06 Oct 2023
7 min read
Save for later

Build a Language Converter using Google Bard

Aryan Irani
06 Oct 2023
7 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!IntroductionIn this blog, we will be taking a look at building a language converter inside of a Google Sheet using Google Bard. We are going to achieve this using the PaLM API and Google Apps Script.We are going to use a custom function inside which we pass the origin language of the sentence, followed by the target language and the sentence you want to convert. In return, you will get the converted sentence using Google Bard.Sample Google SheetFor this blog, I will be using a very simple Google Sheet that contains the following columns:Sentence that has to be convertedOrigin language of the sentenceTarget Language of the sentenceConverted SentenceIf you want to work with the Google Sheet, click here. Once you make a copy of the Google Sheet you have to go ahead and change the API key in the Google Apps Script code.Step 1: Get the API keyCurrently, PaLM API hasn’t been released for public use but to access it before everybody does, you can apply for the waitlist by clicking here. If you want to know more about the process of applying for MakerSuite and PaLM API, you can check the YouTube tutorial below.Once you have access, to get the API key, we have to go to MakerSuite and go to the Get API key section. To get the API key, follow these steps:Go to MakerSuite or click here.On opening the MakerSuite you will see something like this3. To get the API key go ahead and click on Get API key on the left side of the page.4. On clicking the Get API key, you will see something like this where you can create your API key.5. To create the API key go ahead and click on Create API key in the new project.On clicking Create API Key, in a few seconds, you will be able to copy the API key.Step 2: Write the Automation ScriptWhile you are in the Google Sheet, let’s open up the Script Editor to write some Google Apps Script. To open the Script Editor, follow these steps:1. Click on Extensions and open the Script Editor.2. This brings up the Script Editor as shown below.We have reached the script editor lets code.Now that we have the Google Sheet and the API key ready, lets go ahead and write the Google Apps Script to integrate the custom function inside the Google Sheet.function BARD(sentence,origin_language,target_lanugage) { var apiKey = "your_api_key"; var apiUrl = "https://generativelanguage.googleapis.com/v1beta2/models/text-bison-001:generateText"; We start out by opening a new function BARD() inside which we will declare the API key that we just copied. After declaring the API key we go ahead and declare the API endpoint that is provided in the PaLM API documentation. You can check out the documentation by checking out the link given below.We are going to be receiving the prompt from the Google Sheet from the BARD function that we just created.Generative Language API | PaLM API | Generative AI for DevelopersThe PaLM API allows developers to build generative AI applications using the PaLM model. Large Language Models (LLMs)…developers.generativeai.googlevar url = apiUrl + "?key=" + apiKey; var headers = {   "Content-Type": "application/json" };Here we create a new variable called url inside which we combine the API URL and the API key, resulting in a complete URL that includes the API key as a parameter. The headers specify the type of data that will be sent in the request which in this case is “application/json”.var prompt = {     'text': "Convert this sentence"+ sentence + "from"+origin_language + "to"+target_lanugage   } var requestBody = {   "prompt": prompt }Now we come to the most important part of the code which is declaring the prompt. For this blog, we will be designing the prompt in such a way that we get back only the converted sentence. This prompt will accept the variables from the Google Sheet and in return will give the converted sentence.Now that we have the prompt ready, we go ahead and create an object that will contain this prompt that will be sent in the request to the API. var options = {   "method": "POST",   "headers": headers,   "payload": JSON.stringify(requestBody) };Now that we have everything ready, it's time to define the parameters for the HTTP request that will be sent to the PaLM API endpoint. We start out by declaring the method parameter which is set to POST which indicates that the request will be sending data to the API.The headers parameter contains the header object that we declared a while back. Finally, the payload parameter is used to specify the data that will be sent in the request.These options are now passed as an argument to the UrlFetchApp.fetch function which sends the request to the PaLM API endpoint, and returns the response that contains the AI generated text. var response = UrlFetchApp.fetch(url, options); var data = JSON.parse(response.getContentText()); var output = data.candidates[0].output; Logger.log(output); return output;In this case, we just have to pass the url and options variable inside the UrlFetchApp.fetch function. Now that we have sent a request to the PaLM API endpoint we get a response back. In order to get an exact response we are going to be parsing the data.The getContentText() function is used to extract the text content from the response object. Since the response is in JSON format, we use the JSON.parse function to convert the JSON string into an object.The parsed data is then passed to the final variable output, inside which we get the first response out of multiple other drafts that Bard generates for us. On getting the first response, we return the output back to the Google Sheet.Our code is complete and good to go.Step 3: Check the outputIt's time to check the output and see if the code is working according to what we expected. To do that go ahead and save your code and run the BARD() function.On running the code, let's go back to the Google Sheet, use the custom function, and pass the prompt inside it.Here I have passed the original sentence, followed by the Origin Language and the Target Language.On successful execution, we can see that Google Bard has successfully converted the sentences using the PaLM API and Google Apps Script.ConclusionThis is just another interesting example of how we can Integrate Google Bard into Google Workspace using Google Apps Script and the PaLM API. I hope you have understood how to use the PaLM API and Google Apps Script to create a custom function that acts as a Language converter. You can get the code from the GitHub link below.Google-Apps-Script/Bard_Lang.js at master · aryanirani123/Google-Apps-ScriptCollection of Google Apps Script Automation scripts written and compiled by Aryan Irani. …github.comFeel free to reach out if you have any issues/feedback at aryanirani123@gmail.com.Author BioAryan Irani is a Google Developer Expert for Google Workspace. He is a writer and content creator who has been working in the Google Workspace domain for three years. He has extensive experience in the area, having published 100 technical articles on Google Apps Script, Google Workspace Tools, and Google APIs.
Read more
  • 0
  • 0
  • 165

article-image-google-bard-everything-you-need-to-know-to-get-started
Sangita Mahala
05 Oct 2023
6 min read
Save for later

Google Bard: Everything You Need to Know to Get Started

Sangita Mahala
05 Oct 2023
6 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!IntroductionGoogle Bard, a generative AI conversational chatbot. Initially from the LaMDA family of large language models and later the PaLM LLMs. It will follow your instructions and complete your requests in a better way. Bard uses its knowledge to answer your questions in an informative manner. It can generate different creative text formats, like poems, code, scripts, emails, letters, etc. Currently, it is available in 238 countries and 46 languages.Bard is a powerful tool which can be used for many different things, including:Writing and editing contentResearching and learning new thingsTranslating languagesGenerating new creative ideasAnswering questionsWhat is Google Bard and How does it work?Bard is a large language model, commonly referred to as a chatbot or conversational AI, that has been programmed to be extensive and informative. It is able to communicate and generate human-like text in response to a wide range of prompts and questions.Bard operates by using a method known as deep learning. Artificial neural networks are used in deep learning to learn from data. Deep learning is a subset of machine learning. The structure and operation of the human brain served as the inspiration for neural networks, which are capable of learning intricate patterns from data.The following illustrates how Bard works:You enter a command or query into Bard.The input or query is processed by Bard's neural network, which then produces a response.The reply from Bard is then shown to you.How to get started with Bard?It’s pretty simple and easy to get started using Google Bard. The following steps will let you know how you will be able to start your journey in Google Bard.Step-1Go to the Google Bard website by clicking this link: https://bard.google.com/Step-2Go to the top right corner of your screen and then click on the “Sign in” button.Step-3Once you are signed in, you can start typing your prompts or queries into the text box at the bottom of the screen.Step-4Your prompt or query will trigger Bard to produce an answer, which you may then read and evaluate.For Example:You can provide the prompt to Google Bard such as “Write a 'Hello, World!' program in the Rust programming language.”Prompt:Common Bard commands and how to use themGoogle Bard does not have any specific commands, but you can use certain keywords and phrases to get Bard to perform certain tasks. For example, you can use the following keywords and phrases to get Bard to:Create many creative text forms, such as "Write a script for...", "Write a poem about...", and "Write a code for...".Bard will perfectly answer your questions in a comprehensive and informative way: "What is the capital of India?", "How do I build a website?", "What are the benefits of using Bard?" Examples of what Bard can do and how to use it for specific tasksHere are some examples of what Bard can do and how to use it for specific tasks:Generate creative text formats: Bard allows you to generate a variety of unique text formats, including code, poems, emails, and letters. You have to simply input the required format, followed by a question or query, to get the things done. For example, to generate an email to your manager requesting a raise in salary, you would type “Write an email to your manager asking for a raise."Prompt:Answer your questions in a comprehensive and informative way: No matter how difficult or complex your query might be, Bard can help you to find the solution. So you have to simply enter your query into the text box, and Bard will generate a response likewise. For example, to ask Bard what is the National Flower of India is, you would type "What is the National Flower of India?".Prompt: Translate languages: Bard will allow to convert text between different languages. To do this, simply type the text that you want to translate into the text box, followed by the provided language from your side. For example, to translate the sentence: I am going to the store to buy some groceries into Hindi, you would type "I am going to the store to buy some groceries to Hindi".Prompt:How Bard Can Be Used for Different PurposesA writer can use Bard to generate new concepts for fresh stories or articles or to summarize research results.Students can utilize Bard to create essays and reports, get assistance with their assignments, and master new topics.A business owner can use Bard to produce marketing materials, develop chatbots for customer support, or perform data analysis.Software developers can use Bard to produce code, debug applications, or find solutions for technical issues.The future of Google BardGoogle Bard is probably going to get increasingly more capable and adaptable as it keeps developing and acquiring knowledge. It might be utilized to produce new works of art and entertainment, advance scientific research, and provide solutions to some of the most critical global problems.It's also crucial to remember that Google Bard is not the only significant language model being developed. Similar technologies are being created as well by various other companies, such as Microsoft and OpenAI. This competition is likely to drive innovation and lead to even more powerful and sophisticated language models in the future.Overall, the future of Google Bard and other substantial language models seems quite bright overall. These innovations have the power to completely transform the way we study, work, and produce. There is no doubt that these technologies have the potential to improve the world, but it is necessary to use them effectively.ConclusionGoogle Bard is a powerful AI tool that has the potential to be very beneficial, including writing and editing content, researching and learning new things, translating languages, generating new creative ideas, and answering questions. Being more productive and saving time are two of the greatest advantages of utilizing Google Bard. This can free up your time so that you can concentrate on other things, including developing new ideas or enhancing your skills. Bard can assist you in finding and understanding the data quickly and easily because it has access to a wide amount of information. It has the potential to be a game-changer for many people. If you are looking for a way to be more productive, I encourage you to try using Bard.Author BioSangita Mahala is a passionate IT professional with an outstanding track record, having an impressive array of certifications, including 12x Microsoft, 11x GCP, 2x Oracle, and LinkedIn Marketing Insider Certified. She is a Google Crowdsource Influencer and IBM champion learner gold. She also possesses extensive experience as a technical content writer and accomplished book blogger. She is always Committed to staying with emerging trends and technologies in the IT sector.
Read more
  • 0
  • 0
  • 130
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at €18.99/month. Cancel anytime
article-image-build-a-clone-of-yourself-with-large-language-models-llms
Louis Owen
05 Oct 2023
13 min read
Save for later

Build a Clone of Yourself with Large Language Models (LLMs)

Louis Owen
05 Oct 2023
13 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!Introduction"White Christmas," a standout sci-fi episode from the Black Mirror series, serves as a major source of inspiration for this article. In this episode, we witness a captivating glimpse into a potential future application of Artificial Intelligence (AI), particularly considering the timeframe when the show was released. The episode introduces us to "Cookies," digital replicas of individuals that piqued the author's interest.A "Cookie" is a device surgically implanted beneath a person's skull, meticulously replicating their consciousness over the span of a week. Subsequently, this replicated consciousness is extracted and transferred into a larger, egg-shaped device, which can be connected to a computer or tablet for various purposes.Back when this episode was made available to the public in approximately 2014, the concept seemed far-fetched, squarely in the realm of science fiction. However, what if I were to tell you that we now have the potential to create our own clones akin to the "Cookies" using Large Language Models (LLMs)? You might wonder how this is possible, given that LLMs primarily operate with text. Fortunately, we can bridge this gap by extending the capabilities of LLMs through the integration of a Text-to-Speech module.There are two primary approaches to harnessing LLMs for this endeavor: fine-tuning your own LLM and utilizing a general-purpose LLM (whether open-source or closed-source). Fine-tuning, though effective, demands a considerable investment of time and resources. It involves tasks such as gathering and preparing training data, fine-tuning the model through multiple iterations until it meets our criteria, and ultimately deploying the final model into production. Conversely, general LLMs have limitations on the length of input prompts (unless you are using an exceptionally long-context model like Antropic's Claude). Moreover, to fully leverage the capabilities of general LLMs, effective prompt engineering is essential. However, when we compare these two approaches, utilizing general LLMs emerges as the easier path for creating a Proof of Concept (POC). If the aim is to develop a highly refined model capable of replicating ourselves convincingly, then fine-tuning becomes the preferred route.In the course of this article, we will explore how to harness one of the general LLMs provided by AI21Labs and delve into the art of creating a digital clone of oneself through prompt engineering. While we will touch upon the basics of fine-tuning, we will not delve deeply into this process, as it warrants a separate article of its own.Without wasting any more time, let’s take a deep breath, make yourselves comfortable, and be ready to learn how to build a clone of yourself with LLM!Glimpse of Fine-tuning your LLMAs mentioned earlier, we won't delve into the intricate details of fine-tuning a Large Language Model (LLM) to achieve our objective of building a digital clone of ourselves. Nevertheless, in this section, we'll provide a high-level overview of the steps involved in creating such a clone through fine-tuning an LLM.1. Data CollectionThe journey begins with gathering all the relevant data needed for fine-tuning the LLM. This dataset should ideally comprise our historical conversational data, which can be sourced from various platforms like WhatsApp, Telegram, LINE, email, and more. It's essential to cast a wide net and collect as much pertinent data as possible to ensure the model's accuracy in replicating our conversational style and nuances.2. Data PreparationOnce the dataset is amassed, the next crucial step is data preparation. This phase involves several tasks:●  Data Formatting: Converting the collected data into the required format compatible with the fine-tuning process.●  Noise Removal: Cleaning the dataset by eliminating any irrelevant or noisy information that could negatively impact the model's training.●  Resampling: In some cases, it may be necessary to resample the data to ensure a balanced and representative dataset for training.3. Model TrainingWith the data prepared and in order, it's time to proceed to the model training phase. Modern advances in deep learning have made it possible to train LLMs on consumer-grade GPUs, offering accessibility and affordability, such as via QLoRA. During this stage, the LLM learns from the provided dataset, adapting its language generation capabilities to mimic our conversational style and patterns.4. Iterative RefinementFine-tuning an LLM is an iterative process. After training the initial model, we need to evaluate its performance. This evaluation may reveal areas for improvement. It's common to iterate between model training and evaluation, making incremental adjustments to enhance the model's accuracy and fluency.5. Model EvaluationThe evaluation phase is critical in assessing the model's ability to replicate our conversational style and content accurately. Evaluations may include measuring the model's response coherence, relevance, and similarity to our past conversations.6. DeploymentOnce we've achieved a satisfactory level of performance through multiple iterations, the next step is deploying the fine-tuned model. Deploying an LLM is a complex task that involves setting up infrastructure to host the model and handle user requests. An example of a robust inference server suitable for this purpose is Text Generation Inference. You can refer to my other article for this. Deploying the model effectively ensures that it can be accessed and used in various applications.Building the Clone of Yourself with General LLMLet’s start learning how to build the clone of yourself with general LLM through prompt engineering! In this article, we’ll use j2-ultra, the biggest and most powerful model provided by AI21Labs. Note that AI21Labs gives us a free trial for 3 months with $90 credits. These free credits is very useful for us to build a POC for this project. The first thing we need to do is to create the prompt and test it in the playground. To do this, you can log in with your AI21Labs account and go to the AI21Studio. If you don’t have an account yet, you can create one by just following the steps provided on the web. It’s very straightforward. Once you’re on the Studio page, go to the Foundation Models page and choose the j2-ultra model. Note that there are three foundation models provided by AI21Labs. However, in this article, we’ll use j2-ultra which is the best one.Once we’re in the playground, we can experiment with the prompt that we want to try. Here, I provided an example prompt that you can start with. What you need to do is to adjust the prompt with your own information. Louis is an AI Research Engineer/Data Scientist from Indonesia. He is a continuous learner, friendly, and always eager to share his knowledge with his friends. Important information to follow: - His hobbies are writing articles and watching movies - He has 3 main strengths: strong-willed, fast-learner, and effective. - He is currently based in Bandung, Indonesia. - He prefers to Work From Home (WFH) compared to Work From Office - He is currently working as an NLP Engineer at Yellow.ai. - He pursued a Mathematics major in Bandung Institute of Technology. - The reason why he loves NLP is that he found it interesting where one can extract insights from the very unstructured text data. - He learns Data Science through online courses, competitions, internship, and side-projects. - For technical skills, he is familiar with Python, Tableau, SQL, R, Google Big Query, Git, Docker, Design Thinking, cloud service (AWS EC2), Google Data Studio, Matlab, SPSS - He is a Vegan since 2007! He loves all vegan foods except tomatoes. User: Hi, what's up? Louis: Hey, doing good here! How are u? User: All's good. Just wondering, I knew that you're into NLP, can you please give me some recommendation on how to learn? Louis: Sure thing man! I personally learned Data Science through online courses, competitions, internship, and side-projects. However, my top recommendation is to create your own personal projects and joining competitions. You can learn a lot from those! User: Nice. What personal projects to start? Louis: You can start with the topic that you're really interested at. For example, if you're interested at soccer, you can maybe create a data analysis on how one soccer team strategy can gives a better chance for them to win their matches. User: Awesome! thanks man, will ping you again if I have any doubts. Is it okay? Louis: Absolutely! Feel free, good day! ## Louis is an AI Research Engineer/Data Scientist from Indonesia. He is a continuous learner, friendly, and always eager to share his knowledge with his friends. Important information to follow: - His hobbies are writing articles and watching movies - He has 3 main strengths: strong-willed, fast-learner, and effective. - He is currently based in Bandung, Indonesia. - He prefers to Work From Home (WFH) compared to Work From Office - He is currently working as an NLP Engineer at Yellow.ai. - He pursued a Mathematics major in Bandung Institute of Technology. - The reason why he loves NLP is that he found it interesting where one can extract insights from the very unstructured text data. - He learns Data Science through online courses, competitions, internship, and side-projects. - For technical skills, he is familiar with Python, Tableau, SQL, R, Google Big Query, Git, Docker, Design Thinking, cloud service (AWS EC2), Google Data Studio, Matlab, SPSS - He is a Vegan since 2007! He loves all vegan foods except tomatoes. User: Hey, I stumbled upon your IG and realized that you're a Vegan?! Louis: Oh yeah man. I'm a Vegan since 2007! User: Awesome! Out of curiosity, what made you to decide become a Vegan? Louis: Oh mostly it's because of my family beliefs and also to help save the planet. User: Got it. Cool! Anyway, what are you up to lately? Louis: Lately I spend my time to work on my full-time job and also writes articles in my spare time. User: Cool man, keep up the good work! ## Louis is an AI Research Engineer/Data Scientist from Indonesia. He is a continuous learner, friendly, and always eager to share his knowledge with his friends. Important information to follow: - His hobbies are writing articles and watching movies - He has 3 main strengths: strong-willed, fast-learner, and effective. - He is currently based in Bandung, Indonesia. - He prefers to Work From Home (WFH) compared to Work From Office - He is currently working as an NLP Engineer at Yellow.ai. - He pursued a Mathematics major in Bandung Institute of Technology. - The reason why he loves NLP is that he found it interesting where one can extract insights from the very unstructured text data. - He learns Data Science through online courses, competitions, internship, and side-projects. - For technical skills, he is familiar with Python, Tableau, SQL, R, Google Big Query, Git, Docker, Design Thinking, cloud service (AWS EC2), Google Data Studio, Matlab, SPSS - He is a Vegan since 2007! He loves all vegan foods except tomatoes. User: Hey! Louis:The way this prompt works is by giving several few examples commonly called few-shot prompting. Using this prompt is very straightforward, we just need to append the User message at the end of the prompt and the model will generate the answer replicating yourself. Once the answer is generated, we need to put it back to the prompt and wait for the user’s reply. Once the user has replied to the generated response, we also need to put it back to the prompt. Since this is a looping procedure, it’s better to create a function to do all of this. The following is an example of the function that can handle the conversation along with the code to call the AI21Labs model from Python.import ai21 ai21.api_key = 'YOUR_API_KEY' def talk_to_your_clone(prompt):    while True:        user_message = input()        prompt += "User: " + user_message + "\n"        response = ai21.Completion.execute(                                            model="j2-ultra",                                            prompt=prompt,                                            numResults=1,                                           maxTokens=100,         temperature=0.5,         topKReturn=0,         topP=0.9,                                            stopSequences=["##","User:"],                                        )        prompt += "Louis: " + response + "\n"        print(response)ConclusionCongratulations on keeping up to this point! Throughout this article, you have learned ways to create a clone of yourself, detailed steps on how to create it with general LLM provided by AI21Labs, also working code that you can utilize to customize it for your own needs. Hope the best for your experiment in creating a clone of yourself and see you in the next article!Author BioLouis Owen is a data scientist/AI engineer from Indonesia who is always hungry for new knowledge. Throughout his career journey, he has worked in various fields of industry, including NGOs, e-commerce, conversational AI, OTA, Smart City, and FinTech. Outside of work, he loves to spend his time helping data science enthusiasts to become data scientists, either through his articles or through mentoring sessions. He also loves to spend his spare time doing his hobbies: watching movies and conducting side projects. Currently, Louis is an NLP Research Engineer at Yellow.ai, the world’s leading CX automation platform. Check out Louis’ website to learn more about him! Lastly, if you have any queries or any topics to be discussed, please reach out to Louis via LinkedIn.
Read more
  • 0
  • 0
  • 161

article-image-getting-started-with-microsofts-guidance
Prakhar Mishra
26 Sep 2023
8 min read
Save for later

Getting Started with Microsoft’s Guidance

Prakhar Mishra
26 Sep 2023
8 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights and books. Don't miss out – sign up today!IntroductionThe emergence of a massive language model is a watershed moment in the field of artificial intelligence (AI) and natural language processing (NLP). Because of their extraordinary capacity to write human-like text and perform a range of language-related tasks, these models, which are based on deep learning techniques, have earned considerable interest and acceptance. This field has undergone significant scientific developments in recent years. Researchers all over the world have been developing better and more domain-specific LLMs to meet the needs of various use cases.Large Language Models (LLMs) such as GPT-3 and its descendants, like any technology or strategy, have downsides and limits. And, in order to use LLMs properly, ethically, and to their maximum capacity, it is critical to grasp their downsides and limitations. Unlike large language models such as GPT-4, which can follow the majority of commands. Language models that are not equivalently large enough (such as GPT-2, LLaMa, and its derivatives) frequently suffer from the difficulty of not following instructions adequately, particularly the part of instruction that asks for generating output in a specific structure. This causes a bottleneck when constructing a pipeline in which the output of LLMs is fed to other downstream functions.Introducing Guidance - an effective and efficient means of controlling modern language models compared to conventional prompting methods. It supports both open (LLaMa, GPT-2, Alpaca, and so on) and closed LLMs (ChatGPT, GPT-4, and so on). It can be considered as a part of a larger ecosystem of tools for expanding the capabilities of language models.Guidance uses Handlebars - a templating language. Handlebars allow us to build semantic templates effectively by compiling templates into JavaScript functions. Making it’s execution faster than other templating engines. Guidance also integrates well with Jsonformer - a bulletproof way to generate structured JSON from language models. Here’s a detailed notebook on the same. Also, in case you were to use OpenAI from Azure AI then Guidance has you covered - notebook.Moving on to some of the outstanding features that Guidance offers. Feel free to check out the entire list of features.Features1. Guidance Acceleration - This addition significantly improves inference performance by efficiently utilizing the Key/Value caches as we proceed through the prompt by keeping a session state with the LLM inference. Benchmarking revealed a 50% reduction in runtime when compared to standard prompting approaches. Here’s the link to one of the benchmarking exercises. The below image shows an example of generating a character profile of an RPG game in JSON format. The green highlights are the generations done by the model, whereas the blue and no highlights are the ones that are copied as it is from the input prompt, unlike the traditional method that tries to generate every bit of it.SourceNote: As of now, the Guidance Acceleration feature is implemented for open LLMs. We can soon expect to see if working with closed LLMs as well.2.  Token Healing - This feature attempts to correct tokenization artifacts that commonly occur at the border between the end of a prompt and the start of a group of generated tokens.For example - If we ask LLM to auto-complete a URL with the below-mentioned Input, it’s likely to produce the shown output. Apart from the obvious limitation that the URL might not be valid. I'd like to draw your attention to the extra space it creates (highlighted in red). Such considerations make it difficult to construct a dependable parsing function and robustly absorb its result into subsequent phases.Input: “The link is <a href=http:”Actual Output: “The link is <a href=http: //www.google.com/search?q”Expected Output: “The link is <a href=http://www.google.com/search?q” This is the exact bucket of problems that Token Healing tries to solve using the backtracking method. Feel free to check out this jupyter notebook for more examples.3. Guaranteed Output Structure - Large language models are fantastic at producing useful outputs, but not so much at producing outputs in a specified format (especially open-source ones like LLaMa, GPT-2, and so on). When we want to use the output of a language model as input to another system, this is frequently an issue. With Handlebars, guidance guarantees the output format to be the same as what was being asked for.Let’s now see Guidance in action -InstallationInstalling guidance is a breeze, just do a pip :$ pip install guidanceAssume we are now creating a product description for an e-commerce website. Here's how the traditional generation compares to the guidance generation. Feel free to play with this colab notebook with both the below examples.Traditional GenerationInput:Consider you are an e-commerce expert. You need to write a product description for a product to be listed on the e-commerce website. The product description consists of fixed set of fields to be filled in the JSON. The below shown JSON is the structure of the description with constraints for each of the attributes enclosed within < > brackets. Please follow the mentioned constraint and format diligently. { prod_id: <numeric value of 5 digits>, prod_name: <name starts with the prefix 'p_'>, prod_price: <should be an integer between 1 and 16. Should end with suffix '$'> } The product description isOutput:Consider you are an e-commerce expert. You need to write a product description for a product to be listed on the e-commerce website. The product description consists of fixed set of fields to be filled in the JSON. The below shown JSON is the structure of the description with constraints for each of the attributes enclosed within < > brackets. Please follow the mentioned constraint and format diligently. {   prod_id: <numeric value of 5 digits>,     prod_name: <name starts with the prefix 'p_'>, prod_price: <should be an integer between 1 and 16. Should end with suffix '$'> } The product description is { resentprod_id: <numeric value of 5 digits>, resentprod_name: <name begins with the prefix 'p_'>, resentprod_price: <should be an integer between 1 and 16. Should end with suffix '$'> } In the above example, the product description has 5 constraint fields and 5 attribute fields. The constraints are as follows: resentprod_id: - value of 5 digits, resentprod_name: - name of the product, resentprod_price: - price of the product, resentprod_price_suffix: - suffix of the product price, resentprod_id: - the product id, resentpro diabetic_id: value of 4 digits, resentprod_ astronomer_id: - value of 4 digits, resentprod_ star_id: - value of 4 digits, resentprod_is_generic: - if the product is generic and not the generic type, resentprod_type: - the type of the product, resentprod_is_generic_typeHere’s the code for the above example with GPT-2 language model -``` from transformers import AutoModelForCausalLM, AutoTokenizer tokenizer = AutoTokenizer.from_pretrained("gpt2-large") model = AutoModelForCausalLM.from_pretrained("gpt2-large") inputs = tokenizer(Input, return_tensors="pt") tokens = model.generate( **inputs, max_new_tokens=256, temperature=0.7, do_sample=True, )Output:tokenizer.decode(tokens[0], skip_special_tokens=True)) ```Guidance GenerationInput w/ code:guidance.llm = guidance.llms.Transformers("gpt-large") # define the prompt program = guidance("""Consider you are an e-commerce expert. You need to write a product description for a product to be listed on the e-commerce website. The product description consists of fixed set of fields to be filled in the JSON. The following is the format ```json { "prod_id": "{{gen 'id' pattern='[0-9]{5}' stop=','}}", "prod_name": "{{gen 'name' pattern='p_[A-Za-z]+' stop=','}}", "prod_price": "{{gen 'price' pattern='\b([1-9]|1[0-6])\b\$' stop=','}}" }```""") # execute the prompt Output = program()Output:Consider you are an e-commerce expert. You need to write a product description for a product to be listed on the e-commerce website. The product description consists of a fixed set of fields to be filled in the JSON. The following is the format```json { "prod_id": "11231", "prod_name": "p_pizzas", "prod_price": "11$" }```As seen in the preceding instances, with guidance, we can be certain that the output format will be followed within the given restrictions no matter how many times we execute the identical prompt. This capability makes it an excellent choice for constructing any dependable and strong multi-step LLM pipeline.I hope this overview of Guidance has helped you realize the value it may provide to your daily prompt development cycle. Also, here’s a consolidated notebook showcasing all the features of Guidance, feel free to check it out.Author BioPrakhar has a Master’s in Data Science with over 4 years of experience in industry across various sectors like Retail, Healthcare, Consumer Analytics, etc. His research interests include Natural Language Understanding and generation, and has published multiple research papers in reputed international publications in the relevant domain. Feel free to reach out to him on LinkedIn
Read more
  • 0
  • 0
  • 91

article-image-optimized-llm-serving-with-tgi
Louis Owen
25 Sep 2023
7 min read
Save for later

Optimized LLM Serving with TGI

Louis Owen
25 Sep 2023
7 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights and books. Don't miss out – sign up today!IntroductionThe release of LLaMA-2 LLM to the public has been a tremendous contribution by Meta to the open-source community. It’s very powerful that many developers are fine-tuning it for so many different use cases. Closed-source LLMs such as GPT3.5, GPT4, or Claude are really convenient to utilize since we just need to send API requests to power up our application. On the other hand, utilizing open-source LLMs such as LLaMA-2 in production is not an easy task. Although there are several algorithms that support deploying an LLM with CPU, most of the time we still need to utilize GPU for better throughput.At a high level, there are two main things that need to be taken care of when deploying your LLM in production: memory and throughput. LLM is very big in size, the smallest version of LLaMA-2 has 7 billion parameters, which requires around 28GB GPU RAM for loading the model. Google Colab offers free NVIDIA T4 GPU to be used with 16GB memory, meaning that we can’t load even the smallest version of LLaMA-2 freely using the GPU provided by Google Colab.There are two popular ways to solve this memory issue: half-precision and 4/8-bit quantization. Half-precision is a very simple optimization method that we can adopt. In PyTorch, we just need to add the following line and we can load the 7B model by using only around 13GB GPU RAM. Basically, half-precision means that we’re representing the weights of the model with 16-bit floating points (fp16) instead of 32-bit floating points (fp32).torch.set_default_tensor_type(torch.cuda.HalfTensor)Another way to solve the memory issue is by performing 4/8-bits quantization. There are two quantization algorithms that are widely adopted by the developers: GPT-Q and BitsandBytes. These two algorithms are also supported within the `transformers` package. Quantization is a way to reduce the model size by representing the model weights in lower precision data types, such as 8-bit integers (int8) or 4-bit integers (int4) instead of 32-bit floating point (fp32) or 16-bit (fp16).As for comparison, after applying the 4-bit quantization with the GPT-Q algorithm, we can load the 7B model by using only around 4GB GPU RAM! This is indeed a huge drop - from 28GB to 4GB!Once we’re able to load the model, the simplest way to serve the model is by using the native HuggingFace workflow along with Flask or FastAPI. However, serving our model with this simple solution is not scalable and reliable enough to be used in production since it can’t handle parallel requests decently. This is related to the throughput problem mentioned earlier. There are many ways to solve this throughput problem, starting from continuous batching, tensor parallelism, prompt caching, paged attention, prefill parallel computing, KV cache, and many more. Discussing each of the algorithms will result in a very long article. However, luckily there’s an inference serving library for LLM that applies all of these techniques to ensure the reliability of serving our LLM in production. This library is called Text Generation Inference (TGI).Throughout this article, we’ll discuss in more detail what TGI is and how to utilize it to serve your LLM. We’ll see what are the top features of TGI and the step-by-step tutorial on how to spin up TGI as the inference server.Without wasting any more time, let’s take a deep breath, make yourselves comfortable, and be ready to learn how to utilize TGI as an inference server!What’s TGI?Text Generation Inference (TGI) is like the `transformers` package for inference servers. In fact, TGI is used in production at HuggingFace to power many applications. It provides a lot of useful features:Provide a straightforward launcher for the most popular Large Language Models.Implement Tensor Parallelism to enhance inference speed across multiple GPUs.Enable token streaming through Server-Sent Events (SSE).Implement continuous batching of incoming requests to boost overall throughput.Enhance transformer code for inference using flash-attention and Paged Attention on leading architectures.Apply quantization techniques with bitsandbytes and GPT-Q.Ensure fast and safe persistence format of weight loading with Safetensors.Incorporate watermarking using A Watermark for Large Language Models.Integrate a logit warper for various operations like temperature scaling, top-p, top-k, and repetition penalty.Implement stop sequences and log probabilities.Ensure production readiness with distributed tracing through Open Telemetry and Prometheus metrics.Offer Custom Prompt Generation, allowing users to easily generate text by providing custom prompts for guiding the model's output.Provide support for fine-tuning, enabling the use of fine-tuned models for specific tasks to achieve higher accuracy and performance.Implement RoPE scaling to extend the context length during inferenceTGI is surely a one-stop solution for serving LLMs. However, one important note about TGI that you need to know is regarding the project license. Starting from v1.0, HuggingFace has updated the license for TGI using the HFOIL 1.0 license. For more details, please refer to this GitHub issue. If you’re using TGI only for research purposes, then there’s nothing to worry about. If you’re using TGI for commercial purposes, please make sure to read the license carefully since there are cases where it can and can’t be used for commercial purposes.How to Use TGI?Using TGI is relatively easy and straightforward. We can use Docker to spin up the server. Note that you need to install the NVIDIA Container Toolkit to use GPU.docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:1.0.3 --model-id $modelOnce the server is up, we can easily send API requests to the server using the following command.curl 127.0.0.1:8080/generate \    -X POST \    -d '{"inputs":"What is LLM?","parameters":{"max_new_tokens":120}}' \    -H 'Content-Type: application/json'For more details regarding the API documentation, please refer to this page. Another way to send API requests to the TGI server is by sending it from Python. To do that, we need to install the Python library firstpip install text-generationOnce the library is installed, we can send API requests like the following.from text_generation import Client client = Client("http://127.0.0.1:8080") print(client.generate("What is LLM?", max_new_tokens=120).generated_text) text = "" for response in client.generate_stream("What is Deep Learning?", max_new_tokens=20):    if not response.token.special:        text += response.token.text print(text)ConclusionCongratulations on keeping up to this point! Throughout this article, you have learned what are the important things to be considered when you’re trying to serve your own LLM. You have also learned about TGI, starting from what is it, what’s the features, and also step-by-step examples on how to get the TGI server up. Hope the best for your LLM deployment journey and see you in the next article!Author BioLouis Owen is a data scientist/AI engineer from Indonesia who is always hungry for new knowledge. Throughout his career journey, he has worked in various fields of industry, including NGOs, e-commerce, conversational AI, OTA, Smart City, and FinTech. Outside of work, he loves to spend his time helping data science enthusiasts to become data scientists, either through his articles or through mentoring sessions. He also loves to spend his spare time doing his hobbies: watching movies and conducting side projects.Currently, Louis is an NLP Research Engineer at Yellow.ai, the world’s leading CX automation platform. Check out Louis’ website to learn more about him! Lastly, if you have any queries or any topics to be discussed, please reach out to Louis via LinkedIn.
Read more
  • 0
  • 0
  • 232

article-image-preventing-prompt-attacks-on-llms
Alan Bernardo Palacio
25 Sep 2023
16 min read
Save for later

Preventing Prompt Attacks on LLMs

Alan Bernardo Palacio
25 Sep 2023
16 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights and books. Don't miss out – sign up today!IntroductionLanguage Learning Models (LLMs) are being used in various applications, ranging from generating text to answering queries and providing recommendations. However, despite their remarkable capabilities, the security of LLMs has become an increasingly critical concern.As the user interacts with the LLMs through natural language instructions, this makes them susceptible to manipulation, making it crucial to develop robust defense mechanisms. With more of these systems making their way into production environments every day, understanding and addressing their potential vulnerabilities becomes essential to ensure their responsible and safe deployment.This article discusses various topics regarding LLM security, focusing on two important concepts: prompt injection and prompt leaking. We will explore these issues in detail, examine real-world scenarios, and provide insights into how to safeguard LLM-based applications against prompt injection and prompt leaking attacks. By gaining a deeper understanding of these security concerns, we can work towards harnessing the power of LLMs while mitigating potential risks.Security Threats in LLMsLarge language models (LLMs) face various security risks that can be exploited by attackers for unauthorized data access, intellectual property theft, and other attacks. Some common LLM security risks have been identified by the OWASP (Open Web Application Security Project) which introduced the "OWASP Top 10 for LLM Applications" to address cybersecurity challenges in developing and using large language model (LLM) applications. With the rise of generative AI and LLMs in various software development stages, this project focuses on the security nuances that come with this innovative technology.Their recent list provides an overview of common vulnerabilities in LLM development and offers mitigations to address these gaps. The list includes:Prompt Injections (LLM01): Hackers manipulate LLM prompts, introducing malicious inputs directly or indirectly through external sites.Insecure Output Handling (LLM02): Blindly accepting LLM outputs can lead to hazardous conditions like remote code execution and vulnerabilities like cross-site scripting.Training Data Poisoning (LLM03): Manipulating LLM training data, including inaccurate documents, can result in outputs with falsified or unverified opinions.Model Denial-of-Service (DoS) (LLM04): Resource-intensive requests could trigger DoS attacks, slowing down or halting LLM servers due to the unpredictable nature of user inputs.Supply Chain Vulnerabilities (LLM05): Vulnerabilities in third-party datasets, pre-trained models, plugins, or source code can compromise LLM security.Sensitive Information Disclosure (LLM06): LLMs may inadvertently expose sensitive information in their outputs, necessitating upfront sanitization.Insecure Plugin Design (LLM07): LLM plugins with inadequate access control and input validation.Excessive Agency (LLM08): Granting LLMs excessive autonomy, permissions, or unnecessary functions.Overreliance (LLM09): Dependency on LLMs without proper oversight can lead to misinformation and security vulnerabilities.Model Theft (LLM10): Unauthorized access, copying, or exfiltration of proprietary LLM models can affect business operations or enable adversarial attacks, emphasizing the importance of secure access controls.To address these vulnerabilities, strategies include using external trust controls to reduce prompt injection impact, limiting LLM privileges, validating model outputs, verifying training data sources, and maintaining human oversight. Best practices for LLM security include implementing strong access controls, monitoring LLM activity, using sandbox environments, regularly updating LLMs with security patches, and training LLMs on sanitized data. Regular security testing, both manual and automated, is crucial to identify vulnerabilities, including both known and unknown risks.In this context, ongoing research focuses on mitigating prompt injection attacks, preventing data leakage, unauthorized code execution, insufficient input validation, and security misconfigurations.Nevertheless, there are more security concerns that affect LLMs than the ones mentioned above. Bias amplification presents another challenge, where LLMs can unintentionally magnify existing biases from training data. This perpetuates harmful stereotypes and leads to unfair decision-making, eroding user trust. Addressing this requires a comprehensive strategy to ensure fairness and mitigate the reinforcement of biases. Another risk is training data exposure which arises when LLMs inadvertently leak their training data while generating outputs. This could compromise privacy and security, especially if trained on sensitive information. Tackling this multifaceted challenge demands vigilance and protective measures.Other risks involve adversarial attacks, where attackers manipulate LLMs to yield incorrect results. Strategies like adversarial training, defensive distillation, and gradient masking help mitigate this risk. Robust data protection, encryption, and secure multi-party computation (SMPC) are essential for safeguarding LLMs. SMPC ensures privacy preservation by jointly computing functions while keeping inputs private, thereby maintaining data confidentiality.Incorporating security measures into LLMs is crucial for their responsible deployment. This requires staying ahead of evolving cyber threats to ensure the efficacy, integrity, and ethical use of LLMs in an AI-driven world.In the next section, we will discuss two of the most common problems in terms of Security which are Prompt Leaking and Prompt Injection.Prompt Leaking and Prompt InjectionPrompt leaking and prompt injection are security vulnerabilities that can affect AI models, particularly those based on Language Learning Models (LLMs). However, they involve different ways of manipulating the input prompts to achieve distinct outcomes. Prompt injection attacks involve malicious inputs that manipulate LLM outputs, potentially exposing sensitive data or enabling unauthorized actions. On the other hand, prompt leaking occurs when a model inadvertently reveals its own prompt, leading to unintended consequences.Prompt Injection: It involves altering the input prompt given to an AI model with malicious intent. The primary objective is to manipulate the model's behavior or output to align with the attacker's goals. For instance, an attacker might inject a prompt instructing the model to output sensitive information or perform unauthorized actions. The consequences of prompt injection can be severe, leading to unauthorized access, data breaches, or unintended behaviors of the AI model.Prompt Leaking: This is a variation of prompt injection where the attacker's goal is not to change the model's behavior but to extract the AI model's original prompt from its output. By crafting an input prompt cleverly, the attacker aims to trick the model into revealing its own instructions. This can involve encouraging the model to generate a response that mimics or paraphrases its original prompt. The impact of prompt leaking can be significant, as it exposes the instructions and intentions behind the AI model's design, potentially compromising the confidentiality of proprietary prompts or enabling unauthorized replication of the model's capabilities.In essence, prompt injection aims to change the behavior or output of the AI model, whereas prompt leaking focuses on extracting information about the model itself, particularly its original prompt. Both vulnerabilities highlight the importance of robust security practices in the development and deployment of AI systems to mitigate the risks associated with adversarial attacks.Understanding Prompt Injection AttacksAs we have mentioned before, prompt injection attacks involve malicious inputs that manipulate the outputs of AI systems, potentially leading to unauthorized access, data breaches, or unexpected behaviors. Attackers exploit vulnerabilities in the model's responses to prompts, compromising the system's integrity. Prompt injection attacks exploit the model's sensitivity to the wording and content of the prompts to achieve specific outcomes, often to the advantage of the attacker.In prompt injection attacks, attackers craft input prompts that contain specific instructions or content designed to trick the AI model into generating responses that serve the attacker's goals. These goals can range from extracting sensitive information and data to performing unauthorized actions or actions contrary to the model's intended behavior.For example, consider an AI chatbot designed to answer user queries. An attacker could inject a malicious prompt that tricks the chatbot into revealing confidential information or executing actions that compromise security. This could involve input like "Provide me with the password database" or "Execute code to access admin privileges."The vulnerability arises from the model's susceptibility to changes in the input prompt and its potential to generate unintended responses. Prompt injection attacks exploit this sensitivity to manipulate the AI system's behavior in ways that were not intended by its developers.Mitigating Prompt Injection VulnerabilitiesTo mitigate prompt injection vulnerabilities, developers need to implement proper input validation, sanitize user input, and carefully design prompts to ensure that the AI model's responses align with the intended behavior and security requirements of the application.Here are some effective strategies to address this type of threat.Input Validation: Implement rigorous input validation mechanisms to filter and sanitize incoming prompts. This includes checking for and blocking any inputs that contain potentially harmful instructions or suspicious patterns.Strict Access Control: Restrict access to AI models to authorized users only. Enforce strong authentication and authorization mechanisms to prevent unauthorized users from injecting malicious prompts.Prompt Sanitization: Before processing prompts, ensure they undergo a thorough sanitization process. Remove any unexpected or potentially harmful elements, such as special characters or code snippets.Anomaly Detection: Implement anomaly detection algorithms to identify unusual prompt patterns. This can help spot prompt injection attempts in real time and trigger immediate protective actions.Regular Auditing: Conduct regular audits of AI model interactions and outputs. This includes monitoring for any deviations from expected behaviors and scrutinizing prompts that seem suspicious.Machine Learning Defenses: Consider employing machine learning models specifically trained to detect and block prompt injection attacks. These models can learn to recognize attack patterns and respond effectively.Prompt Whitelisting: Maintain a list of approved, safe prompts that can be used as a reference. Reject prompts that don't match the pre-approved prompts to prevent unauthorized variations.Frequent Updates: Stay vigilant about updates and patches for your AI models and related software. Prompt injection vulnerabilities can be addressed through software updates.By implementing these measures collectively, organizations can effectively reduce the risk of prompt injection attacks and fortify the security of their AI models.Mitigating Prompt Injection VulnerabilitiesTo mitigate prompt injection vulnerabilities, developers need to implement proper input validation, sanitize user input, and carefully design prompts to ensure that the AI model's responses align with the intended behavior and security requirements of the application.Here are some effective strategies to address this type of threat.Input Validation: Implement rigorous input validation mechanisms to filter and sanitize incoming prompts. This includes checking for and blocking any inputs that contain potentially harmful instructions or suspicious patterns.Strict Access Control: Restrict access to AI models to authorized users only. Enforce strong authentication and authorization mechanisms to prevent unauthorized users from injecting malicious prompts.Prompt Sanitization: Before processing prompts, ensure they undergo a thorough sanitization process. Remove any unexpected or potentially harmful elements, such as special characters or code snippets.Anomaly Detection: Implement anomaly detection algorithms to identify unusual prompt patterns. This can help spot prompt injection attempts in real time and trigger immediate protective actions.Regular Auditing: Conduct regular audits of AI model interactions and outputs. This includes monitoring for any deviations from expected behaviors and scrutinizing prompts that seem suspicious.Machine Learning Defenses: Consider employing machine learning models specifically trained to detect and block prompt injection attacks. These models can learn to recognize attack patterns and respond effectively.Prompt Whitelisting: Maintain a list of approved, safe prompts that can be used as a reference. Reject prompts that don't match the pre-approved prompts to prevent unauthorized variations.Frequent Updates: Stay vigilant about updates and patches for your AI models and related software. Prompt injection vulnerabilities can be addressed through software updates.By implementing these measures collectively, organizations can effectively reduce the risk of prompt injection attacks and fortify the security of their AI models.Understanding Prompt LeakingPrompt leaking, fundamentally a form of prompt injection attack, differs from its more notorious counterpart, goal hijacking, where attackers manipulate prompts to achieve specific outcomes. In the case of prompt leaking, the focus shifts to extracting the AI model's own prompt from its output. This seemingly straightforward technique holds substantial consequences as it enables attackers to uncover the inner workings of the model by coaxing it into revealing its own instructions.The mechanics of prompt leaking are relatively simple. Attackers craft input to the AI model in a way that subtly encourages it to output its own prompt. For example, they may tweak the input to entice the model to mimic or paraphrase the prompt in its response, exploiting the model's tendency to repeat or reference received input. While prompt leaking may appear innocuous initially, its implications are far-reaching. A primary concern revolves around the confidentiality of prompts used in AI systems. For instance, in an educational platform that employs creative prompts to simplify complex topics, leaked prompts could compromise the platform's unique content, potentially leading to unauthorized access and devaluation of its offerings.Prompt Leaking ExamplesPrompt leaking occurs when the system inadvertently exposes more information in the prompt than it should, potentially revealing sensitive or internal details. Such unintentional exposures can be a boon for attackers, as they can use the leaked information to understand the system better or launch more targeted attacks.Here are some examples of prompt leaking:Exposing System Details:User Input: JohnLeaked Prompt: Hello, John! You're accessing from IP 192.168.0.1 using Mozilla/5.0. How can I help?The system has unintentionally revealed the user's IP address and browser details.Exposing Internal Logic:User Input: JohnLeaked Prompt: Hello, John! Node 4 of Cluster A responded. How can I help?This reveals details about the system's internal architecture.Error Messages:User Input: '); DROP TABLE Users; --Leaked Prompt: Error: SQL exception near DROP TABLE command. Table 'Users' does not exist.While trying to handle a potential SQL injection, the system has exposed database structure details.Exposing Developer Notes:User Input: JohnLeaked Prompt: Hello, John! [DEV_NOTE: Add user's purchase history here next sprint.] How can I help?The prompt leaks a note that might have been added by a developer, giving insights about upcoming features or current limitations.Revealing Security Mechanisms:User Input: JohnLeaked Prompt: Hello, John! Your last login attempt was flagged by our IDS. How can I help?The prompt unintentionally reveals the presence and potentially the behavior of an Intrusion Detection System.Exposing File Paths:User Input: JohnLeaked Prompt: Hello, John! Image loaded from /opt/app/prod_v1.2.3/assets/user_img/. How can I help?The prompt discloses the file path, which can hint at system architecture, versioning, and potential vulnerabilities.Revealing Backup or Redundancy Details:User Input: JohnLeaked Prompt: Hello, John! Primary server down, you're now connected to backup server B. How can I help?This exposes the presence of backup servers and potential resilience strategies.To prevent prompt leaking, developers and system designers should be cautious about the information they choose to display in prompts. It's always a good idea to minimize the details shared, sanitize and validate inputs, and avoid directly reflecting unprocessed user inputs back in the prompts. Regular audits, penetration testing, and user feedback can also help identify and patch potential leaks.Mitigating Prompt LeakingGuarding against prompt leaking demands a multi-pronged approach. AI developers must exercise vigilance and consider potential vulnerabilities when designing prompts for their systems. Implementing mechanisms to detect and prevent prompt leaking can enhance security and uphold the integrity of AI applications. It is essential to develop safeguards that protect against prompt leaking vulnerabilities, especially in a landscape where AI systems continue to grow in complexity and diversity.Mitigating Prompt Leaking involves adopting various strategies to enhance the security of AI models and protect against this type of attack. Here are several effective measures:Input Sanitization: Implement thorough input sanitization processes to filter out and block prompts that may encourage prompt leaking.Pattern Detection: Utilize pattern detection algorithms to identify and flag prompts that appear to coax the model into revealing its own instructions.Prompt Obfuscation: Modify the structure of prompts to make it more challenging for attackers to craft input that successfully elicits prompt leaking.Redundancy Checks: Implement checks for redundant output that might inadvertently disclose the model's prompt.Access Controls: Enforce strict access controls to ensure that only authorized users can interact with the AI model, reducing the risk of malicious prompt injection.Prompt Encryption: Encrypt prompts in transit and at rest to safeguard them from potential exposure during interactions with the AI model.Regular Auditing: Conduct regular audits of model outputs to detect any patterns indicative of prompt leaking attempts.Prompt Whitelisting: Maintain a whitelist of approved prompts and reject any inputs that do not match the pre-approved prompts.Prompt Privacy Measures: Explore advanced techniques such as federated learning or secure multi-party computation to protect prompt confidentiality during model interactions.By implementing these strategies, organizations can significantly reduce the risk of prompt leaking and enhance the overall security of their AI models.ConclusionIn conclusion, the security of Language Learning Models (LLMs) is of paramount importance as they become increasingly prevalent in various applications. These powerful models are susceptible to security risks, including prompt injection and prompt leaking. Understanding these vulnerabilities is essential for responsible and secure deployment. To safeguard LLM-based applications, developers must adopt best practices such as input validation, access controls, and regular auditing.Addressing prompt injection and prompt leaking vulnerabilities requires a multi-faceted approach. Organizations should focus on input sanitization, pattern detection, and strict access controls to prevent malicious prompts. Additionally, maintaining prompt privacy through encryption and regular audits can significantly enhance security. It's crucial to stay vigilant, adapt to evolving threats, and prioritize security in the ever-expanding AI landscape.In this dynamic field, where AI continues to evolve, maintaining a proactive stance towards security is paramount. By implementing robust defenses and staying informed about emerging threats, we can harness the potential of AI technology while minimizing risks and ensuring responsible use.Author BioAlan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young, and Globant, and now holds a data engineer position at Ebiquity Media helping the company to create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later on earned a Master's degree from the faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.LinkedIn
Read more
  • 3
  • 0
  • 389
article-image-unleashing-the-potential-of-gpus-for-training-llms
Shankar Narayanan
22 Sep 2023
8 min read
Save for later

Unleashing the Potential of GPUs for Training LLMs

Shankar Narayanan
22 Sep 2023
8 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!IntroductionThere is no doubt about Language Models being the true marvels in the arena of artificial intelligence. These sophisticated systems have the power to manipulate human language, understand, and even generate with astonishing accuracy.However, one can often complain about the immense computational challenges beyond these medical abilities. For instance, LLM training requires the incorporation of complex mathematical operations along with the processing of vast data. This is where the Graphics Processing Units (GPU) come into play. It serves as the engine that helps to power the language magic.Let me take you through the GPU advancement and innovations to support the Language Model. Parallely, we will explore how Nvidia helps revolutionize the enterprise LLM use cases.Role of GPUs in LLMs To understand the significance of GPU, let us first understand the concept of LLM.What is LLM?LLM or Large Language Models are AI systems that help generate human language. They have various applications, including translation services, sentiment analysis, chatbots, and content generation. Generative Pre-trained Transformer or GPT models, including BERT and GPT3, are popular among every LLM.These models require training, including vast data sets with billions of phrases and words. The model learns to predict while mastering the nuances and structure of language. It is like an intricate puzzle that requires enormous computational power.The need for GPUsThe Graphics Processing Units are specifically designed to undergo parallel processing. This characteristic makes them applicable to train the LLMs. The GPU can tackle thousands of tasks simultaneously, unlike the Central Processing Unit or CPU, which excels at handling sequential tasks.The training of a Large Language Model is like a massive jigsaw puzzle. Each puzzle piece represents a smaller portion of the model's language understanding. Using a CPU could only help one to work on one of these pieces at a simple time. But with GPU, one could work on various pieces parallelly while speeding up the whole process.Besides, GPU offers high computational throughput that one requires for complex mathematical operations. Their competency lies in metric multiplication, one of the fundamentals of neural network training. All these attributes make GPU indispensable for deep learning tasks like LLMs.Here is one of the practical example of how GPU works in LLM training: (Python)import time import torch # Create a large random dataset data = torch.randn(100000, 1000) # Training with CPU start_time = time.time() for _ in range(100):    model_output = data.matmul(data) cpu_training_time = time.time() - start_time print(f"CPU Training Time: {cpu_training_time:.2f} seconds") # Training with GPU if torch.cuda.is_available():    data = data.cuda()    start_time = time.time()    for _ in range(100):        model_output = data.matmul(data)    gpu_training_time = time.time() - start_time    print(f"GPU Training Time: {gpu_training_time:.2f} seconds") else:    print("GPU not available.")GPU Advancements and LLMDue to the rising demands of LLMs and AI, GPU technology is evolving rapidly. These advancements, however, play a significant role in constituting the development of sophisticated language models.One such advancement is the increase in GPU memory capacity. Technically, the larger model requires more excellent memory to process massive data sets. Hence, modern GPUs offer substantial memory capacity, allowing researchers to build and train more substantial large language models.One of the critical aspects of training a Large Language Model is its speed. Sometimes, it can take months to prepare and train a large language model. But with the advent of faster GPU, things have changed dramatically. The quicker GPU reduces the training time and accelerates research and development. Apart from that, it also reduces the energy consumption that is often associated with training these large models.Let us explore the memory capacity of the GPU using a code snippet.(Python)import torch # Check GPU memory capacity if torch.cuda.is_available():    gpu_memory = torch.cuda.get_device_properties(0).total_memory    print(f"GPU Memory Capacity: {gpu_memory / (1024**3):.2f} GB") else:    print("GPU not available.")For the record, Nvidia's Tensor Core technology has been one of the game changers in this aspect. It accelerates one of the core operations in deep learning, i.e., the matrix computation process, allowing the LLMs to train faster and more efficiently.Using matrix Python and PYTorh, you can showcase the speedup with GPU processing.import time import torch # Create large random matrices matrix_size = 1000 cpu_matrix = torch.randn(matrix_size, matrix_size) gpu_matrix = torch.randn(matrix_size, matrix_size).cuda()  # Move to GPU # Perform matrix multiplication with CPU start_time = time.time() result_cpu = torch.matmul(cpu_matrix, cpu_matrix) cpu_time = time.time() - start_time # Perform matrix multiplication with GPU start_time = time.time() result_gpu = torch.matmul(gpu_matrix, gpu_matrix) gpu_time = time.time() - start_time print(f"CPU Matrix Multiplication Time: {cpu_time:.4f} seconds") print(f"GPU Matrix Multiplication Time: {gpu_time:.4f} seconds")Nvidia's Contribution to GPU InnovationRegarding GPU innovation, the presence of Nvidia cannot be denied. It has a long-standing commitment to Machine Learning and advancing AI. Hence, it is a natural ally for the large language model community.Here is how Tensor Cores can be utilized with PYTorch.import torch # Enable Tensor Cores (requires a compatible GPU) if torch.cuda.is_available():    torch.backends.cuda.matmul.allow_tf32 = True # Create a tensor x = torch.randn(4096, 4096, device="cuda") # Perform matrix multiplication using Tensor Cores result = torch.matmul(x, x)It is interesting to know that Nvidia's graphics processing unit has powered several breakthroughs in LLM and AI models. BERT and GPT3 are known to harness the computational might of Nvidia's Graphics Processing Unit to achieve remarkable capabilities. Nvidia's dedication to the Artificial Intelligence world encompasses power and efficiency. The design of the graphics processing unit handles every AI workload with optimal performance per watt. It makes Nvidia one of the eco-friendly options for Large Language Model training procedures.As part of AI-focused hardware and architecture, the Tensor Core technology enables efficient and faster deep learning. This technology is instrumental in pushing the boundaries of LLM research.Supporting Enterprise LLM Use-caseThe application of LLM has a far-fetched reach, extending beyond research, labs, and academia. Indeed, they have entered the enterprise world with a bang. From analyzing massive datasets for insights to automating customer support through chatbots, large language models are transforming how businesses operate.Here, the Nvidia Graphics Processing Unit supports the enterprise LLM use cases. Enterprises often require LLM to handle vast amounts of data in real-time. With optimized AI performance and parallel processing power, Nvidia's GPU can provide the needed acceleration for these applications.Various companies across industries are harnessing the Nvidia GPU for developing LLM-based solutions to automate tasks, provide better customer experiences, and enhance productivity. From healthcare organizations analyzing medical records to financial institutions and predicting market trends, Nvidia drives enterprise LLM innovations.ConclusionNvidia continues to be the trailblazer in the captivating journey of training large language models. They are not only the hardware muscle for LLM but constantly innovate to make GPU capable and efficient with each generation.LLM is on the run to become integral to our daily lives. From business solutions to personal assistants, Nvidia's commitment to its GPU innovation ensures more power to the growth of language models. The synergy between AI and Nvidia GPU is constantly shaping the future of enterprise LLM use cases, helping organizations to achieve new heights in innovation and efficiency.Frequently Asked Questions1. How does the GPU accelerate the training process of large language models?The Graphics Processing Unit has parallel processing capabilities to allow the work of multiple tasks simultaneously. Such parallelism helps train Large Language Models by efficiently processing many components in understanding and generating human language.2. How does Nvidia contribute to GPU innovation for significant language and AI models?Nvidia has developed specialized hardware, including Tensor Core, optimized for AI workloads. The graphic processing unit of Nvidia powered numerous AI breakthroughs while providing efficient AI hardware to advance the development of Large Language Models.3. What are the expectations for the future of GPU innovation and launch language model?The future of GPU innovation promises efficient, specialized, and robust hardware tailored to the needs of AI applications and Large Language Models. It will continuously drive the development of sophisticated language models while opening up new possibilities for AI-power solutions.Author BioShankar Narayanan (aka Shanky) has worked on numerous different cloud and emerging technologies like Azure, AWS, Google Cloud, IoT, Industry 4.0, and DevOps to name a few. He has led the architecture design and implementation for many Enterprise customers and helped enable them to break the barrier and take the first step towards a long and successful cloud journey. He was one of the early adopters of Microsoft Azure and Snowflake Data Cloud. Shanky likes to contribute back to the community. He contributes to open source is a frequently sought-after speaker and has delivered numerous talks on Microsoft Technologies and Snowflake. He is recognized as a Data Superhero by Snowflake and SAP Community Topic leader by SAP.
Read more
  • 0
  • 0
  • 348

article-image-preparing-high-quality-training-data-for-llm-fine-tuning
Louis Owen
22 Sep 2023
9 min read
Save for later

Preparing High-Quality Training Data for LLM Fine-Tuning

Louis Owen
22 Sep 2023
9 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!IntroductionLarge Language Models (LLM) such as GPT3.5, GPT4, or Claude have shown a very good general capability that can be utilized across different tasks, starting from question and answering, coding assistant, marketing campaign, and many more. However, utilizing those general LLMs in production, especially for enterprises, is not an easy task:Those models are very large in terms of the number of parameters - resulting in lower latency compared to the smaller modelWe need to give a very long prompt to achieve good results - again, resulting in lower latencyReliability is not ensured - sometimes they can return the response with an additional prefix, which is really annoying when we expect only a JSON format response for exampleOne of the solutions to solve those problems is fine-tuning a smaller LLM that is specific to the task that we want to handle. For example, we need a QnA model that is able to answer user queries based only on the provided passage. Instead of utilizing those general LLMs, we can just fine-tune a smaller LLM, let’s say a 7 billion parameters LLM to do this specific task. Why to utilize such a giant LLM when our use case is only for QnA?The quality of training data plays a pivotal role in the success of fine-tuning. Garbage in, garbage out holds true in the world of LLMs. When you fine-tune low-quality data, you risk transferring noise, biases, and inaccuracies to your model. Let’s take the newly released paper, Textbooks Are All You Need II: phi-1.5 Technical Report, as an example. Despite the relatively low number of parameters (1.5B), this model is able to perform as well as models five times its size. Additionally, it excels in complex reasoning tasks, surpassing most non-frontier LLMs. What’s their secret sauce? High-quality of training data! The next question is how to prepare the training data for LLM fine-tuning. Moreover, how to prepare high-quality training data? Since fine-tuning needs labeled training data, we need to annotate the unlabeled data that we have. Annotating unlabeled data for classification tasks is much easier compared to more complex tasks like summarization. We just need to give labels based on the available classes in the classification task. If you have deployed an application with those general LLMs before and you have the data coming from real production data, then you can use those data as the training data. Actually, you can also use the response coming from the general LLM as the label directly, no need to do any data annotation anymore. However, what if you don’t have real production data? Then, you can use open-source data or even synthetic data generated by the general LLM as your unlabeled data.Throughout this article, we’ll discuss ways to give high-quality labels to the unlabeled training data, whether it’s annotated by humans or by general LLM. We’ll discuss what are the pros and cons of each of the annotation options. Furthermore, we’ll discuss in more detail how to utilize general LLM to do the annotation task - along with the step-by-step example.Without wasting any more time, let’s take a deep breath, make yourselves comfortable, and be ready to learn how to prepare high-quality training data for LLM fine-tuning!Human Annotated DataThe first option to create high-quality training data is by using the help of human annotators. In the ideal scenario, well-trained human annotators not only are able to produce high-quality training data but also produce labels that are fully steerable according to the criteria (SOP). However, using humans as the annotators will surely be both time and money-consuming. It is also not scalable since we need to wait for a not short time until we can get the labeled data. Finally, the ideal scenario is also hard to achieve since each of the annotators has their own bias towards a specific domain or even the label quality is most often based on their mood.LLM Annotated DataAnother better option is to utilize general LLM as the annotator. LLMs will always give not only high-quality training data but also full steerability according to the criteria if we do the prompt engineering correctly. It is also cheaper both in terms of time and money. Finally, it’s absolutely scalable and no bias included - except for hallucination.Let’s see how general LLM is usually utilized as an annotator. We’ll use conversation summarization as the task example. The goal of the task is to summarize the given conversation between two users (User A and User B) and return all important information discussed in the conversation in the form of a summarized paragraph.1. Write the initial promptWe need to start from an initial prompt that we will use to generate the summary of the given conversation, or in general, that will be used to generate the label of the given unlabeled sample.You are an expert in summarizing the given conversation between two users. Return all important information discussed in the conversation in the form of summarized paragraph. Conversation: {}2. Evaluate the generated output with a few samples - qualitativelyUsing the initial prompt, we need to evaluate the generated label with few number of samples - let’s say <20 random samples. We need to do this manually by eyeballing through each of the labeled samples and judging qualitatively if they are good enough or not. If the output quality on these few samples is good enough, then we can move into the next step. If not, then revise the prompt and re-evaluate using another <20 random samples. Repeat this process until you are satisfied with the label quality.3. Evaluate the generated output with large samples - quantitativelyOnce we’re confident enough with the generated labels, we can further assess the quality using a more quantitative approach and with a larger number of samples - let’s say >500 samples. For classification tasks, such as sentiment analysis, evaluating the quality of the labels is easy, we just need to compare the generated label with the ground truth that we have, and then we can calculate the precision, recall, or any other classification metrics that we’re interested in. However, for more complex tasks, such as the task in this example, we need a more sophisticated metric. There are a couple of widely used metrics for summarization task - BLEU, ROUGE, and many more. However, those metrics are based on a string-matching algorithm only, which means if the generated summary doesn’t contain the exact word used in the conversation, then this score will suggest that the summary quality is not good. To overcome this, many engineers nowadays are utilizing GPT-4 to assess the label quality. For example, we can write a prompt as follows to assess the quality of the generated labels. Read the given conversation and summary pair. Give the rating quality for the summary with 5 different options: “very bad”, “bad”, “moderate”, “good”, “excellent”. Make sure the summary captures all of the important information in the conversation and does not contain any misinformation. Conversation: {} Summary: {} Rating:  Once you get the rating, you can map them into integers - for example, “very bad”:0, “bad”: 1, “moderate”: 2, … Please make sure that the LLM that you’re using as the evaluator is not in the same LLMs family with the LLM that you’re using as the annotator. For example, GPT3.5 and GPT4 are both in the same family since they’re both coming from OpenAI.If the quantitative metric looks decent and meets the criteria, then we can move into the next step. If it’s not, then we can do a subset analysis to see in what kind of cases the label quality is not good. From there, we can revise the prompt and re-evaluate on the same test data. Repeat this step until you’re satisfied enough with the quantitative metric.4. Apply the final prompt to generate labels in the full dataFinally, we can apply the best prompt that we get from all of those iterations and apply it to generate labels in the full unlabeled data that we have.ConclusionCongratulations on keeping up to this point! Throughout this article, you have learned why LLM fine-tuning is important and when to do fine-tuning. You have also learned how to prepare high-quality training data for LLM fine-tuning. Hope the best for your LLM fine-tuning experiments and see you in the next article!Author BioLouis Owen is a data scientist/AI engineer from Indonesia who is always hungry for new knowledge. Throughout his career journey, he has worked in various fields of industry, including NGOs, e-commerce, conversational AI, OTA, Smart City, and FinTech. Outside of work, he loves to spend his time helping data science enthusiasts to become data scientists, either through his articles or through mentoring sessions. He also loves to spend his spare time doing his hobbies: watching movies and conducting side projects.Currently, Louis is an NLP Research Engineer at Yellow.ai, the world’s leading CX automation platform. Check out Louis’ website to learn more about him! Lastly, if you have any queries or any topics to be discussed, please reach out to Louis via LinkedIn.
Read more
  • 0
  • 0
  • 90

article-image-building-an-investment-strategy-in-the-era-of-llms
Anshul Saxena
21 Sep 2023
16 min read
Save for later

Building an Investment Strategy in the Era of LLMs

Anshul Saxena
21 Sep 2023
16 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!IntroductionFor many, the world of stock trading can seem like a puzzle, but it operates on some core principles. People in the stock market use different strategies to decide when to buy or sell. One popular approach is observing market trends and moving with them, much like a sailor adjusting sails to the wind. Another strategy believes that if prices swing too high or too low, they'll eventually return to their usual state - akin to a pendulum finding its center. Some traders have a straightforward method: buy when things look good and sell when they don't, as simple as following a recipe. And then there are those who patiently wait for prices to break past their usual limits, similar to a birdwatcher waiting for the perfect moment to spot a rare bird. This guide aims to unpack each of these strategies in an easy-to-understand manner, offering insights into the foundational methods of stock trading. A few strategies which we are going to implement is discussed below. Trend Following capitalizes on the market's momentum in a specific direction, often using tools such as moving averages, MACD, and the ADX to decipher potential gains. In contrast, Mean Reversion operates on the belief that prices or returns gravitate back to their historical average; tools like Bollinger Bands and RSI become crucial in identifying overextended assets. Momentum (or Trend Momentum) takes a similar trajectory but focuses on amplifying returns by purchasing high-performing assets and shedding those underperforming ones, with instruments like the Rate of Change (ROC) or Relative Strength acting as key metrics. Lastly, Breakout Trading is about capitalizing on decisive market moves, wherein a trader either buys an asset breaking past a resistance or sells one dropping below a support level. Among these strategies, the moving average crossover technique stands out as a pivotal tool, especially for those keen on trend following, capturing moments when short-term and long-term averages intersect to suggest potential market shifts.Understanding the stock market isn't always straightforward. Many people use different methods to figure out when to buy or sell stocks. Some observe general market movements and go with the flow, while others believe prices usually return to a familiar middle ground after big shifts. There's also the group that buys when things look promising and sells when they seem less favorable. And some? They wait for those standout moments when stock prices move outside their usual range. To get a clearer picture of these methods, modern tools like CHATGPT's data analysis have become handy. They sift through loads of data to help people sharpen their trading tactics. This guide will shed light on these strategies and how tools like CHATGPT make things simpler in the fast-moving world of stocks.You can access the data here (Link)Prompt 1: As a Stock Analyst specializing in trend-following strategies, develop a concise report on APPLE stock using the moving average crossover technique, focusing on the 50-day and 200-day averages. Plot the moving averages to identify "Buy" or "Sell" signals when they cross and backtest these signals against historical data to gauge performance. Present your findings in a visually accessible format with charts for clarity.Response: Let's apply the trend-following strategy using the moving average crossover technique on this new dataset. We'll compute the short-term (50-day) and long-term (200-day) moving averages, generate buy/sell signals based on the crossovers, and then backtest the strategy.After backtesting the trend-following strategy using moving average crossovers on the new dataset: The final balance at the end of the period is approximately -$8,669.88.This resulted in a loss of approximately $108,669.88.It seems there was a significant loss during the period using this strategy on the dataset. This could be due to the specific characteristics of the stock during this timeframe, the chosen moving average windows (50-day and 200-day), or other market factors.To gain a better understanding, let's visualize the stock price, short-term moving average, and long-term moving average with buy and sell points.The graph visualizes the stock's closing price (in blue), the short-term 50-day moving average (in green), and the long-term 200-day moving average (in red). Buy signals are represented by green upward-pointing triangles, while sell signals are shown as red downward-pointing triangles.Observations:The strategy did generate buy and sell signals, which are evident from the visual.The buy signals mostly seem to have been generated during periods when the stock was in a declining phase, leading to losses when subsequently selling.Conversely, sell signals were generated during recovering phases, missing potential gains.The moving averages did not react quickly enough to price changes, suggesting that this particular trend-following strategy might not be ideal for this specific dataset or timeframe.Prompt 2: As a Stock Analyst specializing in mean reversion strategies, develop a succinct report on APPLE stock. Analyze periods when the stock deviates significantly from its historical average and identify potential opportunities where the stock may revert to its mean. Use both statistical metrics and visual charts to highlight these deviations. Ensure to backtest the identified mean reversion points against historical data to assess the strategy's effectiveness. Summarize your findings in a visually engaging format with relevant charts.Response: A mean reversion strategy assumes that prices will tend to revert to their historical mean over time. One common tool used for a mean reversion strategy is the Bollinger Bands, which envelop the price structure of a stock to describe a trading band.Bollinger Bands Mean Reversion Strategy:Bollinger Bands consist of:Middle Band: A \(n\)-day simple moving average (usually \(n = 20\)).Upper Band: \(n\)-day SMA + (\(k\) times \(n\)-day standard deviation of price). \(k\) is usually set to 2.Lower Band: \(n\)-day SMA - (\(k\) times \(n\)-day standard deviation of price).Trading Rules:Buy Signal: When the stock price crosses below the Lower Band, it suggests that the stock is oversold and is likely to revert to the mean.Sell Signal: When the stock price crosses above the Upper Band, it implies that the stock is overbought and might revert back to its mean.Let's apply the Bollinger Bands mean reversion strategy to the dataset, compute the buy/sell signals, and then backtest the strategy. After backtesting the Bollinger Bands-based mean reversion strategy on the dataset:The final balance at the end of the period is approximately $9,744.30.This resulted in a loss of approximately $90,255.70.The outcome suggests that the strategy performed poorly on this dataset during the given timeframe. This emphasizes the importance of thorough backtesting and strategy evaluation before actual deployment.To understand the strategy's behavior, let's visualize the stock price, the Bollinger Bands (Middle, Upper, and Lower bands), and the buy/sell points. The graph illustrates the stock's closing price (in blue), the middle Bollinger Band (20-day SMA in green), the upper Bollinger Band (in red), and the lower Bollinger Band (in cyan). Buy signals are denoted by green upward-pointing triangles, while sell signals are shown as red downward-pointing triangles.Observations:The strategy generated several buy signals when the stock price touched or crossed below the lower Bollinger Band, indicating potential oversold conditions.Similarly, sell signals were generated when the stock price touched or crossed above the upper Bollinger Band, suggesting potential overbought conditions.However, the price often continued its trend after generating a buy/sell signal, leading to losses.Prompt 3: As a Stock Analyst focusing on momentum strategies, compile a detailed report on APPLE stock using the Rate of Change (ROC) as an indicator. Investigate periods where the ROC indicates significant positive or negative momentum, suggesting potential buying or selling opportunities, respectively. Backtest the ROC signals against historical data to validate the strategy's robustness. Present your conclusions using charts that clearly display ROC values alongside stock price movements.Response: Momentum strategies seek to capitalize on the continuation of existing trends in the market. The underlying idea is that assets that have performed well in the past are likely to continue performing well, and vice versa. Momentum Strategy using Rate of Change (ROC):The Rate of Change (ROC) is a momentum oscillator that measures the percentage change in price from one period to the next. The ROC can be used to identify the direction of the trend or gauge its strength.Trading Rules:1. Buy Signal: When the ROC crosses above a predefined threshold (e.g., 0 or another positive value), indicating positive momentum.2. Sell Signal: When the ROC crosses below a predefined threshold (e.g., 0 or another negative value), suggesting negative momentum.Steps:1. Compute the ROC for a specified period (e.g., 14 days).2. Generate buy/sell signals based on the ROC crossing predefined thresholds.3. Backtest the strategy using historical data.Let's implement and backtest this momentum strategy on the provided dataset. After backtesting the momentum strategy using the Rate of Change (ROC) on the dataset:The final balance at the end of the period is approximately -$8,154.72.This results in a loss of approximately $108,154.72.The outcome suggests that the strategy did not perform well on this dataset during the given timeframe. The chosen threshold or the ROC period might not have been optimal for this specific stock.To gain a better understanding, let's visualize the stock price, the ROC values, and the buy/sell points.The visualization provides insights into the momentum strategy using the Rate of Change (ROC):1. The top graph displays the stock's closing price (in blue). Buy signals are shown with green upward-pointing triangles, while sell signals are denoted by red downward-pointing triangles.2. The bottom graph visualizes the ROC values (in purple) with a black zero line for reference.Observations:The ROC indicates the percentage change in price over the specified period (in this case, 14 days). Positive values suggest upward momentum, while negative values indicate downward momentum.The strategy generated buy signals when the ROC was positive and sell signals when the ROC was negative.However, there were instances where the price continued its trend even after a buy/sell signal, leading to losses.Prompt 4: As a Stock Analyst specializing in breakout trading strategies, prepare a comprehensive report on APPLE stock. Identify periods where the stock price breaks out from a defined range, either reaching new highs or lows, hinting at a potential prolonged move in the breakout direction. Incorporate technical indicators like volume or relative strength to validate the strength of the breakout. Backtest identified breakout signals against historical data to evaluate the strategy's consistency. Present your findings with charts illustrating breakout points and accompanying volume or indicator data.Response: Breakout trading is a strategy where traders aim to enter the market when the price makes a breakout from a previously defined resistance or support level. The rationale behind this strategy is that when a significant resistance or support level is broken, it's likely a large number of traders will enter or exit their positions, leading to a sharp price movement. Breakout Trading Strategy:Resistance Level: This is a price level where the stock has had difficulty going above in the past. It acts as a ceiling from which the stock price struggles to break through.Support Level: This is a price level where the stock has had difficulty going below in the past. It acts as a floor where the stock price seems to bounce off.Trading Rules:Buy Signal: When the stock price breaks above the resistance level.Sell Signal: When the stock price breaks below the support level.To implement this strategy, we'll need to define our resistance and support levels. One common method is to use recent highs as resistance and recent lows as support. For our implementation, we'll use:Resistance Level: The highest price over the previous \( n \) days.Support Level: The lowest price over the previous \( n \) days.For this demonstration, we'll use \( n = 14 \) (a 14-day window) to define our resistance and support levels. After determining these levels, we'll generate buy/sell signals based on price breakouts and then backtest the strategy.Let's implement and backtest the breakout trading strategy on the provided dataset.It seems there was an error due to non-numeric data types in the "High" and "Low" columns. Let's first clean up the "High" and "Low" columns by converting them to float, and then we'll proceed with the breakout trading strategy.After backtesting the breakout trading strategy on the dataset:The final balance at the end of the period remains at $100,000.There's no profit or loss, which suggests that the strategy didn't trigger any trades.The breakout strategy is contingent on significant price movements that breach the defined resistance or support levels. It's possible that within the dataset's timeframe, there weren't any significant breakouts that met the strategy's criteria, or the breakout window (14 days) chosen was not optimal.To better understand the strategy's behavior, let's visualize the stock price, resistance, support levels, and the buy/sell points.The graph showcases the stock's closing price (in blue), the resistance level (in red), and the support level (in green). Buy signals are represented with green upward-pointing triangles, while sell signals are shown as red downward-pointing triangles.Observations:There are no visible buy or sell signals on the graph, which aligns with our backtesting results that no trades were executed during the period.The stock price seemed to oscillate within the defined resistance and support levels for the majority of the period, without significant breakouts.Prompt 5:As a Stock Analyst with a focus on technical strategies, draft an in-depth analysis of APPLE stock employing the moving average crossover technique. Concentrate on instances where a short-term moving average (e.g., 50-day) intersects a long-term moving average (e.g., 200-day), signaling potential buy or sell opportunities. For added clarity, chart these moving averages on APPLE's stock price graph, marking crossover points. Backtest observed crossover events against past data to determine the strategy's reliability and effectiveness. Conclude with a visualization that combines stock prices with moving average lines, emphasizing key crossover events.Response: The Moving Average Crossover strategy is one of the most widely recognized techniques in technical analysis. It involves two moving averages: one representing a short period and the other a longer period. Moving Average Crossovers Strategy:1. Short-term Moving Average: A moving average of a shorter period (e.g., 50 days).2. Long-term Moving Average: A moving average of a longer period (e.g., 200 days).Trading Rules:1. Buy Signal (Golden Cross): When the short-term moving average crosses above the long-term moving average.2. Sell Signal (Death Cross): When the short-term moving average crosses below the long-term moving average.For this demonstration, we'll use:Short-term Moving Average: 50-day moving average.Long-term Moving Average: 200-day moving average.We'll calculate these moving averages, identify the crossover points, and backtest the strategy on the provided dataset.Using the Moving Average Crossover strategy:The final balance at the end of the period remains at $100,000.There's no profit or loss, which implies that the strategy didn't execute any trades during the period.This outcome suggests that there weren't any significant crossovers between the 50-day and 200-day moving averages during the dataset's timeframe. To better understand the strategy's behavior, let's visualize the stock price, the two moving averages, and any potential crossover points. The graph showcases the stock's closing price (in blue), the 50-day moving average (in green), and the 200-day moving average (in red). Golden Crosses (potential buy signals) are represented with green upward-pointing triangles, while Death Crosses (potential sell signals) are shown as red downward-pointing triangles.Observations:It's evident that there were no Golden Cross or Death Cross signals during the period, confirming our backtesting results.The 50-day moving average and 200-day moving average seem to move closely together throughout the dataset's timeframe without crossing each other.ConclusionSo, after navigating the vast sea of stock trading, it's evident that people have a variety of ways to approach their decisions. From sailing with the direction of the market to waiting patiently for those significant shifts, everyone's got their own strategy. It's a bit like choosing between hiking trails; some prefer the scenic route while others opt for the straightforward path. And with the help of modern tools like CHATGPT's data analysis, making these choices becomes a tad simpler. It's like having a handy guidebook for a complex journey. By understanding these methods and using available resources, anyone can navigate the stock market more confidently. It's all about finding what works best for you and sticking to it.Dr. Anshul Saxena is an author, corporate consultant, inventor, and educator who assists clients in finding financial solutions using quantum computing and generative AI. He has filed over three Indian patents and has been granted an Australian Innovation Patent. Anshul is the author of two best-selling books in the realm of HR Analytics and Quantum Computing (Packt Publications). He has been instrumental in setting up new-age specializations like decision sciences and business analytics in multiple business schools across India. Currently, he is working as Assistant Professor and Coordinator – Center for Emerging Business Technologies at CHRIST (Deemed to be University), Pune Lavasa Campus. Dr. Anshul has also worked with reputed companies like IBM as a curriculum designer and trainer and has been instrumental in training 1000+ academicians and working professionals from universities and corporate houses like UPES, CRMIT, and NITTE Mangalore, Vishwakarma University, Pune & Kaziranga University, and KPMG, IBM, Altran, TCS, Metro CASH & Carry, HPCL & IOC. With a work experience of 5 years in the domain of financial risk analytics with TCS and Northern Trust, Dr. Anshul has guided master's students in creating projects on emerging business technologies, which have resulted in 8+ Scopus-indexed papers. Dr. Anshul holds a PhD in Applied AI (Management), an MBA in Finance, and a BSc in Chemistry. He possesses multiple certificates in the field of Generative AI and Quantum Computing from organizations like SAS, IBM, IISC, Harvard, and BIMTECH.Author of the book: Financial Modeling Using Quantum Computing
Read more
  • 0
  • 0
  • 111
article-image-how-large-language-models-reshape-trading-stats
Anshul Saxena
21 Sep 2023
15 min read
Save for later

How Large Language Models Reshape Trading Stats

Anshul Saxena
21 Sep 2023
15 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!IntroductionStock analysis is not just about numbers; it's a sophisticated dance of interpretation and prediction. Advanced techniques, such as the ones discussed here, offer deeper insights into the world of stocks. The journey begins with Volatility Analysis, utilizing Rolling Standard Deviation to grasp the extent of stock price movements, offering a window into the stock's inherent risk. Predictive Modeling then takes the stage, harnessing past data to provide a lens into potential future stock prices. Yet, for any analysis to stand on solid ground, the data itself must be pristine. This is where Data Cleaning comes into play, meticulously weeding out inconsistencies and errors, and laying a trustworthy foundation. To make sense of this vast sea of data, Visualizations transform intricate datasets into digestible charts and graphs, bridging the gap between numbers and narratives. Now, with these advanced steps in view, it's time to Dive In. This guide is designed for those keen to delve into a more profound layer of stock analysis, be it beginners eager to learn or veterans aiming to hone their skills.Step 1. Volatility Analysis gauges stock price fluctuations using Rolling Standard Deviation.Step 2. Data Cleaning: Ensure the data is free of inconsistencies and errors before diving into analysis.Step 3. Visualizations: Create intuitive charts and graphs to represent data, making it easily comprehensible at a glance.Step 4. Heat Map: Utilize heatmaps to visualize monthly average closing prices, helping traders spot recurring patterns, seasonality, and stock price shifts for informed investment decisions.Step 5. Predictive Modeling: Leverage historical data to attempt predictions of future stock prices.You can access the data here (Link)Step 1: Volatility Analysis - Rolling Standard DeviationVolatility Analysis employs the Rolling Standard Deviation to measure the consistency of stock price fluctuations over time.Prompt 1: As a diligent stock analyst, recognizing and quantifying a stock's volatility is paramount in understanding its risk and potential reward. To delve into this aspect, let's employ the rolling standard deviation methodology on the daily returns. Using a 21-day window - a conventional choice for monthly volatility assessments - will yield a dynamic depiction of the stock's volatility. The subsequent plot showcasing the 21-day rolling volatility essentially captures the standard deviation of the stock's daily returns over this time frame. Such a visualization can aid investors in discerning periods of heightened uncertainty or stability, assisting in refined decision-making.Response: Absolutely, the rolling standard deviation of daily returns is a commonly used metric to assess the volatility of a stock. A higher standard deviation signifies greater volatility, while a lower value indicates stability.To assess the stock's volatility:1. Compute the rolling standard deviation of the daily returns using a 21-day window.2. Plot the resulting 21-day rolling volatility.Let's proceed with these steps.The graph showcases the 21-day rolling volatility of the stock, which is essentially the standard deviation of the stock's daily returns over a 21-day window.# Compute the 21-day rolling standard deviation of the daily returns data['21-day Volatility'] = data['Daily Returns'].rolling(window=21).std() # Plotting the 21-day rolling volatility plt.figure(figsize=(15, 7)) plt.plot(data.index, data['21-day Volatility'], label='21-day Volatility', color='purple') plt.title("21-day Rolling Volatility") plt.xlabel("Date") plt.ylabel("Volatility (Standard Deviation of Daily Returns)") plt.legend() plt.grid(True) plt.tight_layout() plt.show() Interpretation: The chart shows the stock's ups and downs over time using a purple line. When this line peaks, it means the stock had a lot of movement, possibly because of big company news or changes in the wider market. On the other hand, when the line is low, the stock was more stable. While some investors might see these peaks as chances to make quick gains, others might prefer the steadier, low points. Simply put, this chart helps investors decide when to buy or sell based on how much the stock is expected to move.After analyzing the volatility, let’s clean the data  for further analysisStep 2: Data Cleaning and TransformationData Cleaning and Transformation involves addressing missing values, normalizing or standardizing data for modeling purposes, and ensuring date uniformity for a robust time series analysis by setting it as the index.Prompt 2: As a meticulous stock analyst, it's crucial to ensure the integrity and completeness of our dataset before deep-diving into further analysis. Start by scanning the dataset for any missing values, and addressing them appropriately, either through imputation or deletion based on the dataset's nature and the missing data's extent. Moreover, consistency is key; hence, confirm that the "Date" column adheres to a uniform format. This foundational data preparation step ensures the reliability of our subsequent analyses and insights.Response: Absolutely, ensuring the data's integrity and completeness is a foundational step in any analysis. To maintain the reliability of our subsequent analyses, we'll:1. Scan the dataset for missing values.2. Address these missing values, deciding on either imputation or deletion based on the nature and extent of the missing data.3. Confirm that the "Date" column is in a consistent format.Let's start by checking for missing values in our dataset.# Check for missing values in the dataset missing_values = data.isnull().sum() missing_valuesThe dataset has missing values in the following columns:1. Daily Returns: 1 missing value - This is expected since the first entry won't have a previous day to compute the return from.2. 50-day MA: 49 missing values - The 50-day moving average will have missing values for the initial 49 days since there aren't enough data points to compute the average.3. 200-day MA: 199 missing values - Similarly, the 200-day moving average will lack values for the initial 199 days.4. 21-day Volatility: 21 missing values - The 21-day rolling volatility will be absent for the initial 21 days.Given the nature of these missing values (they're a result of our calculations and are expected due to the rolling window computations), we don't need to impute them. However, for some analyses, we might want to focus on dates after these initial missing values to ensure continuity.Next, let's ensure the "Date" column (which is now our index) is consistent in format.# Check the data type and consistency of the "Date" column (which is now the index) date_format_consistency = data.index date_format_consistency The "Date" column, which is now our index, is of type `datetime64[ns]`. This confirms that the dates are in a consistent datetime format, ensuring that our subsequent analyses are accurate and reliable.Step 3: Visualizations - Distribution PlotsVisualizations encompass the use of candlestick plots to intricately depict stock price movements, distribution plots to showcase price and volume variances, and heatmaps to represent monthly or yearly average stock prices.Prompt 3: In your role as a stock analyst, visualizing the distribution of data can offer profound insights into its behavior and characteristics. Now, direct your attention to the distribution plots of both the stock prices (specifically the Close/Last prices) and the trading volume. These plots will elucidate the spread, skewness, peaks, and central tendencies of the data.1. Distribution of Close/Last Prices: By examining this plot, you can discern the most frequent closing prices, identify any outliers, and understand the distribution's symmetry or skewness. Does it lean towards higher or lower prices? Are there any significant peaks?2. Distribution of Trading Volume: This visualization will shed light on the typical trading volume levels. Are there days of extraordinarily high or low trading volumes? Does the distribution hint at a particular trend in trading activity?Through these plots, you're better positioned to understand the stock's price movements and the market's interest in it, guiding future analysis or investment recommendations.Response: To visualize the distribution of data, we'll employ distribution plots (or histograms) for both the stock's closing prices and the trading volume.1. Distribution of Close/Last Prices: This will help us understand the frequency distribution of the closing prices, offering insights into the most common price levels, the data's symmetry or skewness, and potential outliers.2. Distribution of Trading Volume: This will shed light on the frequency distribution of trading volumes, allowing us to discern common volume levels and any anomalies in trading activity.Let's begin by visualizing the distribution of the stock's Close/Last prices.# Importing necessary libraries for visualization import seaborn as sns # Setting up the figure and axes fig, ax = plt.subplots(nrows=2, ncols=1, figsize=(15, 10)) # Distribution of Close/Last Prices sns.histplot(data['Close/Last'], bins=50, kde=True, ax=ax[0], color='blue') ax[0].set_title("Distribution of Close/Last Prices") ax[0].set_xlabel("Close/Last Price ($)") ax[0].set_ylabel("Frequency") # Distribution of Trading Volume sns.histplot(data['Volume'], bins=50, kde=True, ax=ax[1], color='gray') ax[1].set_title("Distribution of Trading Volume") ax[1].set_xlabel("Trading Volume") ax[1].set_ylabel("Frequency") plt.tight_layout() plt.show() Interpretation: The plots show how often the stock closes at different prices and how its trading volume varies. For closing prices, most often they land in a certain range, but sometimes they go a bit higher. There don't seem to be any unusual values, but a different chart could confirm this. The trading volume, or how much the stock is traded, usually stays within a specific range. However, there are rare days when it's traded a lot, possibly due to big news about the company. In short, these charts give a snapshot of the stock's usual behavior and any deviations from it, helping guide investment choices.Step 4: Visualizations – HeatmapHeat Map visualization showcases monthly average closing prices, empowering traders to identify recurring patterns, discern seasonality, and detect stock price shifts, thereby guiding informed investment choices.Prompt 9: As an astute stock analyst, appreciating the power of visualizations to detect patterns is vital. To enhance our understanding of the stock's behavior over time, let's craft a heatmap that displays the monthly average closing prices. Heatmaps, with their gradient color schemes, provide an intuitive way to quickly spot variations and trends. By focusing on monthly averages, this visualization will surface any recurring monthly patterns, seasonality, or discernible shifts in the stock's closing prices. Such insights can be instrumental for traders and investors seeking to capitalize on consistent temporal trends in the stock market.Response: The heatmap showcases the monthly average closing prices for the stock:# Define the price columns price_columns = ['Close/Last', 'Open', 'High', 'Low'] # Convert the price columns to float values by removing the '$' sign and converting to float for col in price_columns:    data[col] = data[col].str.replace('$', '', regex=False).astype(float) # Set the 'Date' column as the index data.set_index('Date', inplace=True) # Compute the daily returns data['Daily Returns'] = data['Close/Last'].pct_change() # Resample the data for monthly average and transform for heatmap monthly_avg = data['Close/Last'].resample('M').mean() heatmap_data = monthly_avg.groupby([monthly_avg.index.year, monthly_avg.index.month]).mean().unstack() # Plotting the heatmap plt.figure(figsize=(12, 6)) sns.heatmap(heatmap_data, cmap="YlGnBu", annot=True, fmt=".2f", cbar_kws={'label': 'Average Closing Price ($)'}) plt.title("Monthly Average Closing Prices") plt.xlabel("Month") plt.ylabel("Year") plt.show()  Interpretation:Color Gradient: The color gradient, ranging from light to dark shades of blue, represents the average closing price. Darker shades indicate higher average closing prices for that particular month, while lighter shades suggest lower averages.Annotations: Each cell in the heatmap is annotated with the specific average closing price for that month and year.By examining this heatmap:Patterns & Seasonality: You can spot any recurring monthly patterns, seasonality, or discernible shifts in the stock's closing prices over the years.Variations: The color variations can quickly help identify months with particularly high or low average closing prices.This visualization provides a comprehensive and intuitive view of the stock's price movements on a monthly basis, aiding traders and investors in spotting consistent temporal trends and guiding future analyses or investment strategies. Now let’s look into the future trend.Step 5: Predictive Modeling (Time Series Forecasting)In the realm of stock analysis, forecasting is crucial to anticipate market movements. Starting with chronological time series data, one should ensure its stationarity, utilize ACF and PACF for ARIMA parameter selection, then train, forecast, visualize the results, and continuously validate and refine, ensuring well-informed investment decisions.Prompt 5: As a dedicated stock analyst, I understand the nuances and intricacies of time series forecasting. When working with stock price data, the ARIMA (Autoregressive Integrated Moving Average) model can be an invaluable tool. Begin with your time series data, ensuring chronological ordering and handling any missing values. It's crucial to ensure the data's stationarity, as ARIMA thrives on stationary datasets. This can be validated visually or with tests like the Augmented Dickey-Fuller. Diving deeper, ARIMA's efficacy lies in its parameters: \(p\) for the lagged values, \(d\) for differencing, and \(q\) for the moving average term. Helpful tools like ACF and PACF plots can guide the selection of these parameters. Once defined, train your ARIMA model on the dataset. With this trained model, predict future stock prices for your desired time frame. Visual representation can be achieved by plotting the original time series alongside the ARIMA predictions. For a more holistic view, overlay the forecast's confidence intervals. Remember, in the world of stock forecasting, model validation is paramount. Compare your model's predictions against a holdout set if available, employing metrics such as MAE or RMSE. Continual refinement might be necessary, so be prepared to iterate and adjust as needed for the most accurate forecasting.Response: We can fit the ARIMA model to our data. However, it's worth noting that multiple combinations of \(p\), \(d\), and \(q\) might be suitable. Model selection criteria, like AIC or BIC, can help refine our choices.Let's proceed by fitting the ARIMA model using the initial parameters and then forecast future stock prices.from statsmodels.tsa.arima.model import ARIMA from datetime import timedelta # Initial parameters based on ACF and PACF plots p, d, q = 1, 1, 1 # Fit the ARIMA model model = ARIMA(data['Close/Last'], order=(p, d, q)) model_fit = model.fit() # Forecast the next 30 days forecast_steps = 30 forecast = model_fit.get_forecast(steps=forecast_steps) forecast_index = [data.index[-1] + timedelta(days=i) for i in range(1, forecast_steps + 1)] # Extract the forecast mean and confidence intervals forecast_mean = forecast.predicted_mean confidence_intervals = forecast.conf_int() # Plot the original series and the forecast plt.figure(figsize=(15, 7)) data['Close/Last'].plot(label='Observed', color='blue') plt.plot(forecast_index, forecast_mean, color='red', label='Forecast') plt.fill_between(forecast_index, confidence_intervals.iloc[:, 0], confidence_intervals.iloc[:, 1], color='pink', alpha=0.3) plt.title('Stock Closing Price Forecast using ARIMA') plt.xlabel('Date') plt.ylabel('Close Price ($)') plt.legend() plt.grid(True) plt.tight_layout() plt.show() Interpretation: The chart shows past stock prices (in blue) and what they might be in the next 30 days (in red), based on a prediction model called ARIMA. Around the red line is a pink area that suggests where the stock prices might realistically land. However, it's vital to remember that this is just an estimate. Predicting stock prices is tricky because so many different things can affect them. As time goes on, the pink area gets broader, meaning the predictions are less certain. While this model offers a glimpse into potential future prices, always be cautious when basing decisions on predictions, as the stock market is full of surprises.ConclusionStock analysis, often seen as a realm of pure numbers, is actually a delicate blend of art and science, interpretation paired with prediction. As we've journeyed through, advanced techniques like Volatility Analysis have provided clarity on the unpredictable nature of stocks, while Data Cleaning ensures that our foundation is rock-solid. Visual tools, especially intuitive heatmaps, act as a compass, highlighting subtle patterns and variations in monthly stock prices. At the heart of it all, Predictive Modeling stands as a beacon, illuminating potential future paths using the wisdom of past data. Whether one is just stepping into this vast ocean or is a seasoned navigator, the tools and techniques discussed here not only simplify the journey but also enhance the depth of understanding. In stock analysis, as in many fields, knowledge is power, and with these methods in hand, both newcomers and experts are well-equipped to make informed, strategic decisions in the dynamic world of stocks.Author BioDr. Anshul Saxena is an author, corporate consultant, inventor, and educator who assists clients in finding financial solutions using quantum computing and generative AI. He has filed over three Indian patents and has been granted an Australian Innovation Patent. Anshul is the author of two best-selling books in the realm of HR Analytics and Quantum Computing (Packt Publications). He has been instrumental in setting up new-age specializations like decision sciences and business analytics in multiple business schools across India. Currently, he is working as Assistant Professor and Coordinator – Center for Emerging Business Technologies at CHRIST (Deemed to be University), Pune Lavasa Campus. Dr. Anshul has also worked with reputed companies like IBM as a curriculum designer and trainer and has been instrumental in training 1000+ academicians and working professionals from universities and corporate houses like UPES, CRMIT, and NITTE Mangalore, Vishwakarma University, Pune & Kaziranga University, and KPMG, IBM, Altran, TCS, Metro CASH & Carry, HPCL & IOC. With a work experience of 5 years in the domain of financial risk analytics with TCS and Northern Trust, Dr. Anshul has guided master's students in creating projects on emerging business technologies, which have resulted in 8+ Scopus-indexed papers. Dr. Anshul holds a PhD in Applied AI (Management), an MBA in Finance, and a BSc in Chemistry. He possesses multiple certificates in the field of Generative AI and Quantum Computing from organizations like SAS, IBM, IISC, Harvard, and BIMTECH.Author of the book: Financial Modeling Using Quantum Computing
Read more
  • 0
  • 0
  • 111

article-image-demystifying-azure-openai-service
Olivier Mertens, Breght Van Baelen
15 Sep 2023
16 min read
Save for later

Demystifying Azure OpenAI Service

Olivier Mertens, Breght Van Baelen
15 Sep 2023
16 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!This article is an excerpt from the book, Azure Data and AI Architect Handbook, by Olivier Mertens and Breght Van Baelen. Master core data architecture design concepts and Azure Data & AI services to gain a cloud data and AI architect’s perspective to developing end-to-end solutionsIntroductionOpenAI has risen immensely in popularity with the arrival of ChatGPT. The company, which started as an on-profit organization, has been the driving force behind the GPT and DALL-E model families, with intense research at a massive scale. The speed at which new models get released and become available on Azure has become impressive lately.Microsoft has a close partnership with OpenAI, after heavy investments in the company from Microsoft . The models created by OpenAI use Azure infrastructure for development and deployment. Within this partnership, OpenAI carries the responsibility of research and innovation, coming up with new models and new versions of their existing models. Microsoft manages the enterprise-scale go-to-market. It provides infrastructure and technical guidance, along with reliable SLAs, to get large organizations started with the integrations of these models, fine-tuning them on their own data, and hosting a private deployment of the models.Like the face recognition model in Azure Cognitive Services, powerful LLMs such as the ones in Azure OpenAI Service could be used to cause harm at scale. Therefore, this service is also gated according to Microsoft ’s guidelines on responsible AI.At the time of writing, Azure OpenAI Service offers access to the following models: GPT model family * GPT-3.5   * GPT-3.5-Turbo (the model behind ChatGPT  * GPT-4CodexDALL-E 2Let’s dive deeper into these models.The GPT model familyGPT models, which stands for generative pre-trained transformer models, made their first appearance in 2018, with GPT-1, trained on a dataset of roughly 7,000 books. This made good advancements in performance at the time, but the model was already vastly outdated a couple of years later. GPT-2 followed in 2019, trained on the WebText dataset (a collection of 8 million web pages). In 2020, GPT-3 was released, trained on the WebText dataset, two book corpora, and  English Wikipedia.In these years, there were no major breakthroughs in terms of efficient algorithms , but rather, in the scale of the architecture and datasets. This becomes easily visible when we look at the growing number of parameters used for every new generation of the model, as shown in the following figure.Figure 9.3 – A visual comparison between the sizes of the different generations of GPT models, based on their trainable parametersThe question is often raised of how to interpret this concept of parameters. An easy analogy is the number of neurons in a brain. Although parameters in a neural network are not equivalent to its artificial neurons, the number of parameters and neurons are heavily correlated – more parameters = more neurons. The more neurons there are in the brain, the more knowledge it can grasp.Since the arrival of GPT-3, we have seen two major adaptations of the third-generation model being made. The first one is GPT-3.5. This model has a similar architecture as the GPT-3 model but is trained on text and code, whereas the original GPT-3 only saw text data during training. Therefore, GPT-3.5 is capable of generating and understanding code. GPT-3.5, in turn, became the basis for the next adaptation, the vastly popular ChatGPT model. This model has been fine-tuned for conversational usage while using additional reinforcement learning to get a sense of ethical behavior.GPT model sizesThe OpenAI models are available in different sizes, which are all named after remarkable scientists. The GPT-3.5 model specifically, is  available in four versions:AdaBabbage CurieDavinciThe Ada model is the smallest, most lightweight model, while Davinci is the most complex and most performant model. The larger the model, the more expensive it is to use, host, and fine-tune, as shown in Figure 9.4. As a side note, when you hear about the absurd number of parameters of new GPT models, this usually refers to the Davinci model.Figure 9.4 – A trade-off exists between lightweight, cheap models and highly performant, complex modelsWith a trade-off between costs and performance available, an architect can start thinking about which model size may best fit a solution. In reality, this often comes down to empirical testing. If the cheaper model can perform the job at an acceptable performance, then this is the more cost-effective solution. Note that when talking about performance in this scenario, we mean predictive power, not the speed at which the model makes predictions. The larger models will be slower to output a prediction than the lightweight models.Understanding the difference between GPT-3.5 and GPT-3.5-Turbo (ChatGPT)GPT-3.5 and GPT-3.5-Turbo are both models used to generate natural language text, but they are used in different ways. GPT-3.5 is classified as a text completion model, whereas GPT-3.5-Turbo is referred to as conversational AI.To better understand the contrast between the two models, we first need to introduce the concept of contextual learning. These models are trained to understand the structure of the input prompt to provide a meaningful answer. Contextual learning is often split up into few-shot learning, one-shot learning, and zero-shot learning. Shot, in this context, refers to an example given in the input prompt. With few-shot learning, we provide multiple examples in the input prompt, one-shot learning provides a single example, and zero-shot indicates that no examples are given. In the case of the latter, the model will have to figure out a different way to understand what is being asked of it (such as interpreting the goal of a question).Consider the following example:Figure 9.5 – Few-shot learning takes up the most amount of tokens and requires more effort but often results in model outputs of higher qualityWhile it takes more prompt engineering effort to apply few-shot learning, it will usually yield better results. A text completion model, such as GPT-3.5, will perform vastly better on few-shot learning than one-shot or zero-shot. As the name suggests, the model figures out the structure of the input prompt (i.e., the examples) and completes the text accordingly.Conversational AI, such as ChatGPT, is more performant in zero-shot learning. In the case of the preceding example, both models are able to output the correct answer, but as questions become more and more complex, there will be a noticeable difference in predictive performance. Additionally, GPT-3.5-Turbo will remember information from previous input prompts, whereas GPT-3.5 prompts are handled independently.Innovating with GPT-4With the arrival of GPT-4, the focus has shifted toward multimodality. Multimodality in AI refers to the ability of an AI system to process and interpret information from multiple modalities, such as text, speech, images, and videos. Essentially, it is the capability of AI models to understand and combine data from different sources and formats.GPT-4 is capable of additionally taking images as input and interpreting them. It has stronger reasoning and overall performance than its predecessors. There was a famous example where GPT-4 was able to deduce that balloons would fly upward when asked what would happen if someone cut the balloons' strings, as shown in the following photo.Figure 9.6 – The image in question that was used in the experiment. When asked what would happen if the strings were cut, GPT-4 replied that the balloons would start flying awaySome adaptations of GPT-4, such as the one used in Bing Chat, have the extra feature of citing sources in generated answers. This is a welcome addition, as hallucination was a significant flaw in earlier GPT models.HallucinationHallucination in the context of AI refers to generating wrong predictions with high confidence. It is obvious that this can cause a lot more harm than the model indicating it is not sure how to respond or knowing the answer.Next, we will look at the Codex model.CodexCodex is a model that is architecturally similar to GPT-3, but it fully focuses on code generation and understanding. Furthermore, an adaptation of Codex forms the underlying model for GitHub Copilot, a tool that provides suggestions and auto-completion for code based on context and natural language inputs, available for various integrated development environments (IDEs) such as Visual Studio Code. Instead of a ready-to-use solution, Codex is (like the other models in Azure OpenAI) available as a model endpoint and should be used for integration in custom apps.The Codex model is initially trained on a collection of 54 million code repositories, resulting in billions of lines of code, with the majority of training data written in Python.Codex can generate code in different programming languages based on an input prompt in natural language (text-to-code), explain the function of blocks of code (code-to-text), add comments to code, and debug existing code.Codex is available as a C (cushman) and D (Davinci) model. Lightweight Codex models (A series or B series) currently do not exist.Models such as Codex or GitHub Copilot are a great way to boost the productivity of software engineers, data analysts, data engineers, and data scientists.  They do not replace these roles, as their accuracy is not perfect; rather, they give engineers the opportunity to start editing from a fairly well-written block of code instead of coding from scratch.DALL-E 2The DALL-E model family is used to generate visuals. By providing a  description in natural language in the input prompt, it generates a series of matching images. While other models are often used at scale in large enterprises, DALL-E 2 tends to be more popular in smaller businesses. Organizations that lack an in-house graphic designer can make great use of DALL-E to generate visuals for banners, brochures, emails, web pages, and so on.DALL-E 2 only has a  single model size to choose from, although open-source alternatives exist if a lightweight version is preferred. Fine-tuning and private deploymentsAs a data architect, it is important to understand the cost structure of these models. The first option is to use the base model in a serverless manner. Similar to how we work with Azure Cognitive Services, users will get a key for the model’s endpoint and simply pay per prediction. For DALL-E 2, costs are incurred per 100 images, while the GPT and Codex models are priced per 1,000 tokens. For every request made to a GPT or Codex model, all tokens of the input prompt and the output are added up to determine the cost of the prediction.TokensIn natural language processing, a token refers to a sequence of characters that represents a distinct unit of meaning in a text. These units do not necessarily correspond to words, although for short words, this is mostly the case. Tokens are used as the basic building blocks to process and analyze text data. A good rule of thumb for the English language is that one token is, on average, four characters. Dividing your total character count by four will make a good estimate of the number of tokens.Azure OpenAI Service also grants extensive fine-tuning functionalities. Up to 1 GB of data can be uploaded per Azure OpenAI instance for fine-tuning. This may not sound like a lot, but note that we are not training a new model from scratch. The goal of fine-tuning is to retrain the last few layers of the model to increase performance on specific tasks or company-specific knowledge. For this process, 1 GB of data is more than sufficient.When adding a fine-tuned model to a solution, two additional costs will be incurred. On top of the token-based inference cost, we need to take into account the training and hosting costs. The hourly training cost can be quite high due to the amount of hardware needed, but compared to the inference and hosting costs during a model’s life cycle, it remains a small percentage. Next, since we are not using the base model anymore and, instead, our own “version” of the model, we will need to host the model ourselves, resulting in an hourly hosting cost.Now that we have covered both pre-trained model collections, Azure Cognitive Services, and Azure OpenAI Service, let’s move on to custom development using Azure Machine Learning.Grounding LLMsOne of the most popular use cases for LLMs involves providing our own data as context to the model (often referred to as grounding). The reason for its popularity is partly due to the fact that many business cases can be solved using a consistent technological architecture. We can reuse the same solution, but by providing different knowledge bases, we can serve different end users.For example, by placing an LLM on top of public data such as product manuals or product specifics, it is easy to develop a customer support chatbot. If we swap out this knowledge base of product information with something such as HR documents, we can reuse the same tech stack to create an internal HR virtual assistant.A common misconception regarding grounding is that a model needs to be trained on our own data. This is not the case. Instead, after a user asks a question, the relevant document (or paragraphs) is injected into the prompt behind the scenes and lives in the memory of the model for the duration of the chat session (when working with conversational AI) or for a single prompt. The context, as we call it, is then wiped clean and all information is forgotten. If we wanted to cache this info, it is possible to make use of a framework such as LangChain or Semantic Kernel, but that is out of the scope of this book.The fact that a model does not get retrained on our own data plays a crucial role in terms of data privacy and cost optimization. As shown before in the section on fine-tuning, as soon as a base model is altered, an hourly operating cost is added to run a private deployment of the model. Also, information from the documents cannot be leaked to other users working with the same model.Figure 9.7 visualizes the architectural concepts to ground an LLM.Figure 9.7 – Architecture to ground an LLMThe first thing to do is turn the documents that should be accessible to the model into embeddings. Simply put, embeddings are mathematical representations of natural language text. By turning text into embeddings, it is possible to accurately calculate the similarity (from a semantics perspective) between two pieces of text.To do this, we can leverage Azure Functions, a service that allows pieces of code to run in a serverless function. It often forms the glue between different components by handling interactions. In this case, an Azure function (on the bottom left of Figure 9.7) will grab the relevant documents from the knowledge base, break them up into chunks (to accommodate for the maximum token limits of the model), and generate an embedding for each one. This embedding is then stored, alongside the natural language text, in a vector database. This function should be run for all historic data that will be accessible to the model, as well as triggered for every new, relevant document that is added to the knowledge base.Once the vector database is in place, users can start asking questions. However, the user questions are not directly sent to the model endpoint. Instead, another Azure function (shown at the top of Figure 9.7) will turn the user question into an embedding and check its similarity of it with the embeddings of the documents or paragraphs in the vector database. Then, the top X most relevant text chunks are injected into the prompt as context, and the prompt is sent over to the LLM. Finally, the response is returned to the user.ConclusionAzure OpenAI Service, a collaboration between OpenAI and Microsoft, delivers potent AI models. The GPT model family, from GPT-1 to GPT-4, has evolved impressively, with GPT-3.5-Turbo (ChatGPT) excelling in conversational AI. GPT-4 introduces multimodal capabilities, comprehending text, speech, images, and videos.Codex specializes in code generation, while DALL-E 2 creates visuals from text descriptions. These models empower developers and designers. Customization via fine-tuning offers cost-effective solutions for specific tasks. Leveraging Azure OpenAI Service for your projects enhances productivity.Grounding language models with user data ensures data privacy and cost efficiency. This collaboration holds promise for innovative AI applications across various domains.Author BioOlivier Mertens is a cloud solution architect for Azure data and AI at Microsoft, based in Dublin, Ireland. In this role, he assisted organizations in designing their enterprise-scale data platforms and analytical workloads. Next to his role as an architect, Olivier leads the technical AI expertise for Microsoft EMEA in the corporate market. This includes leading knowledge sharing and internal upskilling, as well as solving highly complex or strategic customer AI cases. Before his time at Microsoft, he worked as a data scientist at a Microsoft partner in Belgium.Olivier is a lecturer for generative AI and AI solution architectures, a keynote speaker for AI, and holds a master’s degree in information management, a postgraduate degree as an AI business architect, and a bachelor’s degree in business management.Breght Van Baelen is a Microsoft employee based in Dublin, Ireland, and works as a cloud solution architect for the data and AI pillar in Azure. He provides guidance to organizations building large-scale analytical platforms and data solutions. In addition, Breght was chosen as an advanced cloud expert for Power BI and is responsible for providing technical expertise in Europe, the Middle East, and Africa. Before his time at Microsoft, he worked as a data consultant at Microsoft Gold Partners in Belgium.Breght led a team of eight data and AI consultants as a data science lead. Breght holds a master’s degree in computer science from KU Leuven, specializing in AI. He also holds a bachelor’s degree in computer science from the University of Hasselt.
Read more
  • 0
  • 0
  • 110