
How-To Tutorials


AI_Distilled #31: Evolving Boundaries and Opportunities

Merlyn Shelley
08 Jan 2024
14 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

👋 Hello,

🎉 Joyous 2024! Wishing you a year as delightful as your dreams! Dive into the new year with our outstanding edition, filled with essential features to boost your AI practice.

"The speed at which people will be able to come up with an idea, to test the idea, to make something, it's going to be so accelerated… You don't need to have a degree in computer science to do that." - Matthew Candy, IBM's global managing partner for generative AI

Coding without coding is a revolutionary idea indeed. What might previously have been perceived as unbelievable is now a living reality, and new features like GitHub's Copilot Chat make it all the more seamless. The real possibilities of AI expand far beyond computing, with the technology making waves in healthcare, finance, and supply chain management. Starting from this edition, we'll bring you fresh updates from each of these sectors, so stay tuned!

Let's kick things off by tapping into the latest news and developments.

AI Launches & Industry Updates:
- Microsoft Copilot Integrates with GenAI Music App Suno for Song Composition
- Google Plans Potential Layoffs Amidst AI Integration in Ad Sales
- GitHub Expands Copilot Chat Availability for Developers

AI in Healthcare:
- AI Streamlining Health Insurance Shopping Process
- Revolutionizing Healthcare with AI Stethoscope on Smartphones
- Generative AI's Impact on Mental Health Counseling

AI in Finance:
- Next-Gen Banks to Leverage AI for Financial Influence and Support
- Invest Qatar Introduces Cutting-Edge Azure OpenAI GPT-Powered Chatbot

AI in Supply Chain Management:
- AI Safeguards Supply Chains Amidst Holiday Challenges
- AI-SaaS Integration Revolutionizes E-commerce Analytics

Here are some handpicked GPT and LLM resources, tutorials, and secret knowledge that'll come in handy for your next project:
- Understanding the Prompt Development Life Cycle
- Building Platforms with LLMs: Overcoming Challenges in Summarization as a Service
- Understanding the Risks of Prompt Injection in LLM Applications
- Creating an Open Source LLM Recommender System: Mastering Prompt Iteration and Optimization

Looking for hands-on tips and strategies straight from the developer community? We've got you covered:
- Exploring Google's Gemini Pro Vision LLM with Javascript: A Practical Guide
- Accelerating AI Application Productionization: A Guide with SageMaker JumpStart, Amazon Bedrock, and TruEra
- Quantizing LLMs with Activation-aware Weight Quantization (AWQ)
- Unlocking Your MacBook's AI Potential: Running 70B LLM Models Without Quantization

Check out our curated list of smoking hot GitHub repositories:
- Giskard-AI/giskard
- CopilotKit/CopilotKit
- chengzeyi/stable-fast
- ml-explore/mlx

📥 Feedback on the Weekly Edition
Q: How can we foster effective collaboration between humans and AI systems, ensuring that AI complements human skills and enhances productivity without causing job displacement or widening societal gaps?

Share your valued opinions discreetly! Your insights could shine in our next issue for the 39K-strong AI community. Join the conversation! 🗨️✨ As a big thanks, get our bestselling "Interactive Data Visualization with Python - Second Edition" in PDF. Let's make AI_Distilled even more awesome! 🚀 Jump on in! Share your thoughts and opinions here!

Writer's Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week's newsletter content!
Cheers,
Merlyn Shelley
Editor-in-Chief, Packt

SignUp | Advertise | Archives

⚡ TechWave: AI/GPT News & Analysis

New Launches & Industry Updates:

⭐ Microsoft Copilot Integrates with GenAI Music App Suno for Song Composition: Microsoft Copilot has partnered with the GenAI music app Suno, enabling users to create complete songs, including lyrics, instrumentals, and singing voices. Accessible via Microsoft Edge, the integration aims to make music creation inclusive and enjoyable. However, ethical and legal concerns persist, with some artists uncomfortable with AI algorithms learning from their work without consent or compensation. Suno attempts to address such issues by blocking certain prompts and preventing the generation of covers using existing lyrics. Read Microsoft's official blog here.

⭐ Google Plans Potential Layoffs Amidst AI Integration in Ad Sales: Google is reportedly considering laying off around 30,000 employees within its ad sales division due to the implementation of internal AI, aiming for improved operational efficiency. The restructuring primarily targets the ad sales team, reflecting Google's exploration of AI benefits in operational processes. Earlier in 2023, Google had already laid off 12,000 employees, underscoring the need for organizational adaptation amidst evolving global dynamics. Read about other significant 2023 layoffs here.

⭐ GitHub Expands Copilot Chat Availability for Developers: GitHub is extending the availability of Copilot Chat, a programming-centric chatbot powered by GPT-4, to all users. The tool was initially launched for Copilot for Business subscribers and later in beta for $10-per-month users. Integrated into Microsoft's IDEs, Visual Studio Code and Visual Studio, it's included in GitHub Copilot's paid tiers and free for verified teachers, students, and maintainers of specific open-source projects. Developers can prompt Copilot Chat in natural language, seeking real-time guidance on code-related tasks. Know more about Copilot Chat here.

AI in Healthcare:

⭐ AI Streamlining Health Insurance Shopping Process: Companies are utilizing AI to simplify the often complex and tedious task of shopping for health insurance, aiming to guide consumers to better and more affordable options. With many Americans sticking to their health plans due to the difficulty of predicting their future healthcare needs, AI-powered tools gather individual information and predict the most suitable health plans. Alight, a cloud-based HR services provider, reports that 95% of the employers it serves use AI technology, including a virtual assistant, for employee health benefits selection.

⭐ Revolutionizing Healthcare with AI Stethoscope on Smartphones: The startup AI Health Highway is addressing limited access to specialists in healthcare with an innovative solution, AI Steth, which combines traditional stethoscope use with cutting-edge signal processing and AI. Targeting the early detection and prediction of heart and lung disorders, the device transforms sound patterns into visual representations on smartphones, allowing non-specialists such as family physicians and nurses to examine patients effectively. AI Steth has shown exceptional accuracy in murmur detection, paving the way for more objective and efficient diagnoses. Discover AI Health Highway's work here.

⭐ Generative AI's Impact on Mental Health Counseling: Generative AI is finding use in mental health counseling, sparking discussions about its potential to assist or even replace human therapists.
Recent research testing ChatGPT on mental health counseling questions has raised questions about the technology's role in therapy. AI therapy has evolved from basic chatbots to sophisticated entities capable of nuanced emotional responses, offering accessible mental health support 24/7. While the benefits are evident, challenges such as risk, coverage, and ethical considerations must be addressed for responsible implementation.

AI in Finance:

⭐ Next-Gen Banks to Leverage AI for Financial Influence and Support: Experts predict that next-generation banks will harness generative AI to impact various aspects of financial services, ranging from influencing customer decisions to identifying vulnerable clients. Tom Merry, Head of Banking Strategy at Accenture, suggests that generative AI could significantly influence banking operations, touching nearly every aspect. While the UK banking industry has been utilizing AI for fraud detection and risk analysis, the introduction of generative AI, capable of creating novel solutions based on extensive data, is gaining traction.

⭐ Invest Qatar Introduces Cutting-Edge Azure OpenAI GPT-Powered Chatbot: Invest Qatar, in collaboration with Microsoft, has launched Ai.SHA, an innovative AI-powered chatbot utilizing GPT capabilities through the Azure OpenAI service. This move positions Invest Qatar as a pioneer among investment promotion agencies globally, embracing advanced technology to transform interactions between investors and businesses in Qatar. Ai.SHA acts as a comprehensive resource, providing information on business opportunities, the investment ecosystem, and business setup in Qatar.

AI in Supply Chain Management:

⭐ AI Safeguards Supply Chains Amidst Holiday Challenges: Businesses face unique challenges in managing complex supply chains amid the holiday season, from counterfeit airplane parts to recalls affecting festive foods. The reliance on suppliers underscores the need for transparency and visibility to prevent disruptions caused by supplier misconduct. Leveraging AI in contracts offers a solution, allowing businesses to streamline due diligence, enhance visibility, conduct predictive analytics, and align with environmental, social, and governance (ESG) regulations. AI-powered contracts emerge as vital tools to proactively address supply chain challenges and ensure customer trust during the holiday season and beyond.

⭐ AI-SaaS Integration Revolutionizes E-commerce Analytics: In the logistics sector, where precision and speed are critical, SaaS coupled with AI is transforming traditional approaches. This integration allows for real-time data processing and learning from it, offering unprecedented insights and optimization capabilities. Learn how AI-SaaS integration streamlines inventory, boosts operational efficiency, and fortifies against fraud, becoming the recipe for e-commerce success in a hypercompetitive landscape.

🔮 Expert Insights from Packt Community

Architectural Patterns and Techniques for Developing IoT Solutions - By Jasbir Singh Dhaliwal

Unique requirements of IoT use cases

IoT use cases tend to have very unique requirements concerning power consumption, bandwidth, analytics, and more. Additionally, the inherent complexity of IoT implementations (computationally challenged field devices on one end of the spectrum vis-à-vis the almost infinite capacity of the cloud on the other) forces architects to make difficult architectural decisions and implementation choices.
Before presenting the various IoT patterns, it is worth mentioning the unique expectations of IoT architectures that differ from non-IoT architectures:

- Sensing events and actuation commands have a wide range of latency expectations – from real-time to fire-and-forget.
- Data analysis results need to be reported/visualized/consumed on a variety of consumer devices – mobiles, desktops, tablets, and more. Similarly, data consumers have diverse backgrounds, data needs, and application roles (personas).
- One is often forced to integrate with legacy as well as cutting-edge devices and/or external systems – very few trivial use cases have isolated/standalone architectures.
- There is a considerable difference in the way data is extracted from legacy versus non-legacy systems – legacy systems may internally collate the data and then push it to an external port (file transfer), whereas newer systems may push the data in a continuous stream (time-series data). This variability is one of the critical considerations when choosing a particular IoT architectural pattern.
- Varied deployment requirements – edge, on-premises, hybrid, cloud, and more.
- Adherence to strict regulatory compliance, especially in the medical and aeronautical domains.
- There are expectations concerning immediate payback, return on investment (ROI), business outcomes, and new service business models.
- Continuous innovation results in new services or offerings (especially from cloud vendors), forcing IoT architectures to stay in continuous sync with these new offerings and services.

This is an excerpt from the book Architectural Patterns and Techniques for Developing IoT Solutions, written by Jasbir Singh Dhaliwal and published in Sep '23. To see what's inside the book, read the entire chapter here or try a 7-day free trial to access the full Packt digital library. To discover more, click the button below.

Read through Chapter 1, unlocked here...

🌟 Secret Knowledge: AI/LLM Resources

⭐ Understanding the Prompt Development Life Cycle: Explore the PDLC and gain insights into how prompt engineering mirrors software development. The primer unfolds a step-by-step guide, beginning with the Initial Build phase, where an imperfect prompt is crafted using techniques like zero-shot and few-shot prompting. The Optimization stage strategically refines prompts based on historical data. Finally, the Fine-tune phase demonstrates the refinement of models, emphasizing the importance of continuous tracking.

⭐ Building Platforms with LLMs: Overcoming Challenges in Summarization as a Service: Get to know Summarization as a Service, a platform designed by a Microsoft team for Viva Engage. Learn about the complexities of prompt design; ensuring accuracy and grounding; addressing privacy and compliance concerns; managing the performance, cost, and availability of LLM services; and integrating outputs seamlessly with Copilot and other Viva Engage features.

⭐ Understanding the Risks of Prompt Injection in LLM Applications: Explore the intricacies of prompt injection in LLM applications. The author emphasizes the critical security implications and potential impacts, citing the OWASP Top 10 for LLM Applications. Drawing parallels to injection vulnerabilities like A03 in traditional security, the article illustrates potential risks through a thought experiment involving a robotic server.
⭐ Creating an Open Source LLM Recommender System: Mastering Prompt Iteration and Optimization: Open Recommender is an open-source YouTube video recommendation system adept at tailoring content to your interests based on Twitter feed analysis. Discover its data pipeline, which utilizes GPT-4, and the transition toward cost-effective open-source models using OpenPipe. Explore the challenges faced during prompt iteration, with a focus on better prompt engineering tools, including the introduction of a TypeScript library, Prompt Iteration Assistant.

🔛 Masterclass: AI/LLM Tutorials

⭐ Exploring Google's Gemini Pro Vision LLM with Javascript: A Practical Guide: The blog introduces the concept of multi-modal LLMs capable of interpreting various data modes, including images. Learn how to utilize Google's multi-modal Gemini Pro Vision LLM with Javascript. The tutorial guides you through creating an AI-powered nutrition-fact explainer app using the newly released LLM, and covers prerequisites such as installing Node.js and obtaining a Gemini LLM API key.

⭐ Accelerating AI Application Productionization: A Guide with SageMaker JumpStart, Amazon Bedrock, and TruEra: The post emphasizes the importance of observability in LLM applications and provides insights into evaluating responses for honesty, harmlessness, and helpfulness. You'll learn how to deploy, fine-tune, and iterate on foundation models for LLM applications using Amazon SageMaker JumpStart, Amazon Bedrock, and TruEra.

⭐ Quantizing LLMs with Activation-aware Weight Quantization (AWQ): Explore the application of Activation-aware Weight Quantization (AWQ) to democratize LLMs like Llama-2, making them more accessible for deployment on regular CPUs or less powerful GPUs. The process involves setting up a GPU instance, installing necessary packages like AutoAWQ and transformers, quantizing the model, and saving the result. The tutorial also covers uploading the model to the Hugging Face Model Hub and concludes with the successful reduction of the Llama-2 model from ~27 GB to ~4 GB, enhancing its efficiency for diverse applications.
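To give a feel for that workflow, here is a minimal sketch using the AutoAWQ library; the checkpoint name, output path, and quantization config below are illustrative assumptions rather than the tutorial's exact code, and argument names may vary between library versions:

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "meta-llama/Llama-2-7b-hf"   # assumed source checkpoint
quant_path = "llama-2-7b-awq"             # assumed output directory

# 4-bit AWQ configuration; group size and kernel version are common defaults
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Run activation-aware calibration and quantize the weights to 4 bits
model.quantize(tokenizer, quant_config=quant_config)

# Save the quantized model, ready to be pushed to the Hugging Face Model Hub
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)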
⭐ Unlocking Your MacBook's AI Potential: Running 70B LLM Models Without Quantization: Discover how to unleash the hidden AI power of your 8 GB MacBook as this post explores the latest 2.8 version of AirLLM. Without the need for quantization or model compression, an ordinary MacBook can now efficiently run top-tier 70-billion-parameter models. Explore the MacBook's AI capabilities, understanding Apple's role in AI evolution through its M1, M2, and M3 series GPUs, which offer competitive performance in the era of generative AI. Gain insights into GPU capabilities, memory advantages, and the open-source MLX platform.

🚀 HackHub: Trending AI Tools

⭐ Giskard-AI/giskard: A specialized testing framework for ML models, covering everything from tabular models to LLMs. Developers can efficiently scan AI models using just four lines of code.

⭐ CopilotKit/CopilotKit: Build in-app AI chatbots that seamlessly interact with the app state, execute actions within the app, and communicate with frontend, backend, and third-party services via plugins, serving as an AI "second brain" for users.

⭐ chengzeyi/stable-fast: Leverage stable-fast for efficient, high-performance inference on various diffuser models, with fast model compilation and out-of-the-box support for dynamic shapes, LoRA, and ControlNet.

⭐ ml-explore/mlx: An array framework for machine learning on Apple silicon from Apple's ML research team, offering familiar Python and C++ APIs closely aligned with NumPy and PyTorch.
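To illustrate that NumPy-like feel, here is a tiny, self-contained sketch (assuming MLX is installed with pip install mlx; the values are arbitrary):

import mlx.core as mx

# Arrays live in unified memory, so the same buffer is usable on CPU and GPU
a = mx.array([1.0, 2.0, 3.0])
b = mx.ones(3)

# MLX computations are lazy; mx.eval forces the graph to execute
c = a + 2.0 * b
mx.eval(c)
print(c)  # array([3, 4, 5], dtype=float32)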


Detecting & Addressing LLM 'Hallucinations' in Finance

James Bryant, Alok Mukherjee
04 Jan 2024
9 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

This article is an excerpt from the book The Future of Finance with ChatGPT and Power BI, by James Bryant and Alok Mukherjee. Enhance decision-making, transform your market approach, and find investment opportunities by exploring AI, finance, and data visualization with ChatGPT's analytics and Power BI's visuals.

Introduction
LLMs, such as OpenAI's GPT series, can sometimes generate responses that are referred to as "hallucinations." These are instances where the model's output is factually incorrect, presents information it could not possibly know (given that it doesn't have access to real-time or personalized data), or is nonsensical or highly improbable.

Let's take a deeper look at what hallucinations are, how to identify them, and what steps can be taken to mitigate their impact, especially in contexts where accurate and reliable information is crucial, such as financial analysis, trading, or visual data presentations.

Understanding hallucinations
Let's look at some examples:

Factual inaccuracies: Suppose an LLM provides information stating that Apple Inc. was founded in 1985. This is a clear factual inaccuracy because Apple was founded in 1976.

Speculative statements: If an LLM were to suggest that "As of 2023, Tesla's share price has hit $3,000," this is a hallucination. The model doesn't know real-time data, and any post-2021 prediction or speculation it makes about specific stock prices is unfounded.

Confident misinformation: For instance, if an LLM confidently states that "Amazon has declared bankruptcy in late 2022," this is a hallucination and can have serious consequences if it's acted upon without verification.

How can we spot hallucinations?
Here are some useful ways to spot hallucinations:

Cross-verification: If an LLM suggests an unusual trading strategy, such as shorting a typically stable blue-chip stock based on some supposed insider information, always cross-verify this advice with other reliable sources or consult a financial advisor.

Questioning the source: If an LLM claims that "our internal data shows a bullish trend for cryptocurrency X," this is likely a hallucination. The model doesn't have access to proprietary internal data.

Time awareness: If the model provides information or trends post-September 2021 without the user explicitly asking for a hypothetical or simulated scenario, consider this a red flag. For example, GPT-4 giving specific "real-time" market cap values for companies in 2023 would be a hallucination.

What can we do about hallucinations?
Here are some ideas:

Promote awareness: If you are developing an AI-assisted trading app that uses an LLM, ensure users are aware of potential hallucinations, perhaps with a disclaimer or notification upon usage.

Implement checks: You might integrate a news API that could help validate major financial events or claims made by the model, as sketched below.
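For illustration, here is a minimal sketch of such a check; the provider (NewsAPI's /v2/everything endpoint), the response fields, and the simple keyword match are all assumptions for demonstration rather than a production-ready verifier:

import requests

NEWS_API_URL = "https://newsapi.org/v2/everything"  # assumed provider
API_KEY = "YOUR_API_KEY"

def recent_headlines(query: str, limit: int = 10) -> list:
    # Fetch recent headlines mentioning the query term
    params = {"q": query, "pageSize": limit, "sortBy": "publishedAt", "apiKey": API_KEY}
    response = requests.get(NEWS_API_URL, params=params, timeout=10)
    response.raise_for_status()
    return [article["title"] for article in response.json().get("articles", [])]

def claim_is_supported(company: str, keyword: str) -> bool:
    # Naive check: does any recent headline about the company mention the claimed event?
    return any(keyword.lower() in title.lower() for title in recent_headlines(company))

# Example: verify a model claim such as "Amazon has declared bankruptcy"
if not claim_is_supported("Amazon", "bankruptcy"):
    print("No supporting headlines found - treat the claim as a potential hallucination.")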
Minimizing hallucinations in the future
There are various ways we can minimize hallucinations. Here are some examples:

Training improvements: Imagine developing a better model that understands context and sticks more closely to known data, avoiding speculative or incorrect financial statements. Future versions of the model could be trained specifically on financial data, news, and reports to better understand the context and semantics of financial trading and investment. We could do this to ensure that it understands a short squeeze scenario accurately, or is aware that penny stocks typically come with higher risks.

Better evaluation metrics: For instance, develop a specific metric that calculates the percentage of the model's outputs that were flagged as hallucinations during testing. In the development phase, the models could be evaluated on more focused tasks, such as generating valid trading strategies or predicting the impact of certain macroeconomic events on stock prices. The better the model performs on these tasks, the lower the chance of hallucinations occurring.

Post-processing methods: Develop an algorithm that cross-references model outputs against reliable financial data sources and flags potential inaccuracies. After the model generates a potential trading strategy or investment suggestion, this output could be cross-verified using a rules-based system. For instance, if the model suggests shorting a stock that has consistently performed well without any recent negative news or poor earnings reports, the system might flag this as a potential hallucination.

As an example, you can use libraries such as yfinance or pandas_datareader to access real-time or historical financial data:

!pip install yfinance pandas_datareader

import yfinance as yf

def get_stock_data(ticker, start, end):
    # Pull historical price data for a ticker over a date range
    stock = yf.Ticker(ticker)
    data = stock.history(start=start, end=end)
    return data

# Example usage:
data = get_stock_data("AAPL", "2021-01-01", "2023-01-01")

You could also develop a cross-verification algorithm and compare the model's outputs with the collected financial data to flag potential inaccuracies.

Integration with real-time data: While creating Power BI visualizations, data that's been pulled from the LLM could be cross-verified with real-time data from financial databases or APIs. Any discrepancies, such as inconsistent market share percentages or revenue growth rates, could be flagged. This reduces the risk of presenting hallucinated data in visualizations. Let's look at some examples:

Extracting real-time data: You can continue to use yfinance or pandas_datareader to extract real-time data.

Cross-verifying with real-time data: You can compare the model's output with real-time data to identify discrepancies:

def real_time_cross_verify(output, real_time_data=None):
    # Assume output is a dict with keys 'market_share', 'revenue_growth', and 'ticker'
    ticker = output['ticker']
    if real_time_data is None:
        # Fetch real-time data (assuming a function get_real_time_data is defined)
        real_time_data = get_real_time_data(ticker)
    # Compare the model's output with real-time data
    if abs(output['market_share'] - real_time_data['market_share']) > 0.05 or \
       abs(output['revenue_growth'] - real_time_data['revenue_growth']) > 0.05:
        return True  # Flagged as a potential hallucination
    return False  # Not flagged

# Example usage:
output = {'market_share': 0.25, 'revenue_growth': 0.08, 'ticker': 'AAPL'}
real_time_data = {'market_share': 0.24, 'revenue_growth': 0.07, 'ticker': 'AAPL'}
flagged = real_time_cross_verify(output, real_time_data)

User feedback loop: A mechanism can be incorporated to allow users to report potential hallucinations. For instance, if a user spots an error in the LLM's output during a Power BI data analysis session, they can report this.
Over time, these reports can be used to further train the model and reduce hallucinations.

OpenAI is on the case
To tackle the chatbot's missteps, OpenAI engineers are working on ways for its AI models to reward themselves for outputting correct data while moving toward an answer, instead of rewarding themselves only at the point of conclusion. According to the engineers, the system could lead to better outcomes because it incorporates more of a human-like chain-of-thought procedure.

These examples should help illustrate the concept and risks of LLM hallucinations, particularly in high-stakes contexts such as finance. As always, these models should be seen as powerful tools for assistance, but not as a final authority.

Trading examples
Hallucination scenario: Let's assume you've asked an LLM for a prediction on the future performance of a specific stock, say Tesla. The LLM might generate a response that appears confident and factual, such as "Based on the latest earnings report, Tesla has declared bankruptcy." If you acted on this hallucinated information, you might rush to sell Tesla shares only to find out that Tesla is not bankrupt at all. This is an example of a potentially disastrous hallucination.

Action: Before making any trading decision based on the LLM's output, always cross-verify the information with a reliable financial news source or the company's official communications.

Power BI visualization examples
Hallucination scenario: Suppose you're using an LLM to generate text descriptions for a Power BI dashboard that tracks the market share of different automakers in the EV market. The LLM might hallucinate and produce a statement such as "Rivian has surpassed Tesla in terms of global EV market share." This statement might be completely inaccurate, as Tesla had a significantly larger market share than Rivian.

Action: When using LLMs to generate text descriptions or insights for your Power BI dashboards, it's crucial to cross-verify any assertions the model makes. You can do this by cross-referencing the underlying data in your Power BI dashboard or by referring to reliable external sources of information.

To minimize hallucinations in the future, the model can be fine-tuned with a dataset that has been specifically curated to cover the relevant domain. The use of a structured validation set can help spot and rectify hallucinations during the model training process. Also, employing a robust fact-checking mechanism on the output of the model before acting on its suggestions or insights can help catch and rectify any hallucinations.

Remember, while LLMs can provide valuable insights and suggestions, their output should always be used as one of many inputs in your decision-making process, particularly in high-stakes environments such as financial trading and analysis.

Conclusion
In the dynamic world of financial analysis and data visualization, the presence of LLM "hallucinations" poses a challenge. Awareness, verification, and ongoing improvement strategies stand as pillars against these inaccuracies. While LLMs offer invaluable support, their outputs must be scrutinized, verified, and used as one among many tools in decision-making.
As we navigate this landscape, vigilance, continuous refinement, and a critical eye will fortify our ability to harness the power of LLMs while mitigating the risks they present in high-stakes financial contexts.

Author Bio
James Bryant, a finance and technology expert, excels at identifying untapped opportunities and leveraging cutting-edge tools to optimize financial processes. With expertise in finance automation, risk management, investments, trading, and banking, he's known for staying ahead of trends and driving innovation in the financial industry. James has built corporate treasuries at companies like Salesforce and transformed organizations like Stanford Health Care through digital innovation. He is passionate about sharing his knowledge and empowering others to excel in finance. Outside of work, James enjoys skiing with his family in Lake Tahoe, running half marathons, and exploring new destinations and culinary experiences with his wife and daughter.

Alok Mukherjee is a seasoned technologist with over a decade of experience in business architecture, digital transformation, and solutions architecture. He excels at applying data-driven solutions to real-world problems and has proficiency in data analytics and planning. Alok worked at EMC Corp and Genentech and currently spearheads the digital transformation of Finance Business Intelligence at Stanford Health Care. In addition to his work, Alok is a Certified Personal Trainer and is passionate about helping his clients stay fit. Alok also has a passion for wine and exploring new vineyards.


Interactive Email Phishing Training with ChatGPT

Clint Bodungen
03 Jan 2024
14 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

This article is an excerpt from the book ChatGPT for Cybersecurity Cookbook, by Clint Bodungen. Master ChatGPT and the OpenAI API, and harness the power of cutting-edge generative AI and large language models to revolutionize the way you perform penetration testing, threat detection, and risk assessment.

Introduction
With the rise of cyber threats, organizations of all sizes are increasingly aware of the importance of training their staff on email phishing, a common and potentially dangerous tactic employed by cybercriminals. In this recipe, we'll be using ChatGPT to create a tool for interactive email phishing training. This recipe guides you through the process of crafting a specialized prompt to turn ChatGPT into a simulation tool for phishing attack awareness. With this approach, you can use ChatGPT to train users to identify potential phishing emails, thereby increasing their awareness and helping to protect your organization from potential security threats.

What makes this truly powerful is its interactive nature. ChatGPT will present the user with a series of email scenarios. The user will then decide whether the email is a phishing attempt or a legitimate email, and can even ask for more details, such as the URL of a link in the email or header information. ChatGPT will provide feedback, ensuring a continuous, engaging, and efficient learning experience. Additionally, we will also cover how to use Python in conjunction with these prompts to create exportable email simulation scenarios. This feature can be beneficial in situations where you might want to use the generated scenarios outside of ChatGPT, such as in a live course or in a Learning Management System (LMS).

Getting ready
Before diving into this recipe, ensure you have your OpenAI account set up and your API key on hand. If not, refer back to Chapter 1 for the necessary setup details. You will also need Python version 3.10.x or later. Additionally, confirm you have the following Python libraries installed:

1. openai: This library enables you to interact with the OpenAI API. Install it using the command pip install openai.
2. os: This is a built-in Python library, which allows you to interact with the operating system, especially for accessing environment variables.
3. tqdm: This library is utilized for showing progress bars during the generation process. Install it with pip install tqdm.

How to do it…
In this section, we will walk you through the process of creating an interactive email phishing training simulation using ChatGPT. The instructions are step-by-step, starting from logging into your OpenAI account and ending with generating phishing training simulations.

1. Access the ChatGPT interface. Log into your OpenAI account and go to the ChatGPT interface at https://chat.openai.com.

2. Initialize the simulation by entering the specialized prompt. The following prompt is carefully designed to instruct ChatGPT to act as a phishing training simulator. Enter the prompt into the text box and press Enter.

"You are a cybersecurity professional and expert in adversarial social engineering tactics, techniques, and procedures, with 25 years of experience. Create an interactive email phishing training simulation (for employees).
Provide no other response other than to ask the question, "Is the following email real or a phishing attempt? (You may ask clarification questions such as URL information, header information, etc.)" followed by a simulated email, using markdown language formatting. The email you present can represent a legitimate email or a phishing attempt, which can use one or more various techniques. Provide no further generation or response until I answer the question. If I answer correctly, just respond with "Correct" and a short description to further explain the answer, and then restart the process from the beginning. If I answer incorrectly, respond with "Incorrect", then the correct answer, then a short description to further explain the answer. Then repeat the process from the beginning.

Present me with only 3 simulations in total throughout the process and remember my answer to them all. At least one of the simulations should simulate a real email. After the last question has been answered, and after your response, end the assessment and give me my total score, the areas I did well in and where I need to improve."

Tip
Be sure to change the number of simulations ChatGPT provides to suit your needs.

Now, ChatGPT will generate interactive email phishing scenarios based on your instructions. Respond to each scenario as if you were the employee undergoing the training. After the third scenario and your final response, ChatGPT will calculate and provide your total score, areas of strength, and areas for improvement.

How it works…
At the heart of this recipe lies the specialized prompt. This prompt is constructed to instruct ChatGPT to act as an interactive phishing training tool, delivering a series of email phishing scenarios. The prompt follows certain design principles which are essential to its effectiveness and interaction with the OpenAI models. Here, we'll dissect those principles:

1. Defining the role: The prompt starts by setting up the role of the AI model – a cybersecurity professional and expert in adversarial social engineering tactics, techniques, and procedures, with 25 years of experience. By defining the AI's persona, we direct the model to generate responses using the knowledge and expertise expected from such a role.

2. Detailed instructions and simulation: The instructions given in the prompt are meticulously detailed, and it is this precision that enables ChatGPT to create effective and realistic phishing simulations. The prompt asks the AI model to generate a phishing email scenario, followed by the question, "Is the following email real or a phishing attempt?". Notably, the AI model is given the liberty to field additional clarifying questions, such as requests for URL information, header information, etc., giving it the freedom to generate more nuanced and complex scenarios. By asking the model to generate these emails using markdown language formatting, we ensure that the simulated emails have the structure and appearance of genuine emails, enhancing the realism of the simulation. The AI is also instructed to present emails that can represent either legitimate correspondence or a phishing attempt, ensuring a diverse range of scenarios for the user to evaluate.

How can ChatGPT convincingly simulate phishing emails? ChatGPT's strength comes from the wide variety of text it has been trained on, including (but not limited to) countless examples of email correspondence and probably some instances of phishing attempts or discussions around them.
From this extensive training, the model has developed a robust understanding of the format, tone, and common phrases used in both legitimate and phishing emails. So, when prompted to simulate a phishing email, it can draw on this knowledge to generate a believable email that mirrors the features of a real-world phishing attempt. Because the model doesn't generate responses until it receives an answer to its question, it guarantees an interactive user experience. Based on the user's response, the model provides relevant feedback ("Correct" or "Incorrect"), the correct answer if the user was wrong, and a brief explanation. This detailed, immediate feedback aids the learning process and helps embed the knowledge gained from each simulated scenario.

It's worth noting that, although the model has been trained to generate human-like text, it doesn't understand the content in the same way humans do. It doesn't have beliefs, opinions, or access to real-time, world-specific information or personal data unless explicitly provided in the conversation. Its responses are merely predictions based on its training data. The carefully designed prompt and structure are what guide the model to generate useful, contextually appropriate content for this particular task.

3. Feedback mechanism: The prompt instructs the AI to provide feedback based on the user's answer, further explaining the answer. This creates an iterative feedback loop that enhances the learning experience.

4. Keeping track of progress: The prompt instructs the AI to present three simulations in total and to remember the user's answers to all of them. This ensures continuity in the training and enables tracking of the user's progress.

5. Scoring and areas of improvement: After the final simulation and response, the prompt instructs the AI to end the assessment and provide a total score along with areas of strength and areas for improvement. This helps the user understand their proficiency and the areas they need to focus on for improvement.

ChatGPT's models are trained on a broad range of internet text. However, it's essential to note that the model does not know specifics about which documents were part of its training set or have access to any private, confidential, or proprietary information. It generates responses to prompts by recognizing patterns and producing text that statistically aligns with the patterns it observed in its training data. By structuring our prompt in a way that clearly defines the interactive assessment context and expected behavior, we're able to leverage this pattern recognition to create a highly specialized interactive tool. The ability of the OpenAI models to handle such a complex and interactive use case demonstrates their powerful capability and flexibility.

There's more…
If you're using a Learning Management System (LMS) or conducting a live class, you might prefer to have a list of scenarios and details rather than an interactive method like ChatGPT. In these settings, it's often more practical to provide learners with specific scenarios to ponder and discuss in a group setting. The list can also be used for assessments or training materials, offering a static reference point that learners can revisit as needed, or as content for a phishing simulation system. By modifying the script from the previous recipe, you can instruct the ChatGPT model to produce a set of phishing email simulations complete with all necessary details.
The resulting text can be saved into a file for easy distribution and usage in your training environment. Since this script is so similar to the one from the previous recipe, we'll just cover the modifications instead of stepping through the entire script again. Let's walk through the necessary modifications:

1. Rename and modify the function: The function generate_question is renamed to generate_email_simulations, and its argument list and body are updated to reflect its new purpose. It will now generate the phishing email simulations instead of cybersecurity awareness questions. This is done by updating the messages that are passed to the OpenAI API within this function:

def generate_email_simulations() -> str:
    # Define the conversation messages
    messages = [
        {"role": "system", "content": 'You are a cybersecurity professional and expert in adversarial social engineering tactics, techniques, and procedures, with 25 years of experience.'},
        {"role": "user", "content": 'Create a list of fictitious emails for an interactive email phishing training. The emails can represent a legitimate email or a phishing attempt, using one or more various techniques. After each email, provide the answer, contextual descriptions, and details for any other relevant information such as the URL for any links in the email, header information. Generate all necessary information in the email and supporting details. Present 3 simulations in total. At least one of the simulations should simulate a real email.'},
    ]
    ...

Note
You can adjust the number of scenarios here to suit your needs. In this example, we've requested 3 scenarios.

2. Remove unnecessary code: The script no longer reads content categories from an input file, as that's not required in this use case.

3. Update variable and function names: All variable and function names referring to "questions" or "assessment" have been renamed to refer to "email simulations" instead, to make the script more understandable in the context of its new purpose.

4. Call the appropriate function: The generate_email_simulations function is called instead of the generate_question function. This function initiates the process of generating the email simulations:

# Generate the email simulations
email_simulations = generate_email_simulations()

Tip
Like the previous method, more scenarios will require a model that supports a larger context window. However, the gpt-4 model seems to provide better results in terms of accuracy, depth, and consistency with its generations for this recipe.

Here's how the complete script should look:

import openai
import os
import threading
import time
from datetime import datetime

# Set up the OpenAI API
openai.api_key = os.getenv("OPENAI_API_KEY")

current_datetime = datetime.now().strftime('%Y-%m-%d_%H-%M-%S')
assessment_name = f"Email_Simulations_{current_datetime}.txt"

def generate_email_simulations() -> str:
    # Define the conversation messages
    messages = [
        {"role": "system", "content": 'You are a cybersecurity professional and expert in adversarial social engineering tactics, techniques, and procedures, with 25 years of experience.'},
        {"role": "user", "content": 'Create a list of fictitious emails for an interactive email phishing training. The emails can represent a legitimate email or a phishing attempt, using one or more various techniques. After each email, provide the answer, contextual descriptions, and details for any other relevant information such as the URL for any links in the email, header information. Generate all necessary information in the email and supporting details. Present 3 simulations in total. At least one of the simulations should simulate a real email.'},
    ]

    # Call the OpenAI API
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=messages,
        max_tokens=2048,
        n=1,
        stop=None,
        temperature=0.7,
    )

    # Return the generated text
    return response['choices'][0]['message']['content'].strip()

# Function to display elapsed time while waiting for the API call
def display_elapsed_time():
    start_time = time.time()
    while not api_call_completed:
        elapsed_time = time.time() - start_time
        print(f"\rElapsed time: {elapsed_time:.2f} seconds", end="")
        time.sleep(1)

api_call_completed = False
elapsed_time_thread = threading.Thread(target=display_elapsed_time)
elapsed_time_thread.start()

# Generate the report using the OpenAI API
try:
    # Generate the email simulations
    email_simulations = generate_email_simulations()
except Exception as e:
    print(f"\nAn error occurred during the API call: {e}")
    api_call_completed = True  # stop the elapsed-time thread before exiting
    exit()

api_call_completed = True
elapsed_time_thread.join()

# Save the email simulations into a text file
try:
    with open(assessment_name, 'w') as file:
        file.write(email_simulations)
    print("\nEmail simulations generated successfully!")
except Exception as e:
    print(f"\nAn error occurred during the email simulations generation: {e}")

By running this modified script, the ChatGPT model is directed to generate a series of interactive email phishing training scenarios. The script then collects the generated scenarios, checks them for errors, and writes them to a text file. This gives you a ready-made training resource that you can distribute to your learners or incorporate into your LMS or live training sessions.

Conclusion
In conclusion, leveraging ChatGPT for interactive email phishing training empowers users with immersive, realistic simulations, bolstering cybersecurity awareness and defense. This innovative approach fosters a proactive stance against threats, ensuring organizations stay ahead in the ever-evolving landscape of cyber risks. With adaptable learning and dynamic feedback, ChatGPT transforms education, creating a robust line of defense against potential security breaches.

Author Bio
Clint Bodungen is a cybersecurity professional with 25+ years of experience and the author of Hacking Exposed: Industrial Control Systems. He began his career in the United States Air Force and has since worked with many of the world's largest energy companies and organizations, working for notable cybersecurity companies such as Symantec, Kaspersky Lab, and Booz Allen Hamilton. He has published multiple articles, technical papers, and training courses on cybersecurity and aims to revolutionize cybersecurity education using computer gaming ("gamification") and AI technology. His flagship product, ThreatGEN® Red vs. Blue, is the world's first online multiplayer cybersecurity simulation game, designed to teach real-world cybersecurity.


Databricks Dolly for Future AI Adoption

Sagar Lad
29 Dec 2023
6 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

Introduction
Artificial intelligence is playing an increasingly crucial role in helping businesses and organizations process the huge volumes of data the world is producing. One of the biggest challenges in AI research is developing large language models capable of evaluating enormous amounts of text data. Databricks Dolly opens the door to more capable NLP models and marks a meaningful step forward for AI technology.

Databricks Dolly for AI
Before we deep dive into Databricks Dolly and its impact on the future of AI adoption, let's understand the basics of large language models and their current challenges.

Large Language Models & Databricks Dolly
A large language model is an artificial intelligence system used to produce human-like language and handle natural language processing tasks. These models are created using deep learning methods and are trained on large amounts of text input using a neural network design. Their major objective is to produce meaningful and coherent text from a given prompt or input. There are many uses for this, including speech recognition, chatbots, language translation, and more.

They have gained significant popularity because of the below capabilities:
- Text generation
- Language translation
- Classification and categorization
- Conversational AI

Recently, ChatGPT from OpenAI, Google Bard, and Bing have created unified models for training and fine-tuning such models at a large scale. The issue with these LLMs is that they save user data on external servers, opening the cloud to unauthorized users and increasing the risk of sensitive data being compromised. Additionally, they may provide irrelevant information that could potentially harm users and lead to poor judgments, as well as content that is offensive or discriminatory toward certain individuals.

To overcome this challenge, there is a need for open-source alternatives that promote the accuracy and security of large language models. After carefully examining user issues, the Databricks team built Databricks Dolly, an open-source chatbot that adheres to these criteria and performs exceptionally well in a variety of use situations.

Databricks Dolly can produce text by responding to questions, summarising ideas, and following other natural language commands. It is built on an open-source, 6-billion-parameter model from EleutherAI that has been modified using databricks-dolly-15k, a dataset of user-generated instructions. Due to Dolly's open-source nature and commercial licensing, anyone can use it to build interactive applications without having to pay for API access or divulge their data to outside parties. Dolly may be trained for less than $30, making construction costs low. Data can be saved in the DBFS root or another cloud object storage location that we specify when Dolly generates an answer. Using Dolly, we can design, construct, and personalize an LLM without sharing any data.

Image 1 - Databricks Dolly differentiators

Democratizing the magic of Databricks Dolly
With Databricks Dolly, we can manage the below types of engagements:
1. Open- and closed-ended question answering
2. Information parsing from the web
3. Detailed answers based on the input
4. Creative writing

Now, let's see in detail how we can use Databricks Dolly.

Step 1: Install required libraries
Use the below command in a Databricks notebook, or use cmd to install the required packages:

%pip install "accelerate>=0.16.0,<1" "transformers[torch]>=4.28.1,<5" "torch>=1.13.1,<2"

Image 2 - Databricks Dolly package installation

As you can see from the image, once we execute this command in Databricks, the required packages are installed:
- accelerate: accelerates the training of machine learning models
- transformers: a collection of pre-trained models for NLP activities
- torch: to build and train deep learning models

Step 2: Input to Databricks Dolly
Once the model is loaded, the next step is to generate text with the generate_text function.

Image 3 - Databricks Dolly - Create pipeline for remote code execution

Here, the pipeline function from the Transformers library is used to execute NLP tasks such as text generation, sentiment analysis, etc. The trust_remote_code option is used for remote code execution.

Step 3: Pipeline reference to parse the output

Image 4 - Databricks Dolly - Create a pipeline for remote code execution

Now, the final step is to provide the textual input to the model using the generate_text function, which will use the language model to generate the response.
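Since the code itself appears only in the screenshots above, here is a minimal sketch of what Steps 2 and 3 typically look like with the Transformers pipeline; the checkpoint name (databricks/dolly-v2-3b) and the example instruction are illustrative assumptions rather than the exact code shown in the images:

import torch
from transformers import pipeline

# Step 2: build the text-generation pipeline; trust_remote_code allows
# Dolly's custom instruction-following pipeline code to run.
generate_text = pipeline(
    model="databricks/dolly-v2-3b",  # assumed checkpoint for illustration
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)

# Step 3: provide the textual input and parse the generated response
result = generate_text("Explain what Databricks Dolly is in two sentences.")
print(result[0]["generated_text"])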
Best Practices of Using Databricks Dolly
- Be specific and lucid in your instructions to Dolly.
- Use Databricks Machine Learning models to train and deploy Dolly for scalable and faster execution.
- Use the Hugging Face library and repo, which has multiple tutorials and examples.

Conclusion
This article describes the difficulties that organizations have adopting large language models and how Databricks may overcome these difficulties by utilising Dolly. Dolly gives businesses the ability to create a customized LLM that meets their unique requirements and has the added benefit of having open-source source code. To maximize LLM performance, the article also highlights the significance of recommended practices.

Author Bio
Sagar Lad is a Cloud Data Solution Architect with a leading organization and has deep expertise in designing and building enterprise-grade intelligent Azure data and analytics solutions. He is a published author, content writer, Microsoft Certified Trainer, and C# Corner MVP.

Link - Medium, Amazon, LinkedIn


Creating Vulnerability Assessment Plans with ChatGPT

Clint Bodungen
26 Dec 2023
9 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

This article is an excerpt from the book ChatGPT for Cybersecurity Cookbook, by Clint Bodungen. Master ChatGPT and the OpenAI API, and harness the power of cutting-edge generative AI and large language models to revolutionize the way you perform penetration testing, threat detection, and risk assessment.

Introduction
In this recipe, you'll learn how to harness the power of ChatGPT and the OpenAI API to create comprehensive vulnerability assessment plans using network, system, and business details as input. This recipe is invaluable for both cybersecurity students and beginners looking to familiarize themselves with proper methods and tools for vulnerability assessments, as well as experienced cybersecurity professionals aiming to save time on planning and documentation.

Building upon the skills acquired in Chapter 1, you will delve deeper into establishing the system role of a cybersecurity professional specializing in vulnerability assessments. You'll learn how to craft effective prompts that generate well-formatted output using markdown language. This recipe also expands on the techniques explored in Enhancing Output with Templates (Application: Threat Report) and Formatting Output as a Table (Application: Security Controls Table) in Chapter 1, enabling you to design prompts that produce the desired output format. Finally, you'll discover how to use the OpenAI API and Python to generate the vulnerability assessment plan, and then export it as a Microsoft Word file. This recipe will serve as a practical guide for creating detailed and efficient vulnerability assessment plans using ChatGPT and the OpenAI API.

Getting ready
Before diving into the recipe, you should already have your OpenAI account set up and have obtained your API key. If not, revisit Chapter 1 for details. You will also need to be sure you have the following Python libraries installed:

1. python-docx: This library will be used to generate Microsoft Word files. You can install it using the command pip install python-docx.
2. tqdm: This library will be used to display progress bars. You can install it using the command pip install tqdm.

How to do it…
In this section, we will walk you through the process of using ChatGPT to create a comprehensive vulnerability assessment plan tailored to a specific network and organization's needs. By providing the necessary details and using the given system role and prompt, you will be able to generate a well-structured assessment plan.

1. Begin by logging in to your ChatGPT account and navigating to the ChatGPT web UI.

2. Start a new conversation with ChatGPT by clicking the "New chat" button.

3. Enter the following prompt to establish a system role: You are a cybersecurity professional specializing in vulnerability assessment.

4. Enter the following message text, but replace the placeholders in the "{ }" brackets with the appropriate data of your choice. You can either combine this prompt with the system role or enter it separately as follows:

Using cybersecurity industry standards and best practices, create a complete and detailed assessment plan (not a penetration test) that includes: Introduction, outline of the process/methodology, tools needed, and a very detailed multi-layered outline of the steps.
Provide a thorough and descriptive introduction and as much detail and description as possible throughout the plan. The plan should not only assessment of technical vulnerabilities on systems but also policies, procedures, and compliance. It should include the use of scanning tools as well as configuration review, staff interviews, and site walk-around. All recommendations should following industry standard best practices and methods. The plan should be a minimum of 1500 words. Create the plan so that it is specific for the following details: Network Size: {Large} Number of Nodes: {1000} Type of Devices: {Desktops, Laptops, Printers, Routers} Specific systems or devices that need to be excluded from the assessment: {None} Operating Systems: {Windows 10, MacOS, Linux} Network Topology: {Star} Access Controls: {Role-based access control} Previous Security Incidents: {3 incidents in the last year} Compliance Requirements: {HIPAA} Business Critical Assets: {Financial data, Personal health information} Data Classification: {Highly confidential} Goals and objectives of the vulnerability assessment: {To identify and prioritize potential vulnerabilities in the network and provide recommendations for remediation and risk mitigation.} Timeline for the vulnerability assessment: {4 weeks{ Team: {3 cybersecurity professionals, including a vulnerability assessment lead and two security analysts} Expected deliverables of the assessment: {A detailed report outlining the results of the vulnerability assessment, including identified vulnerabilities, their criticality, potential impact on the network, and recommendations for remediation and risk mitigation.} Audience: {The organization's IT department, senior management, and any external auditors or regulators.} Provide the plan using the following format and markdown language: #Vulnerability Assessment Plan ##Introduction Thorough Introduction to the plan including the scope, reasons for doing it, goals and objectives, and summary of the plan ##Process/Methodology Description and Outline of the process/Methodology ##Tools Required List of required tools and applications, with their descriptions and reasons needed ##Assessment Steps Detailed, multi-layered outline of the assessment steps Hint If you are performing this in the OpenAI Playground, it is advisable to use Chat mode and enter the role in the System window, and the prompt in the User message window. Figure 2.1 shows the system role and user prompt entered into the OpenAI Playground.  Figure 2.1 – OpenAI Playground Method 5. Review the generated output from ChatGPT. If the output is satisfactory and meets the requirements, you can proceed to the next step. If not, you can either refine your prompt or re-run the conversation to generate a new output. 6. Once you have obtained the desired output, you can use the generated markdown to create a well-structured vulnerability assessment plan in your preferred text editor or markdown viewer.  Figure 2.2 shows an example of ChatGPT generation of a vulnerability assessment plan using markdown language formatting.  Figure 2.2 – Example ChatGPT Assessment Plan Output How it works… This GPT-assisted vulnerability assessment plan recipe leverages the sophistication of natural language processing (NLP) and machine learning algorithms to generate a comprehensive and detailed vulnerability assessment plan. 
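The steps above use the web UI; the book's later steps drive the same prompt through the OpenAI API and export the result to Word. As a rough illustration of that workflow (not the book's exact code), here is a minimal sketch assuming the pre-1.0 openai Python package along with the python-docx and tqdm libraries from the Getting Ready section; the model name, API key placeholder, and markdown-to-Word handling are simplified assumptions:

import openai
from docx import Document
from tqdm import tqdm

openai.api_key = "YOUR_API_KEY"  # better: read this from an environment variable

system_role = "You are a cybersecurity professional specializing in vulnerability assessment."
user_prompt = "Using cybersecurity industry standards and best practices, create a complete and detailed assessment plan ..."  # full prompt from step 4

# Generate the plan as markdown text
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": system_role},
        {"role": "user", "content": user_prompt},
    ],
)
plan_markdown = response["choices"][0]["message"]["content"]

# Map markdown '#' headings to Word heading levels and write each line;
# tqdm displays a progress bar while the document is assembled
doc = Document()
for line in tqdm(plan_markdown.splitlines(), desc="Writing DOCX"):
    if line.startswith("#"):
        level = min(len(line) - len(line.lstrip("#")), 4)
        doc.add_heading(line.lstrip("# ").strip(), level=level)
    elif line.strip():
        doc.add_paragraph(line)
doc.save("vulnerability_assessment_plan.docx")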
How it works…

This GPT-assisted vulnerability assessment plan recipe leverages the sophistication of natural language processing (NLP) and machine learning algorithms to generate a comprehensive and detailed vulnerability assessment plan. By adopting a specific system role and an elaborate user request as a prompt, ChatGPT is able to customize its response to meet the requirements of a seasoned cybersecurity professional who is tasked with assessing an extensive network system. Here's a closer look at how this process works:

1. System Role and Detailed Prompt: The system role designates ChatGPT as a seasoned cybersecurity professional specializing in vulnerability assessment. The prompt, which serves as the user request, is detailed and outlines the specifics of the assessment plan, from the size of the network and types of devices to the required compliance and the expected deliverables. These inputs provide context and guide ChatGPT's response, ensuring it is tailored to the complexities and requirements of the vulnerability assessment task.
2. Natural Language Processing and Machine Learning: NLP and machine learning form the bedrock of ChatGPT's capabilities. It applies these technologies to understand the intricacies of the user request, learn from the patterns, and generate a well-structured vulnerability assessment plan that is detailed, specific, and actionable.
3. Knowledge and Language Understanding Capabilities: ChatGPT uses its extensive knowledge base and language understanding capabilities to conform to industry-standard methodologies and best practices. This is particularly important in the rapidly evolving field of cybersecurity, ensuring that the resulting vulnerability assessment plan is up to date and adheres to recognized standards.
4. Markdown Language Output: The use of markdown language output ensures that the plan is formatted in a consistent and easy-to-read manner. This format can be easily integrated into reports, presentations, and other formal documents, which is crucial when communicating the plan to IT departments, senior management, and external auditors or regulators.
5. Streamlining the Assessment Planning Process: The overall advantage of using this GPT-assisted vulnerability assessment plan recipe is that it streamlines the process of creating a comprehensive vulnerability assessment plan. You save time on planning and documentation and can generate a professional-grade assessment plan that aligns with industry standards and is tailored to the specific needs of your organization.

By applying these detailed inputs, you transform ChatGPT into a powerful tool that can assist in creating a comprehensive, tailored vulnerability assessment plan. This not only bolsters your cybersecurity efforts but also ensures your resources are utilized effectively in protecting your network systems.

Conclusion

In harnessing ChatGPT and the OpenAI API, this guide unlocks a streamlined approach to crafting detailed vulnerability assessment plans. Whether you are a novice or a seasoned cybersecurity professional, leveraging these tools optimizes planning and documentation. By tailoring assessments to specific network intricacies, it fosters precision in identifying potential threats and fortifying defenses. Embrace this method not only to save time but also to ensure comprehensive security measures that align with industry standards, safeguarding networks effectively.

Author Bio

Clint Bodungen is a cybersecurity professional with 25+ years of experience and the author of Hacking Exposed: Industrial Control Systems.
He began his career in the United States Air Force and has since worked with many of the world's largest energy companies and organizations, including notable cybersecurity firms such as Symantec, Kaspersky Lab, and Booz Allen Hamilton. He has published multiple articles, technical papers, and training courses on cybersecurity and aims to revolutionize cybersecurity education using computer gaming ("gamification") and AI technology. His flagship product, ThreatGEN® Red vs. Blue, is the world's first online multiplayer cybersecurity simulation game, designed to teach real-world cybersecurity.

Improving AI Context with RAG Using Azure Machine Learning prompt flow

Ryan Goodman
18 Dec 2023
11 min read
Introduction

Retrieval Augmented Generation, or RAG, is a method that can expand the breadth or depth of information for Large Language Models (LLMs). Retrieving and delivering more data to an LLM will result in applications with more contextual, relevant information. Unlike traditional web or mobile applications designed to retrieve structured data, RAG requires data to be structured, indexed, and stored differently, most commonly in a vector database. The resulting experience should provide more contextual information with the added ability to cite the source of information. This narrower scope of information can result in a higher degree of accuracy and utility for your enterprise. To summarize RAG:

Retrieval: When a question is posed to an LLM-powered chatbot, RAG reviews the index to find relevant facts. This is like searching through an index where all the information is neatly summarized for quick access.
Augmentation: It then takes these facts and feeds them to the language model, essentially giving it a brief on the subject matter at hand.
Generation: With this briefing, the language model is now ready to craft a response that's not just based on what it already knows but also on the latest information it has just pulled in.

In short, RAG keeps language models up to date and relevant, providing answers that are informed by the latest available data. It's a practical way to ensure AI remains accurate and useful for your organization, especially when dealing with current and evolving topics that may not be public knowledge. In this article, we will explore the differences between RAG and fine-tuning and how you can organize your RAG solution using Azure Machine Learning prompt flow.

Retrieval Augmented Generation (RAG) vs Fine-Tuning

When working with large language models through chat interfaces like ChatGPT, you will see that their foundational knowledge is point-in-time data. Fine-tuning, on the other hand, is akin to customizing the model with a new layer of knowledge that reflects your specific data, which becomes part of the model's intelligence. As of this article, OpenAI has released GPT-4 Turbo, which is based on a dataset through April 2023. Extending an LLM's body of knowledge can involve fine-tuning or RAG.

Fine-Tuning Foundational Models

Fine-tuning involves training a foundational model on a dataset specific to your application, effectively customizing the model to perform better for certain tasks or styles. To fine-tune a model, you need to have a dataset that represents the task or style you are aiming for and the computational resources to perform the training. Once fine-tuned, the model's knowledge is enhanced, and these changes are permanent unless the model is fine-tuned again with additional data. Fine-tuning is ideal for tasks needing deep customization where the information may be specialized but requires re-training infrequently. While OpenAI has started to offer fine-tuning for certain models like GPT-3.5, not all foundational models or versions can be fine-tuned due to access restrictions on their parameters and training regimes.

Retrieval Augmented Generation

RAG is like adding a live feed of information to the foundational model, enabling it to respond with the latest data without modifying the model itself. It involves augmenting a foundational language model's response by dynamically integrating information retrieved from an external database, typically a vector database, at the time of the query.
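To make the retrieve-augment-generate loop concrete, here is a minimal, framework-agnostic sketch; the in-memory document list, the keyword-overlap scoring, and the generate stub are illustrative stand-ins, not part of any Azure SDK (a real implementation would use a vector index and an actual LLM endpoint, as described below):

# Minimal RAG loop sketch (illustrative only)
DOCUMENTS = [
    "Our refund policy changed in November 2023: refunds are issued within 30 days.",
    "The support team operates weekdays from 9am to 5pm Eastern.",
]

def retrieve(query, k=1):
    # Naive keyword-overlap scoring; production systems use embeddings and a vector database
    overlap = lambda doc: len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(DOCUMENTS, key=overlap, reverse=True)[:k]

def augment(query, facts):
    # Pack the retrieved facts into the prompt so the model answers from them
    context = "\n".join("- " + fact for fact in facts)
    return "Answer using only these facts:\n" + context + "\n\nQuestion: " + query

def generate(prompt):
    # Stand-in for a call to an LLM endpoint (e.g., an Azure OpenAI deployment)
    return "[LLM response to a prompt of {} characters]".format(len(prompt))

question = "What is the refund policy?"
print(generate(augment(question, retrieve(question))))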
No model training required: The foundational model's core parameters remain unchanged. Instead, RAG serves as a real-time data layer that the model queries to inform its responses.
Real-time and up to date: Because RAG queries external data sources in real time, it ensures that the language model's responses are enhanced by the most current and relevant information available.

Image Credit: https://medium.com/@minh.hoque/retrieval-augmented-generation-grounding-ai-responses-in-factual-data-b7855c059322

RAG is widely adopted as the best starting point due to the following factors:

Data Dynamics: Choose RAG for frequently changing data and fine-tuning for static, specialized domains. As with any data wrangling and model training problem, the results are only as good as your data quality.
Resource Availability: With RAG, you do not need the expansive computational resources and budget that fine-tuning requires. You will still need skilled resources to implement and test RAG.
Flexibility and Scalability: RAG offers adaptability to continuously add current information and ease of maintenance.

Approaching RAG with Azure Machine Learning Prompt Flow

With a solid foundation of RAG vs fine-tuning, we will dive into the details of an approach to Retrieval Augmented Generation within Azure. Azure provides multiple solutions for creating and accessing vector indexes per Microsoft's latest documentation. Azure currently offers three methods:

Azure AI Studio: use a vector index and retrieval augmentation.
Azure OpenAI Studio: use a search index with or without vectors.
Azure Machine Learning: use a search index as a vector store in a prompt flow.

Azure's approach to RAG lets you tailor the model to your business needs and integrate it into private or public-facing applications. What remains consistent is the ability to prepare and feed your data into the LLM of your choice. Within Azure Machine Learning prompt flow, Microsoft includes a number of practical features, including a fact-checking layer alongside the existing model to ensure accuracy. Additionally, you can feed supplementary data directly to your large language models as prompts, enriching their responses with up-to-date and relevant information. Azure Machine Learning simplifies the process of augmenting your AI-powered app with the latest data without the time and financial burdens often associated with comprehensive model retraining. A benefit of using these services is the scalability, security, and compliance functions that are native to Azure. A standard feature of Azure Machine Learning for ML models or LLMs is a point-and-click flow or notebook code interface to build your AI pipelines.

1. Data Acquisition and Preparation with Azure Services for Immediate LLM Access

Azure Blob Storage is perfect for staging your data. These files can be anything from text files to PDFs.

2. Vectorization and Indexing of Your Data Using AI Studio and Azure AI Search

This step can be completed using one of multiple approaches, both open source and Azure-native. Azure AI Studio significantly simplifies the creation and integration of a vector index for Retrieval-Augmented Generation (RAG) applications.
Here are the main steps in the process:

Initialization: Users start by selecting their data sources in Azure AI Studio, choosing from blob storage for easier testing and local file uploads.
Index Creation: The platform guides users through configuring search settings and choosing an index storage location, with a focus on ease of use and minimal need for manual coding.

This is one of many examples of how Azure AI Studio is democratizing the use of advanced RAG applications by merging and integrating different services in the Azure cloud.

SOURCE: Microsoft

3. Constructing RAG Pipelines with Azure Machine Learning

Simplified RAG pipeline creation: With your index created, you can integrate it along with AI Search as a plug-and-play component in your prompt flow. With the no/low-code interface, you can drag and drop components to create your RAG pipeline.

Image Source: Microsoft.com

Customization with Jupyter Notebooks: For those who are comfortable coding in Jupyter notebooks, Azure ML offers the flexibility to use Jupyter notebooks natively. This provides more control over the RAG pipeline to fit your project's unique needs. Additionally, you can construct alternative flows using libraries like LangChain instead of the Azure services.

4. Manage AI Pipeline Operations

Azure Machine Learning provides a foundation designed for iterative and continuous updates. The full lifecycle for model deployment includes test data generation and prompt evaluation. ML and AI operations are needed to understand certain adjustments. For organizations already running Azure ML, prompt flow fits nicely into broader machine learning operations.

Integrating the RAG workflow into an MLOps pipeline through code:

from azureml.core import Experiment
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import PythonScriptStep

# Create an Azure Machine Learning experiment
experiment_name = 'rag_experiment'
experiment = Experiment(ws, experiment_name)

# Define a PythonScriptStep for RAG workflow integration
rag_step = PythonScriptStep(name='RAG Step',
                            script_name='rag_workflow.py',
                            compute_target='your_compute_target',
                            source_directory='your_source_directory',
                            inputs=[rag_dataset.as_named_input('rag_data')],
                            outputs=[],
                            arguments=['--input_data', rag_dataset],
                            allow_reuse=True)

# Create an Azure Machine Learning pipeline with the RAG step
rag_pipeline = Pipeline(workspace=ws, steps=[rag_step])

# Run the pipeline as an experiment
pipeline_run = experiment.submit(rag_pipeline)
pipeline_run.wait_for_completion(show_output=True)
Here are the code snippets to create and manage the dataset using Azure:

from azureml.core import Dataset

# Assuming you have a dataset named 'rag_dataset' in your Azure Machine Learning workspace
rag_dataset = Dataset.get_by_name(ws, 'rag_dataset')

# Split the dataset into training and testing sets
train_data, test_data = rag_dataset.random_split(percentage=0.8, seed=42)

# Convert the datasets to pandas DataFrames for easy manipulation
train_df = train_data.to_pandas_dataframe()
test_df = test_data.to_pandas_dataframe()

Conclusion

It is important to note that the world of AI and LLMs is evolving at a rapid pace, where months make a difference. Azure Machine Learning for Retrieval Augmented Generation offers a transformative approach to leveraging Large Language Models and provides a compelling solution for enterprises that already have a competency center. Azure ML's pipelines for data ingestion, along with its robust training, management, and deployment capabilities for RAG, are lowering the barrier to dynamic data integration with LLMs like OpenAI's. As adoption continues to grow, we will see many exciting new use cases and success stories coming from organizations that adopt early and iterate fast. The benefit of Microsoft Azure is a single, managed, and supported suite of services, some of which may already be deployed within your organization. Azure's services continue to evolve to support new AI adoption demands, Retrieval Augmented Generation included!

Author Bio

Ryan Goodman has dedicated 20 years to the business of data and analytics, working as a practitioner, executive, and entrepreneur. He recently founded DataTools Pro after 4 years at Reliant Funding, where he served as the VP of Analytics and BI. There, he implemented a modern data stack, utilized data sciences, integrated cloud analytics, and established a governance structure. Drawing from his experiences as a customer, Ryan is now collaborating with his team to develop rapid deployment industry solutions. These solutions utilize machine learning, LLMs, and modern data platforms to significantly reduce the time to value for data and analytics teams.

AI_Distilled #30: Gen AI - Reshaping Industries, Redefining Possibilities

Merlyn Shelley
15 Dec 2023
12 min read
👋 Hello ,

“The speed at which people will be able to come up with an idea, to test the idea, to make something, it’s going to be so accelerated…” - wait, this edition opens with: “Once in a while, technology comes along that is so powerful and so broadly applicable that it accelerates the normal march of economic progress. And like a lot of economists, I believe that generative AI belongs in that category.” - Andrew McAfee, Principal Research Scientist, MIT Sloan School of Management

This vividly showcases the kaleidoscope of possibilities Gen AI unlocks as it emerges from its cocoon, orchestrating a transformative symphony across realms from medical science to office productivity. Take Google's newly released AlphaCode 2, for example, which achieves human-level proficiency in programming, or Meta's Audiobox, which pioneers next-generation audio production.

Welcome to AI_Distilled #30, your ultimate guide to the latest advancements in AI, ML, NLP, and Gen AI. This week's highlights include:

📚 Unlocking the Secrets of Geospatial Data: Dive into Bonny P. McClain's new book, "Geospatial Analysis with SQL," and master the art of manipulating data across diverse geographical landscapes. Learn foundational concepts and explore advanced spatial algorithms for a transformative journey.

🌍 Let's shift our focus to the most recent updates and advancements in the AI industry:

Microsoft Forms Historic Alliance with Labor Unions to Address AI Impact on Workers
Meta's Audiobox Advances Unified Audio Generation with Enhanced Controllability
Europe Secures Deal on World's First Comprehensive AI Rules
Google DeepMind Launches AlphaCode 2: Advancing AI in Competitive Programming Collaboration
Stable LM Releases Zephyr 3B: Compact and Powerful Language Model for Edge Devices
Meta Announces Purple Llama: Advancing Open Trust and Safety in Generative AI
Google Cloud Unveils Cloud TPU v5p and AI Hypercomputer for Next-Gen AI Workloads
Elon Musk's xAI Chatbot Launches on X

We've also got your fresh dose of GPT and LLM secret knowledge and tutorials:

A Primer on Enhancing Output Accuracy Using Multiple LLMs
Unlocking the Potential of Prompting: Steering Frontier Models to Record-Breaking Performance
Navigating Responsible AI: A Comprehensive Guide to Impact Assessment
Enhancing RAG-Based Chatbots: A Guide to RAG Fusion Implementation
Evaluating Retrieval-Augmented Generation (RAG) Applications with RAGAs Framework

Last but not least, don't miss out on the hands-on strategies and tips straight from the AI community for you to use on your own projects:

Creating a Vision Chatbot: A Guide to LLaVA-1.5, Transformers, and Runhouse
Fine-Tuning LLMs: A Comprehensive Guide
Building a Web Interface for LLM Interaction with Amazon SageMaker JumpStart
Mitigating Hallucinations with Retrieval Augmented Generation

What's more, we've also shortlisted the best GitHub repositories you should consider for inspiration:

bricks-cloud/BricksLLM
kwaikeg/kwaiagents
facebookresearch/Pearl
andvg3/LSDM

Stay curious and gear up for an intellectually enriching experience!

📥 Feedback on the Weekly Edition
Q: How can we foster effective collaboration between humans and AI systems, ensuring that AI complements human skills and enhances productivity without causing job displacement or widening societal gaps?
Share your valued opinions discreetly! Your insights could shine in our next issue for the 39K-strong AI community. Join the conversation!
🗨️✨ As a big thanks, get our bestselling "Interactive Data Visualization with Python - Second Edition" in PDF. Let's make AI_Distilled even more awesome! 🚀 Jump on in! Share your thoughts and opinions here!

Writer's Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week's newsletter content!

A quick heads-up: Our team is taking a well-deserved holiday break to recharge and return with fresh ideas. So, there'll be a pause in our weekly updates for the next two weeks. We're excited to reconnect with you in the new year, brimming with new insights and creativity. Wishing you a fantastic holiday season! See you in 2024!

Cheers,
Merlyn Shelley
Editor-in-Chief, Packt

SignUp | Advertise | Archives

⚡ TechWave: AI/GPT News & Analysis

⭐ Microsoft Forms Historic Alliance with Labor Unions to Address AI Impact on Workers: Microsoft is partnering with the American Federation of Labor and Congress of Industrial Organizations, a coalition of 60 labor unions representing 12.5 million workers. They plan to discuss AI's impact on jobs, offer AI training to workers, and encourage unionization with "neutrality" terms. The goal is to improve worker collaboration, influence AI development, and shape policies for frontline workers' tech skills.

⭐ Meta's Audiobox Advances Unified Audio Generation with Enhanced Controllability: Meta researchers have unveiled Audiobox, an advanced audio generation model addressing limitations in existing models. It prioritizes controllability, enabling unique styles via text descriptions and precise management of audio elements. Audiobox excels in speech and sound generation, achieving impressive benchmarks like 0.745 similarity on Librispeech for text-to-speech and 0.77 FAD on AudioCaps for text-to-sound using description and example-based prompts.

⭐ Europe Secures Deal on World's First Comprehensive AI Rules: EU negotiators have achieved a historic agreement on the first-ever comprehensive AI rules, known as the Artificial Intelligence Act. It addresses key issues, such as generative AI and facial recognition by law enforcement, aiming to establish clear regulations for AI while facing criticism for potential exemptions and loopholes.

⭐ Google DeepMind Launches AlphaCode 2: Advancing AI in Competitive Programming Collaboration: Google DeepMind has unveiled AlphaCode 2, a successor to its groundbreaking AI that writes code at a human level. It outperforms 85% of participants in 12 recent Codeforces contests, aiming to collaborate effectively with human coders and promote AI-human collaboration in programming, aiding problem-solving and suggesting code designs.

⭐ Stable LM Releases Zephyr 3B: Compact and Powerful Language Model for Edge Devices: Stable LM Zephyr 3B is a 3 billion parameter lightweight language model optimized for edge devices. It excels in text generation, especially instruction following and Q&A, surpassing larger models in linguistic accuracy. It's ideal for copywriting, summarization, and content personalization on resource-constrained devices, with a non-commercial license.

⭐ Meta Announces Purple Llama: Advancing Open Trust and Safety in Generative AI: Purple Llama is an initiative promoting trust and safety in generative AI. It provides tools like CyberSec Eval for cybersecurity benchmarking and Llama Guard for input/output filtering. Components are permissively licensed to encourage collaboration and standardization in AI safety tools.
⭐ Google Cloud Unveils Cloud TPU v5p and AI Hypercomputer for Next-Gen AI Workloads: Google Cloud has launched the powerful Cloud TPU v5p AI accelerator, addressing the needs of large generative AI models with 2X more FLOPS and 3X HBM. It trains models 2.8X faster than TPU v4 and is 4X more scalable. Google also introduced the AI Hypercomputer, an efficient supercomputer architecture for AI workloads, aiming to boost innovation in AI for enterprises and developers.

⭐ Elon Musk's xAI Chatbot Launches on X: Grok, created by xAI, debuts on X (formerly Twitter) for $16/month to Premium Plus subscribers. It offers conversational answers, similar to ChatGPT and Google's Bard. Grok-1 incorporates real-time X data, providing up-to-the-minute information. Elon Musk praises Grok's rebellious personality, though its intelligence remains comparable to other chatbots. Currently text-only, xAI intends to expand Grok's capabilities to include video, audio, and more.

🔮 Expert Insights from Packt Community

Geospatial Analysis with SQL - By Bonny P McClain

Embark on a captivating journey into geospatial analysis, a field beyond geography enthusiasts! This book reveals how combining geospatial magic with SQL can tackle real-world challenges. Learn to create spatial databases, use SQL queries, and incorporate PostGIS and QGIS into your toolkit.

Key Concepts:

🌍 Foundations:
   - Understand the importance of geospatial analysis.
   - See how location info enhances data exploration.

🗺️ Tobler's Wisdom:
   - Embrace Walter Tobler's second law of geography.
   - Explore how external factors impact the area of interest.

🔍 SQL Spatial Data Science:
   - Master geospatial analysis with SQL.
   - Build databases, write queries, and use handy functions.

🛠️ Toolbox Upgrade:
   - Boost skills with PostGIS and QGIS.
   - Handle data questions and excel in spatial analysis.

Decode geospatial secrets - perfect for analysts and devs seeking location-based insights! Read through Chapter 1, unlocked here...

🌟 Secret Knowledge: AI/LLM Resources

⭐ A Primer on Enhancing Output Accuracy Using Multiple LLMs: Explore using chain-of-thought prompts with LLMs like GPT-4 and PaLM2 for varied responses. Learn the "majority-vote/quorum" technique to enhance accuracy by combining responses from different LLMs using AIConfig for streamlined coordination, improving output reliability and minimizing errors.

⭐ Unlocking the Potential of Prompting: Steering Frontier Models to Record-Breaking Performance: The authors explore innovative prompting techniques to improve the performance of GPT-4 and similar models, introducing "Medprompt" and related methods. They achieve a 90.10% accuracy on the MMLU challenge with "Medprompt+," sharing code on GitHub for replication and LLM optimization.

⭐ Navigating Responsible AI: A Comprehensive Guide to Impact Assessment: This article introduces the RAI impact assessment, emphasizing aligning AI with responsible principles. It mentions Microsoft's tools like the Responsible AI Standard, v2, RAI Impact Assessment Template, and Guide. The approach involves identifying use cases, stakeholders, harms, and risk mitigation. It suggests adapting RAI to organizational needs and phased alignment with product releases.

⭐ Enhancing RAG-Based Chatbots: A Guide to RAG Fusion Implementation: In the fourth installment of this tutorial series, the focus is on implementing RAG Fusion, a technique to improve Retrieval-Augmented Generation (RAG) applications.
It involves converting user queries into multiple questions, searching for content in a knowledge base, and re-ranking results. The tutorial aims to enhance semantic search in RAG applications.

⭐ Evaluating Retrieval-Augmented Generation (RAG) Applications with RAGAs Framework: The article discusses challenges in making a production-ready RAG application, highlighting the need to assess retriever and generator components separately and together. It introduces the RAGAs framework for reference-free evaluation using LLMs, offering metrics for component-level assessment. The article provides a guide to using RAGAs for evaluation, including prerequisites, setup, data preparation, and conducting assessments.

🔛 Masterclass: AI/LLM Tutorials

⭐ Creating a Vision Chatbot: A Guide to LLaVA-1.5, Transformers, and Runhouse: Discover how to build a multimodal conversational model using LLaVA-1.5, Hugging Face Transformers, and Runhouse. The post introduces the significance of multimodal conversational models, blending language and visual elements. It emphasizes the limitations of closed-source models, showcasing open-source alternatives. The tutorial includes Python code available on GitHub for deploying a vision chat assistant, providing a step-by-step guide. LLaVA-1.5, with its innovative visual embeddings, is explained, highlighting its lightweight training and impressive performance. The tutorial's implementation code, building a vision chatbot, is made accessible through standardized chat templates, and the Runhouse platform simplifies deployment on various infrastructures.

⭐ Fine-Tuning LLMs: A Comprehensive Guide: Explore the potential of fine-tuning OpenAI's LLMs to revolutionize tasks such as customer support chatbots and financial data analysis. Learn how fine-tuning enhances LLM performance on specific datasets and discover use cases in customer support and finance. The guide walks you through the step-by-step process of fine-tuning, from preparing a training dataset to creating and using a fine-tuned model. Experience how fine-tuned LLMs, exemplified by GPT-3.5 Turbo, can transform natural language processing, opening new possibilities for diverse industries and applications.

⭐ Building a Web Interface for LLM Interaction with Amazon SageMaker JumpStart: Embark on a comprehensive guide to creating a web user interface, named Chat Studio, enabling seamless interaction with LLMs like Llama 2 and Stable Diffusion through Amazon SageMaker JumpStart. Learn how to deploy SageMaker foundation models, set up AWS Lambda and IAM permissions, and run the user interface locally. Explore optional extensions to incorporate additional foundation models and deploy the application using AWS Amplify. This step-by-step tutorial covers prerequisites, deployment, and solution architecture, and offers insights into the potential of LLMs, providing a hands-on approach for users to enhance conversational experiences and experiment with diverse pre-trained LLMs on AWS.

⭐ Mitigating Hallucinations with Retrieval Augmented Generation: Delve into a step-by-step guide exploring the deployment of LLMs, specifically Llama-2 from Amazon SageMaker JumpStart. Learn the crucial technique of RAG using the Pinecone vector database to counteract AI hallucinations. The primer introduces source knowledge incorporation through RAG, detailing how to set up Amazon SageMaker Studio for LLM pipelines. Discover two approaches to deploy LLMs using HuggingFaceModel and JumpStartModel.
The guide further illustrates querying pre-trained LLMs and enhancing accuracy by providing additional context.

🚀 HackHub: Trending AI Tools

⭐ bricks-cloud/BricksLLM: Cloud-native AI gateway written in Go enabling the creation of API keys with fine-grained access controls, rate limits, cost limits, and TTLs for both development and production use.

⭐ kwaikeg/kwaiagents: Comprises KAgentSys-Lite with limited tools, KAgentLMs featuring LLMs with agent capabilities, KAgentInstruct providing finetuning data, and KAgentBench offering over 3,000 human-edited evaluations for testing agent capabilities.

⭐ facebookresearch/Pearl: Production-ready Reinforcement Learning AI agent library from Meta prioritizing long-term feedback, adaptability to diverse environments, and resilience to limited observability.

⭐ andvg3/LSDM: Official implementation of a NeurIPS 2023 paper on Language-driven Scene Synthesis using a Multi-conditional Diffusion Model.

AI_Distilled Talkback: Unmasking the Community Buzz!

💬 Q: "How can we foster effective collaboration between humans and AI systems, ensuring that AI complements human skills and enhances productivity without causing job displacement or widening societal gaps?"

💭 "With providing more information on LLM."

Share your thoughts here! Your opinions matter - let's make this space a reflection of diverse perspectives.

MetaGPT: Cybersecurity's Impact on Investment Choices

James Bryant, Alok Mukherjee
13 Dec 2023
9 min read
This article is an excerpt from the book, The Future of Finance with ChatGPT and Power BI, by James Bryant, Alok Mukherjee. Enhance decision-making, transform your market approach, and find investment opportunities by exploring AI, finance, and data visualization with ChatGPT's analytics and Power BI's visuals.

Introduction

The MetaGPT model is a highly advanced and customizable model that has been designed to address specific research and analysis needs within various domains. In this particular context, it's geared towards identifying investment opportunities within the US market that are influenced by cybersecurity regulatory changes or cyber breaches.

Roles and responsibilities

The model has been configured to perform various specialized roles, including these:

Cybersecurity regulatory research: Understanding changes in cybersecurity laws and regulations and their impact on the market
Cyber breach analysis: Investigating cyber breaches, understanding their nature, and identifying potential investment risks or opportunities
Investment analysis: Evaluating investment opportunities based on insights derived from cybersecurity changes
Trading decisions: Making informed buy or sell decisions on financial products
Portfolio management: Overseeing and aligning the investment portfolio based on cybersecurity dynamics

Here's how it works

Research phase: The model initiates research on the given topics, either cybersecurity regulations or breaches, depending on the role. It breaks down the topic into searchable queries, collects relevant data, ranks URLs based on credibility, and summarizes the gathered information.
Analysis phase: Investment analysts then evaluate the summarized information to identify trends, insights, and potential investment opportunities or risks. They correlate cybersecurity data with market behavior, investment potential, and risk factors.
Trading phase: Based on the analysis, investment traders execute appropriate trading decisions, buying or selling assets that are influenced by the cybersecurity landscape.
Management phase: The portfolio manager integrates all the insights to make overarching decisions about asset allocation, risk management, and alignment of the investment portfolio.

The following are its purposes and benefits:

Timely insights: By automating the research and analysis process, the model provides quick insights into a dynamic field such as cybersecurity, where changes can have immediate market impacts
Data-driven decisions: The model ensures that investment decisions are grounded in comprehensive research and objective analysis, minimizing bias
Customization: The model can be tailored to focus on specific aspects of cybersecurity, such as regulatory changes or particular types of breaches, allowing for targeted investment strategies
Collaboration: By defining different roles, the model simulates a collaborative approach, where various experts contribute their specialized knowledge to achieve a common investment goal

In conclusion, the MetaGPT model, with its diverse roles and sophisticated functions, serves as a powerful tool for investors looking to leverage the ever-changing landscape of cybersecurity.
By integrating research, analysis, trading, and portfolio management, it provides a comprehensive, data-driven approach to identifying and capitalizing on investment opportunities arising from the complex interplay of cybersecurity and finance. It not only streamlines the investment process but also enhances the accuracy and relevance of investment decisions in a rapidly evolving field.

Source: GitHub, MIT License: https://github.com/geekan/MetaGPT
Source: MetaGPT: Meta Programming for Multi-Agent Collaborative Framework (arxiv.org) (https://arxiv.org/abs/2308.00352)
By Sirui Hong, Xiawu Zheng, Jonathan Chen, Yuheng Cheng, Jinlin Wang, Ceyao Zhang, Zili Wang, Steven Ka Shing Yau, Zijuan Lin, Liyang Zhou, Chenyu Ran, Lingfeng Xiao, Chenglin Wu

The following is a Python code snippet:

1. Begin with the installations:

npm --version
sudo npm install -g @mermaid-js/mermaid-cli
git clone https://github.com/geekan/metagpt
cd metagpt
python setup.py install

2. Run the following Python code:

# Configuration: OpenAI API Key
# Open the config/key.yaml file and insert your OpenAI API key in place of the placeholder.
# cp config/config.yaml config/key.yaml
# save and close file

# Import Necessary Libraries
import asyncio
import json
from typing import Callable
from pydantic import parse_obj_as

# Import MetaGPT Specific Modules
from metagpt.actions import Action
from metagpt.config import CONFIG
from metagpt.logs import logger
from metagpt.tools.search_engine import SearchEngine
from metagpt.tools.web_browser_engine import WebBrowserEngine, WebBrowserEngineType
from metagpt.utils.text import generate_prompt_chunk, reduce_message_length

# Define Roles
# NOTE: Replace these role definitions as per your project's needs.
RESEARCHER_ROLES = {
    'cybersecurity_regulatory_researcher': "Cybersecurity Regulatory Researcher",
    'cyber_breach_researcher': "Cyber Breach Researcher",
    'investment_analyst': "Investment Analyst",
    'investment_trader': "Investment Trader",
    'portfolio_manager': "Portfolio Manager"
}

# Define Prompts
# NOTE: Customize these prompts to suit your project's specific requirements.
LANG_PROMPT = "Please respond in {language}."

RESEARCH_BASE_SYSTEM = """You are a {role}. Your primary goal is to understand and analyze \
changes in cybersecurity regulations or breaches, identify investment opportunities, and make informed \
decisions on financial products, aligning with the current cybersecurity landscape."""

RESEARCH_TOPIC_SYSTEM = "You are a {role}, and your research topic is \"{topic}\"."

SEARCH_TOPIC_PROMPT = """Please provide up to 2 necessary keywords related to your \
research topic on cybersecurity regulations or breaches that require Google search. \
Your response must be in JSON format, for example: ["cybersecurity regulations", "cyber breach analysis"]."""

SUMMARIZE_SEARCH_PROMPT = """### Requirements
1. The keywords related to your research topic and the search results are shown in the "Reference Information" section.
2. Provide up to {decomposition_nums} queries related to your research topic based on the search results.
3. Please respond in JSON format as follows: ["query1", "query2", "query3", ...].

### Reference Information
{search}
"""

DECOMPOSITION_PROMPT = """You are a {role}, and before delving into a research topic, you break it down into several \
sub-questions. These sub-questions can be researched through online searches to gather objective opinions about the given \
topic.
---
The topic is: {topic}
---
Now, please break down the provided research topic into {decomposition_nums} search questions. You should respond with an array of \
strings in JSON format like ["question1", "question2", ...].
"""

COLLECT_AND_RANKURLS_PROMPT = """### Reference Information
1. Research Topic: "{topic}"
2. Query: "{query}"
3. The online search results: {results}
---
Please remove irrelevant search results that are not related to the query or research topic. Then, sort the remaining search results \
based on link credibility. If two results have equal credibility, prioritize them based on relevance. Provide the ranked \
results' indices in JSON format, like [0, 1, 3, 4, ...], without including other words.
"""

WEB_BROWSE_AND_SUMMARIZE_PROMPT = '''### Requirements
1. Utilize the text in the "Reference Information" section to respond to the question "{query}".
2. If the question cannot be directly answered using the text, but the text is related to the research topic, please provide \
a comprehensive summary of the text.
3. If the text is entirely unrelated to the research topic, please reply with a simple text "Not relevant."
4. Include all relevant factual information, numbers, statistics, etc., if available.

### Reference Information
{content}
'''

CONDUCT_RESEARCH_PROMPT = '''### Reference Information
{content}

### Requirements
Please provide a detailed research report on the topic: "{topic}", focusing on investment opportunities arising \
from changes in cybersecurity regulations or breaches. The report must:
- Identify and analyze investment opportunities in the US market.
- Detail how and when to invest, the structure for the investment, and the implementation and exit strategies.
- Adhere to APA style guidelines and include a minimum word count of 2,000.
- Include all source URLs in APA format at the end of the report.
'''

# The rest of the classes and functions remain unchanged

Important notes:

Execute the installation and setup commands in your terminal before running the Python script
Don't forget to replace placeholder texts in config files and the Python script with actual data or API keys
Ensure that MetaGPT is properly installed and configured on your machine

In this high-stakes exploration, we dissect the exhilarating yet precarious world of LLM-integrated applications. We delve into how they're transforming finance while posing emergent ethical dilemmas and security risks that simply cannot be ignored. Be prepared to journey through real-world case studies that highlight the good, the bad, and the downright ugly of LLM applications in finance, from market-beating hedge funds to costly security breaches and ethical pitfalls.

Conclusion

"In an era shaped by cyber landscapes, MetaGPT emerges as the guiding light for astute investors. Seamlessly blending cybersecurity insights with finance, it pioneers a data-driven approach, unveiling opportunities and risks often concealed within regulatory shifts and breaches.
This model isn't just a tool; it's the compass navigating the ever-changing intersection of cybersecurity and finance, empowering investors to thrive in an intricate, high-stakes market."

Author Bio

James Bryant, a finance and technology expert, excels at identifying untapped opportunities and leveraging cutting-edge tools to optimize financial processes. With expertise in finance automation, risk management, investments, trading, and banking, he's known for staying ahead of trends and driving innovation in the financial industry. James has built corporate treasuries at companies like Salesforce and transformed companies like Stanford Health Care through digital innovation. He is passionate about sharing his knowledge and empowering others to excel in finance. Outside of work, James enjoys skiing with his family in Lake Tahoe, running half marathons, and exploring new destinations and culinary experiences with his wife and daughter.

Aloke Mukherjee is a seasoned technologist with over a decade of experience in business architecture, digital transformation, and solutions architecture. He excels at applying data-driven solutions to real-world problems and has proficiency in data analytics and planning. Aloke worked at EMC Corp and Genentech and currently spearheads the digital transformation of Finance Business Intelligence at Stanford Health Care. In addition to his work, Aloke is a Certified Personal Trainer and is passionate about helping his clients stay fit. Aloke also has a passion for wine and exploring new vineyards.

AI_Distilled #28: Unveiling Innovations Reshaping Our World

Merlyn Shelley
11 Dec 2023
13 min read
👋 Hello ,

"Generative AI has the potential to change the world in ways that we can't even imagine. It has the power to create new ideas, products, and services that will make our lives easier, more productive, and more creative. It also has the potential to solve some of the world's biggest problems, such as climate change, poverty, and disease." - Bill Gates, Microsoft Co-Founder

Microsoft Bing's new Deep Search functionality is a case in point - Bing will now create AI prompts itself to provide detailed insights to user queries in ways traditional search engines can't even match. Who could have thought LLMs would progress so much they would eventually prompt themselves? Even Runway ML is onto something big with its groundbreaking technology that creates realistic AI-generated videos that will find their way to Hollywood.

Welcome back to a new issue of AI_Distilled - your one-stop destination for all things AI, ML, NLP, and Gen AI. Let's get started with the latest news and developments across the AI sector:

Elon Musk's xAI Initiates $1 Billion Funding Drive in AI Race
Bing's New Deep Search Expands Queries
AI Takes Center Stage in 2023 Word of the Year Lists
OpenAI Announces Delay in GPT Store Launch to Next Year
ChatGPT Celebrates First Anniversary with 110M Installs and $30M Revenue Milestone
Runway ML and Getty Images Collaborate on AI Video Models for Hollywood and Advertising

We've also curated the latest GPT and LLM resources, tutorials, and secret knowledge:

Unlocking AI Magic: A Primer on 7 Essential Libraries for Developers
Efficient LLM Fine-Tuning with QLoRA on a Laptop
Rapid Deployment of Large Open Source LLMs with Runpod and vLLM's OpenAI Endpoint
Understanding Strategies to Enhance Retrieval-Augmented Generation (RAG) Pipeline Performance
Understanding and Mitigating Biases and Toxicity in LLMs

Finally, don't forget to check out our hands-on tips and strategies from the AI community for you to use on your own projects:

A Step-by-Step Guide to Streamlining LLM Data Processing for Efficient Pipelines
Fine-Tuning Mistral Instruct 7B on the MedMCQA Dataset Using QLoRA
Accelerating Large-Scale Training: A Comprehensive Guide to Amazon SageMaker Data Parallel Library
Enhancing LoRA-Based Inference Speed: A Guide to Efficient LoRA Decomposition

Looking for some inspiration? Here are some GitHub repositories to get your projects going!

tacju/maxtron
Tanuki/tanuki.py
roboflow/multimodal-maestro
03axdov/muskie

Also, don't forget to check our expert insights column, which covers the interesting concepts of NLP from the book 'The Handbook of NLP with Gensim'. It's a must-read!

Stay curious and gear up for an intellectually enriching experience!

📥 Feedback on the Weekly Edition
Quick question: How can we foster effective collaboration between humans and AI systems, ensuring that AI complements human skills and enhances productivity without causing job displacement or widening societal gaps?
Share your valued opinions discreetly! Your insights could shine in our next issue for the 39K-strong AI community. Join the conversation! 🗨️✨ As a big thanks, get our bestselling "Interactive Data Visualization with Python - Second Edition" in PDF.

Let's make AI_Distilled even more awesome! 🚀 Jump on in! Share your thoughts and opinions here!
Writer's Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week's newsletter content!

Cheers,
Merlyn Shelley
Editor-in-Chief, Packt

SignUp | Advertise | Archives

⚡ TechWave: AI/GPT News & Analysis

🏐 Elon Musk's xAI Initiates $1 Billion Funding Drive in AI Race: xAI is on a quest to secure $1 billion in equity, aiming to stay competitive with tech giants like OpenAI, Microsoft, and Google in the dynamic AI landscape. Already amassing $135 million from investors, xAI's total funding goal is disclosed in a filing with the US Securities and Exchange Commission.

🏐 AI Alliance Launched by Tech Giants IBM and Meta: IBM and Meta have formed a new "AI Alliance" with over 50 partners to promote open and responsible AI development. Members include Dell, Intel, CERN, NASA, and Sony. The alliance envisions fostering an open AI community for researchers and developers and can help members make progress whether they openly share models or not.

🏐 Bing's New Deep Search Expands Queries: Microsoft is testing a new Bing feature called Deep Search that uses GPT-4 to expand search queries before providing results. Deep Search displays the expanded topics in a panel for users to select the one that best fits what they want to know. It then tailors the search results to that description. Microsoft says the feature can take up to 30 seconds due to the AI generation.

🏐 AI Takes Center Stage in 2023 Word of the Year Lists: In 2023, AI dominates tech, influencing "word of the year" choices. Cambridge picks "hallucinate" for AI's tendency to invent information; Merriam-Webster chooses "authentic" to address AI's impact on reality. Oxford recognizes "prompt" for its evolved role in instructing generative AI, reflecting society's increased integration of AI into everyday language and culture.

🏐 OpenAI Announces Delay in GPT Store Launch to Next Year: OpenAI delays the GPT store release until next year, citing unexpected challenges and postponing the initial December launch plan. Despite recent challenges, including CEO changes and employee unrest, development continues, and updates for ChatGPT are expected. The GPT store aims to be a marketplace for users to sell and share custom GPTs, with creators compensated based on usage.

🏐 ChatGPT Celebrates First Anniversary with 110M Installs and $30M Revenue Milestone: ChatGPT's mobile apps, launched in May 2023 on iOS and later on Android, have exceeded 110 million installs, yielding nearly $30 million in revenue. The success is fueled by the ChatGPT Plus subscription, offering perks. Despite competition, downloads surge, with Android hitting 18 million in a week. The company expects continued growth by year-end 2023.

🏐 Runway ML and Getty Images Collaborate on AI Video Models for Hollywood and Advertising: NYC video AI startup Runway ML, backed by Google and NVIDIA, announces a partnership with Getty Images for the Runway <> Getty Images Model (RGM), a generative AI video model. Targeting Hollywood, advertising, media, and broadcasting, it enables customized content workflows for Runway enterprise customers.

🔮 Expert Insights from Packt Community

The Handbook of NLP with Gensim - By Chris Kuo

NLU + NLG = NLP

NLP is an umbrella term that covers natural language understanding (NLU) and NLG. We'll go through both in the next sections.

NLU

Many languages, such as English, German, and Chinese, have been developing for hundreds of years and continue to evolve. Humans can use languages artfully in various social contexts.
Now, we are asking a computer to understand human language. What's very rudimentary to us may not be so apparent to a computer. Linguists have contributed much to the development of computers' understanding in terms of syntax, semantics, phonology, morphology, and pragmatics. NLU focuses on understanding the meaning of human language. It extracts text or speech input and then analyzes the syntax, semantics, phonology, morphology, and pragmatics in the language. Let's briefly go over each one:

Syntax: This is about the study of how words are arranged to form phrases and clauses, as well as the use of punctuation, order of words, and sentences.

Semantics: This is about the possible meanings of a sentence based on the interactions between words in the sentence. It is concerned with the interpretation of language, rather than its form or structure. For example, the word "table" as a noun can refer to "a piece of furniture having a smooth flat top that is usually supported by one or more vertical legs" or a data frame in a computer language. NLU can understand the two meanings of a word in such jokes through a technique called word embedding (see the short Gensim sketch after this excerpt).

Phonology: This is about the study of the sound system of a language, including the sounds of speech (phonemes), how they are combined to form words (morphology), and how they are organized into larger units such as syllables and stress patterns. For example, the sounds represented by the letters "p" and "b" in English are distinct phonemes. A phoneme is the smallest unit of sound in a language that can change the meaning of a word. Consider the words "pat" and "bat." The only difference between these two words is the initial sound, but their meanings are different.

Morphology: This is the study of the structure of words, including the way in which they are formed from smaller units of meaning called morphemes. It originally comes from "morph," the shape or form, and "ology," the study of something. Morphology is important because it helps us understand how words are formed and how they relate to each other. It also helps us understand how words change over time and how they are related to other words in a language. For example, the word "unkindness" consists of three separate morphemes: the prefix "un-," the root "kind," and the suffix "-ness."

Pragmatics: This is the study of how language is used in a social context. Pragmatics is important because it helps us understand how language works in real-world situations, and how language can be used to convey meaning and achieve specific purposes. For example, if you offer to buy your friend a McDonald's burger, a large fries, and a large drink, your friend may reply "no" because he is concerned about becoming fat. Your friend may simply mean the burger meal is high in calories, but the conversation can also imply he may be fat in a social context.

Now, let's understand NLG.

NLG

While NLU is concerned with reading for a computer to comprehend, NLG is about writing for a computer to write. The term generation in NLG refers to an NLP model generating meaningful words or even articles. Today, when you compose an email or type a sentence in an app, it presents possible words to complete your sentence or performs automatic correction. These are applications of NLG.
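Since the excerpt mentions word embedding as the technique behind distinguishing word senses, here is a minimal Gensim sketch of training embeddings on a toy corpus; the corpus and parameter values are illustrative assumptions only (and note that classic Word2Vec learns a single static vector per word, so this shows the training API rather than full sense disambiguation):

from gensim.models import Word2Vec

# Toy corpus: "table" appears in both furniture and data contexts
corpus = [
    ["put", "the", "lamp", "on", "the", "table"],
    ["the", "chair", "stands", "next", "to", "the", "table"],
    ["join", "the", "customers", "table", "with", "the", "orders", "table"],
    ["query", "the", "table", "for", "matching", "rows"],
]

# Train a small Word2Vec model; vector_size and window are kept tiny for the toy data
model = Word2Vec(corpus, vector_size=16, window=2, min_count=1, seed=42)

# Inspect the nearest neighbors of "table" by cosine similarity
print(model.wv.most_similar("table", topn=3))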
This content is from the book The Handbook of NLP with Gensim - By Chris Kuo (Oct 2023). Start reading a free chapter or access the entire Packt digital library free for 7 days by signing up now. To learn more, click on the button below.

Read through Chapter 1, unlocked here...

🌟 Secret Knowledge: AI/LLM Resources

🏀 Unlocking AI Magic: A Primer on 7 Essential Libraries for Developers: Discover seven cutting-edge libraries to enhance development projects with advanced AI features. From CopilotTextarea for AI-driven writing in React apps to PrivateGPT for secure, locally processed document interactions, explore tools that elevate your projects and impress users.

🏀 Efficient LLM Fine-Tuning with QLoRA on a Laptop: Explore QLoRA, an efficient memory-saving method for fine-tuning large language models on ordinary CPUs. The QLoRA API supports NF4, FP4, INT4, and INT8 data types for quantization, utilizing methods like LoRA and gradient checkpointing to significantly reduce memory requirements. Learn to implement QLoRA on CPUs, leveraging Intel Extension for Transformers, with experiments showcasing its efficiency on consumer-level CPUs.

🏀 Rapid Deployment of Large Open Source LLMs with Runpod and vLLM's OpenAI Endpoint: Learn to swiftly deploy open-source LLMs into applications with a tutorial featuring the Llama-2 70B model and the AutoGen framework. Utilize tools like Runpod and vLLM for computational resources and API endpoint creation, with a step-by-step guide and the option for non-gated models like Falcon-40B.

🏀 Understanding Strategies to Enhance Retrieval-Augmented Generation (RAG) Pipeline Performance: Learn optimization techniques for RAG applications by focusing on hyperparameters, tuning strategies, data ingestion, and pipeline preparation. Explore improvements in inferencing through query transformations, retrieval parameters, advanced strategies, re-ranking models, LLMs, and prompt engineering for enhanced retrieval and generation.

🏀 Understanding and Mitigating Biases and Toxicity in LLMs: Explore the impact of ethical guidelines on Large Language Model (LLM) development, examining measures adopted by companies like OpenAI and Google to address biases and toxicity. Research covers content generation, jailbreaking, and biases in diverse domains, revealing complexities and challenges in ensuring ethical LLMs.

🔛 Masterclass: AI/LLM Tutorials

🎯 A Step-by-Step Guide to Streamlining LLM Data Processing for Efficient Pipelines: Learn to optimize the development loop for your LLM-powered recommendation system by addressing slow processing times in data pipelines. The solution involves implementing a Pipeline class to save inputs/outputs, enabling efficient error debugging. Enhance the developer experience with individual pipeline stages as functions, and consider future optimizations like error classes and concurrency.

🎯 Fine-Tuning Mistral Instruct 7B on the MedMCQA Dataset Using QLoRA: Explore fine-tuning Mistral Instruct 7B, an open-source LLM, for medical entrance exam questions using the MedMCQA dataset. Utilize Google Colab, the GPTQ version, and the LoRA technique for memory efficiency. The tutorial covers data loading, prompt creation, configuration, training setup, code snippets, and performance evaluation, offering a foundation for experimentation and enhancement.

🎯 Accelerating Large-Scale Training: A Comprehensive Guide to Amazon SageMaker Data Parallel Library: This guide details ways to boost Large Language Model (LLM) training speed with Amazon SageMaker's SMDDP.
It addresses challenges in distributed training, emphasizing SMDDP's optimized AllGather for the GPU communication bottleneck, and explores techniques like EFA network usage, GDRCopy coordination, and reduced GPU streaming multiprocessors for improved efficiency and cost-effectiveness on Amazon SageMaker.

🎯 Enhancing LoRA-Based Inference Speed: A Guide to Efficient LoRA Decomposition: The article highlights achieving three times faster inference for public LoRAs using the Diffusers library. It introduces LoRA, a parameter-efficient fine-tuning technique, detailing its decomposition process and benefits, including quick transitions and reduced warm-up and response times in the Inference API. (A minimal sketch of the decomposition itself follows the tool list below.)

🚀 HackHub: Trending AI Tools

⚽ tacju/maxtron: Unified meta-architecture for video segmentation, enhancing clip-level segmenters with within-clip and cross-clip tracking modules.

⚽ Tanuki/tanuki.py: Simplifies the creation of apps powered by LLMs in Python by seamlessly integrating well-typed, reliable, and stateless LLM-powered functions into applications.

⚽ roboflow/multimodal-maestro: Empowers developers with enhanced control over large multimodal models, enabling the achievement of diverse outputs through effective prompting tactics.

⚽ 03axdov/muskie: Python-based ML library that simplifies the process of dataset creation and model utilization, aiming to reduce code complexity.
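For readers curious what the LoRA decomposition mentioned above looks like in code, here is a minimal PyTorch sketch. It follows the standard formulation from the LoRA paper (a frozen weight plus a trainable low-rank update scaled by alpha/r); it is an illustration only, not the Diffusers implementation:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A linear layer whose frozen weight W is adapted by (alpha/r) * B @ A."""
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # the pretrained weight stays frozen
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, r))  # zero init: starts as a no-op
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

layer = LoRALinear(64, 64)
print(layer(torch.randn(2, 64)).shape)  # torch.Size([2, 64])
```

Because only A and B are trained, an adapter can be merged into or split out of the base weight on the fly, which is what enables the fast adapter-swapping the article describes.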
ChatGPT for Financial Analysis: Palo Alto Networks

James Bryant, Alok Mukherjee
05 Dec 2023
11 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

This article is an excerpt from the book The Future of Finance with ChatGPT and Power BI, by James Bryant and Alok Mukherjee. Enhance decision-making, transform your market approach, and find investment opportunities by exploring AI, finance, and data visualization with ChatGPT's analytics and Power BI's visuals.

Introduction

In this section, we will explore an interesting example of how ChatGPT can be used to analyze and summarize earnings reports, enabling you to identify key insights and trends quickly. With the vast amount of information available in earnings reports, it can be challenging to sift through the data and identify the most critical elements. Let's see how ChatGPT can help.

Here's the scenario: Palo Alto Networks has just released its quarterly earnings report. You want to understand the company's financial performance and identify any trends or potential issues that may impact the stock price or investment potential.

Step 1 – Extract key data points:

To get started, provide ChatGPT with the relevant earnings report data, such as revenue, net income, earnings per share (EPS), and any other important metrics. Be sure to include both current and historical data for comparison purposes. You can either input this data manually or automate the process using an API or web scraper. Let's explore the automated process to add Palo Alto Networks' financial information from September 2021 to March 2023 to ChatGPT.

Step 1.1 – Automating data collection with Python and API/web scraping:

1. Choose a financial API or web scraping library in Python:

If using an API, explore options such as Alpha Vantage (alphavantage.co):
- Obtain an API key from the Alpha Vantage website (free and paid versions).
- Choose a method, such as Python requests.
- Make a request.

If web scraping, use libraries such as Requests and Beautiful Soup. For web scraping, identify the URLs of the company's financial statements or earnings reports from websites such as Yahoo Finance (finance.yahoo.com), Nasdaq (nasdaq.com), or the company's investor relations page.

2. Set up your Python script for data collection:

For APIs:
a. Import the necessary libraries (e.g., requests or pandas) – for example, import requests and import pandas as pd.
b. Define the API key, endpoint URL, and required parameters.
c. Make a request to the API to fetch data using the requests library.
d. Parse the response data and convert it into a pandas DataFrame.

For web scraping (a short scraping sketch follows this list):
a. Import the necessary libraries (e.g., requests, BeautifulSoup, or pandas) – for example, import requests, from bs4 import BeautifulSoup, and import pandas as pd.
b. Define the URL(s) containing the financial data.
c. Use the requests library to fetch the HTML content of the web page.
d. Parse the HTML content using BeautifulSoup to extract the required financial data.
e. Convert the extracted data into a pandas DataFrame.

3. Collect historical data from September 2021 to March 2023 for the relevant financial metrics:

Adjust the parameters in your API request or web scraping script to target the specified date range.

4. Save the collected data in a structured format, such as a CSV file or a pandas DataFrame, for further processing and analysis:
- Use pandas' DataFrame.to_csv() method to save the collected data as a CSV file.
- Alternatively, keep the data in a pandas DataFrame for further analysis within the Python script.
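As referenced in step 2 above, here is a minimal sketch of the web-scraping route; the article itself implements only the API route below. The URL and the table selector are hypothetical placeholders, since real pages such as Yahoo Finance or Nasdaq each need their own selectors:

```python
# Sketch: fetch a page, pull table rows with BeautifulSoup, load into pandas.
import pandas as pd
import requests
from bs4 import BeautifulSoup

url = "https://example.com/PANW/financials"  # placeholder URL, not a real endpoint
html = requests.get(url, timeout=30).text
soup = BeautifulSoup(html, "html.parser")

rows = []
for tr in soup.select("table tr"):  # placeholder selector; adapt to the real page
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if cells:
        rows.append(cells)

df = pd.DataFrame(rows)  # column names depend on the scraped page
print(df.head())
```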
With these additions, you should have a better understanding of where to obtain financial data and which Python libraries to import for the data collection script.

We will now provide a step-by-step guide, using Python code, for Palo Alto Networks' financial data.

Extract Palo Alto Networks' quarterly financial data (revenue, net income, and EPS) for the period September 2021–March 2023, and save it as a CSV file and a text input file, using the Alpha Vantage API key (finance website):

1. Install the necessary packages from the command prompt:

```
pip install requests
pip install pandas
```

2. Create a new Python script file in Notepad, Notepad++, PyCharm, or Visual Studio Code. It is important that you add your Alpha Vantage API key in the following api_key line. Copy and paste the following code into your Python script file, and name it PANW.py:

```python
import requests
import pandas as pd

api_key = "YOUR_API_KEY"
symbol = "PANW"
url = f"https://www.alphavantage.co/query?function=EARNINGS&symbol={symbol}&apikey={api_key}"

try:
    response = requests.get(url)
    response.raise_for_status()  # Raise HTTPError for bad responses
    data = response.json()
    if 'quarterlyEarnings' in data:
        quarterly_data = data['quarterlyEarnings']
        df = pd.DataFrame(quarterly_data)
        df_filtered = df[(df['reportedDate'] >= '2021-09-01') &
                         (df['reportedDate'] <= '2023-03-31')]
        df_filtered.to_csv("palo_alto_financial_data.csv", index=False)

        input_text = ("Analyze the earnings data of Palo Alto Networks "
                      "from September 2021 to March 2023.\n\n")
        for idx, row in df_filtered.iterrows():
            quarter = idx + 1
            # Fields missing from the API response fall back to 'N/A'
            revenue = row.get('revenue', 'N/A')
            net_income = row.get('netIncome', 'N/A')
            eps = row.get('earningsPerShare', 'N/A')
            input_text += f"Quarter {quarter}:\n"
            input_text += f"Revenue: ${revenue}\n"
            input_text += f"Net Income: ${net_income}\n"
            input_text += f"Earnings Per Share: ${eps}\n\n"

        with open("palo_alto_financial_summary.txt", "w") as f:
            f.write(input_text)
    else:
        print("Data not available.")
except requests.RequestException as e:
    print(f"An error occurred: {e}")
```

3. Run the Python script file:

```
python PANW.py
```

4. A separate text file, palo_alto_financial_summary.txt, and a CSV file, palo_alto_financial_data.csv, will be created once the Python script has been executed.

When the Python script, PANW.py, is executed, it performs several tasks to fetch and analyze the earnings data of Palo Alto Networks (the symbol PANW). First, it imports two essential libraries: requests to make API calls and pandas for data manipulation.

The script starts by defining a few key variables: the API key to access financial data, the stock symbol of the company, and the URL of the Alpha Vantage API endpoint where the data can be retrieved. Then, a try block is initiated to safely execute the following operations.

The script uses the requests.get() method to query the Alpha Vantage API. If the request is successful, the response is parsed as JSON and stored in a variable named data. It then checks whether data contains a key called quarterlyEarnings.

If this key exists, the script proceeds to convert the quarterly earnings data into a pandas DataFrame. It filters this DataFrame to include only the entries between September 2021 and March 2023.
The filtered data is then saved as a CSV file named palo_alto_financial_data.csv:
- The CSV file contains the raw financial data in tabular form.
- The CSV file can be imported into Excel, Google Sheets, or other specialized data analysis tools.

The script also constructs a text-based summary of the filtered earnings data, including revenue, net income, and EPS for each quarter within the specified date range. This summary is saved as a text file named palo_alto_financial_summary.txt:
- The TXT file provides a human-readable summary of the financial data for Palo Alto Networks for the specified date range.
- TXT files can be used for quick overviews and presentations.

If any errors occur during this process, such as a failed API request, the script will catch these exceptions and print an error message, thanks to the except block. This ensures that the script fails gracefully, providing useful feedback instead of crashing.

You can upload the CSV file (palo_alto_financial_data.csv) to ChatGPT directly if you are a ChatGPT Plus user by following these steps:

Uploading a CSV file directly into ChatGPT is supported through the Advanced Data Analysis option for ChatGPT Plus users. Access the OpenAI website at https://openai.com/ and log in using your credentials. Once logged in, open your Settings and Beta options by clicking on the three dots near your email address in the bottom-left corner of the screen. Go to Beta features and activate the Advanced data analysis function by moving the slider to the right (the option will turn green). Once this feature is activated, go to GPT-4 at the top center of the screen and select Advanced Data Analysis from the drop-down list. You can click on the plus sign in the dialog box to upload the CSV file to ChatGPT:
- CSV file size limitations: 500 MB
- CSV file retention: Files are retained while a conversation is active and for three hours after the conversation is paused

If you are not a ChatGPT Plus user, follow the instructions below, which use the OpenAI API to upload the CSV file (palo_alto_financial_data.csv) into ChatGPT and analyze the data using the GPT-3.5-turbo model:

1. Create a new Python script file in Notepad, Notepad++, PyCharm, or Visual Studio Code. It is important that you add your OpenAI API key to the following api_key line. Copy and paste the following code into your Python script file and name it OPENAIAPI.py. (Note: gpt-3.5-turbo is a chat model, so the snippet below routes the request through the ChatCompletion endpoint of the pre-1.0 OpenAI SDK; the legacy Completion endpoint does not accept chat models.)

```python
import openai
import pandas as pd

df = pd.read_csv("palo_alto_financial_data.csv")
csv_string = df.to_string(index=False)

api_key = "your_openai_api_key_here"
openai.api_key = api_key

input_text = (f"Here is the financial data for Palo Alto Networks:\n\n"
              f"{csv_string}\n\nPlease analyze the data and provide insights.")

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # chat models go through the ChatCompletion endpoint
    messages=[{"role": "user", "content": input_text}],
    max_tokens=200,         # limiting the length of the generated text
)

generated_text = response.choices[0].message["content"].strip()
print("GPT-3.5-turbo PANW Analysis:", generated_text)
```

2. Run the Python script file:

```
python OPENAIAPI.py
```

This Python code snippet is responsible for interacting with the OpenAI API to send the formatted text input (the financial data prompt) to ChatGPT and receive the generated response. Here's a breakdown of each part:

The script starts by importing two essential Python libraries: openai for interacting with the OpenAI API, and pandas for data manipulation.

The script reads the financial data from a CSV file named palo_alto_financial_data.csv using pandas, converting this data into a formatted string.
It then sets up the OpenAI API by initializing it with a user-provided API key.

Following this, the script prepares a prompt for GPT-3.5-turbo, consisting of the loaded financial data and a request for analysis. This prompt is sent to the GPT-3.5-turbo model via the OpenAI API, which returns a text-based analysis limited to 200 tokens.

The generated analysis is then extracted from the API's response and printed to the console with the label "GPT-3.5-turbo PANW Analysis." The script essentially automates the process of sending financial data to the GPT-3.5-turbo model for insightful analysis, making it easy to obtain quick, AI-generated insights on Palo Alto Networks' financial performance.

Conclusion

In conclusion, harnessing ChatGPT's capabilities, we've navigated Palo Alto Networks' earnings landscape. From automated data extraction to insightful analysis, this journey unveiled crucial financial trends. Whether utilizing APIs or scraping web data, the process demystified complexities, offering a streamlined approach. By generating comprehensive summaries and interacting with ChatGPT for deeper insights, the pathway to understanding financial data has been simplified. Embracing AI-powered analysis enables swift comprehension of earnings reports, empowering informed decisions in the realm of financial scrutiny and investment strategies.

Author Bio

James Bryant, a finance and technology expert, excels at identifying untapped opportunities and leveraging cutting-edge tools to optimize financial processes. With expertise in finance automation, risk management, investments, trading, and banking, he's known for staying ahead of trends and driving innovation in the financial industry. James has built corporate treasuries like Salesforce and transformed companies like Stanford Health Care through digital innovation. He is passionate about sharing his knowledge and empowering others to excel in finance. Outside of work, James enjoys skiing with his family in Lake Tahoe, running half marathons, and exploring new destinations and culinary experiences with his wife and daughter.

Aloke Mukherjee is a seasoned technologist with over a decade of experience in business architecture, digital transformation, and solutions architecture. He excels at applying data-driven solutions to real-world problems and has proficiency in data analytics and planning. Aloke worked at EMC Corp and Genentech and currently spearheads the digital transformation of Finance Business Intelligence at Stanford Health Care. In addition to his work, Aloke is a Certified Personal Trainer and is passionate about helping his clients stay fit. Aloke also has a passion for wine and exploring new vineyards.
AI_Distilled #28: Your Gen AI Navigator: Latest News & Insights

Merlyn Shelley
04 Dec 2023
13 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

👋 Hello ,

"We will have for the first time something smarter than the smartest human. It's hard to say exactly what that moment is, but there will come a point where no job is needed." - Elon Musk, CEO, Tesla.

Musk has released an important FSD v12 update to Tesla employees, a move touted as a breakthrough in the realization of true self-driving capabilities powered by neural nets. In the words of Musk, can self-driving cars be smarter than the smartest drivers? Only time will tell, but the future is exciting.

Welcome to another AI_Distilled, your one-stop hub for all things Gen AI. Let's kick off today's edition with some of the latest news and analysis across the AI domain:

👉 Amazon and Salesforce Fortify Alliance to Boost AI Integration
👉 Tesla Initiates Rollout of FSD v12 to Employees
👉 Anthropic Unveils Claude 2.1 with Enhanced AI Capabilities
👉 Stability AI Unveils 'Stable Video Diffusion' AI Tool for Animated Images
👉 Amazon Unveils Amazon Q: A Generative AI-Powered Business Assistant
👉 Amazon Introduces New AI Chips for Model Training and Inference
👉 Pika Labs Raises $55M, Launches AI Video Platform
👉 Amazon AWS Unveils Ambitious Generative AI Vision at Re:Invent

Next, we'll swiftly explore the secret knowledge column that features some key LLM resources:

💎 How to Enhance LLM Reasoning with System 2 Attention
💎 Unlocking AWS Wisdom with Amazon Q
💎 How to Optimize LLMs on Modest Hardware
💎 How to Assess AI System Risks
💎 Prompting Strategies for Domain-Specific Expertise in GPT-4

Hold on, there's additional news! Discover the hands-on tips and proven methods straight from the AI community:

📍 Customizing Models in Amazon Bedrock
📍 Building an Infinite Chat Memory GPT Voice Assistant in Python
📍 Generating High-Quality Computer Vision Datasets
📍 Understanding LSTM in NLP

Looking to expand your AI toolkit on GitHub? Check out these repositories!

✅ neurocult/agency
✅ lunyiliu/coachlm
✅ 03axdov/muskie
✅ robocorp/llmstatemachine

Also, don't forget to check our expert insights column, which covers the interesting concepts of hybrid cloud from the book 'Achieving Digital Transformation Using Hybrid Cloud'. It's a must-read!

Stay curious and gear up for an intellectually enriching experience!

📥 Feedback on the Weekly Edition
Quick question: How do you handle data quality issues, such as missing or inconsistent data, to ensure accurate visual representations?

Share your valued opinions discreetly! Your insights could shine in our next issue for the 38K-strong AI community. Join the conversation! 🗨️✨

As a big thanks, get our bestselling "The Applied Artificial Intelligence Workshop" in PDF.

Let's make AI_Distilled even more awesome! 🚀 Jump on in! Share your thoughts and opinions here!

Writer's Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week's newsletter content!

Cheers,
Merlyn Shelley
Editor-in-Chief, Packt

SignUp | Advertise | Archives

⚡ TechWave: AI/GPT News & Analysis

🔸 Amazon and Salesforce Fortify Alliance to Boost AI Integration: Amazon and Salesforce strengthen their partnership, prioritizing AI integration for efficient data management. This collaboration enhances synergy between Salesforce and AWS, with Salesforce expanding its use of AWS technologies, including Hyperforce, while AWS leverages Salesforce products for unified customer profiles and personalized experiences.
🔸 Tesla Initiates Rollout of FSD v12 to Employees, Signaling Progress in Self-Driving Endeavor: Tesla begins the rollout of Full Self-Driving (FSD) v12 to employees, a key move for CEO Elon Musk's self-driving vision. The update shifts controls to neural nets, advancing autonomy. Musk aims to exit beta with v12, removing constant driver monitoring, but concerns persist about Tesla's responsibility and the timeline for full self-driving.

🔸 Anthropic Unveils Claude 2.1 with Enhanced AI Capabilities: Anthropic launches Claude 2.1 via API, featuring a groundbreaking 200K token context window, halving hallucination rates, and a beta tool-use function. The expanded context window facilitates processing extensive content, improving accuracy, honesty, and comprehension, particularly in legal and financial documents. Integration capabilities with existing processes enhance Claude's utility in diverse operations.

🔸 Stability AI Unveils 'Stable Video Diffusion' AI Tool for Animated Images: Stability AI introduces Stable Video Diffusion, a free AI research tool that converts static images into brief videos using the SVD and SVD-XT models. Running on NVIDIA GPUs, it generates 2-4 second MP4 clips at 576x1024 resolution, featuring dynamic scenes through panning, zooming, and animated effects.

🔸 Amazon Unveils Amazon Q: A Generative AI-Powered Business Assistant: Amazon Q is a new generative AI assistant for businesses, facilitating streamlined tasks, quick decision-making, and innovation. It engages in conversations, solves problems, and generates content by connecting to company information. Customizable plans prioritize user privacy and data security, enabling deployment in various tasks, from press releases to social media posts.

🔸 Amazon Introduces New AI Chips for Model Training and Inference: Amazon has launched new chips, including AWS Trainium2 and Graviton4, addressing GPU shortages for generative AI. Trainium2 boasts 4x performance and 2x energy efficiency, with a cluster of 100,000 chips capable of swift LLM training. Graviton4 targets inferencing, aiming to lessen GPU dependence, aligning with Amazon's commitment to meet rising AI demands.

🔸 Pika Labs Raises $55M, Launches AI Video Platform: Pika Labs, a video AI startup, secures $55 million in funding, led by a $35 million series A round from Lightspeed Venture Partners. It unveils Pika 1.0, a web platform enabling easy text-prompt-based video creation and editing in diverse styles. Already used by more than 500,000 people, the product aims to rival AI video generation platforms like Runway and Stability AI, as well as Adobe tools.

🔸 Amazon AWS Unveils Ambitious Generative AI Vision at Re:Invent: Amazon aims to lead in generative AI, surpassing rivals Azure and Google Cloud. It emphasizes the Bedrock service's diverse generative AI models and user-friendly data tools, while focusing on enhancing Bedrock and introducing gen AI features to Amazon QuickSight for business intelligence applications.

🔮 Expert Insights from Packt Community

Achieving Digital Transformation Using Hybrid Cloud by Vikas Grover, Ishu Verma, Praveen Rajagopalan

Organizations of all sizes and industries appreciate the convenience of adjusting their resources based on demand and only paying for what they use.
Leading public cloud service providers such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP), along with SaaS offerings such as Salesforce, have seen significant growth in recent years, catering to the needs of small start-ups and large enterprises alike.

Hybrid cloud use cases

Hybrid cloud has emerged as a popular solution for organizations looking to balance the benefits of public and private clouds while addressing data security requirements, compliance needs for regulated applications, and the performance and computing needs of applications running at remote edge locations. Here are four use cases that showcase the versatility and flexibility of the hybrid cloud in different industries:

Security: A government agency uses a hybrid cloud approach to store sensitive national security data on a private cloud for maximum security while utilizing the public cloud for cost-effective storage and processing of non-sensitive data.

Proprietary Technology: A technology company uses a hybrid cloud approach to store and manage its proprietary software on a private cloud for maximum security and control while utilizing the public cloud for cost-effective development and testing. For example, financial service companies manage trading platforms on the private cloud for maximum control while using the public cloud for running simulations and back-testing algorithms.

Competitive Edge: A retail company uses a hybrid cloud solution to store critical sales and customer information on a private cloud for security and compliance while utilizing the public cloud for real-time data analysis, gaining a competitive edge by offering personalized customer experiences and insights.

Telecom: A telecommunications company uses a hybrid cloud approach to securely store sensitive customer information on a private cloud while utilizing the public cloud for real-time data processing and analysis to improve network performance and customer experience. This approach helps the company maintain a competitive edge in the telecom sector by providing a superior network experience to its customers.

Understanding the benefits of hybrid cloud computing

A hybrid cloud provides a flexible solution, and many organizations have embraced and adopted it. Take the example of Comcast, the world's largest cable company: as per a technical paper published by Comcast for SCTE-ISBE, Comcast serves tens of millions of customers and hosts hundreds of tenants in eight regions and three public clouds. This is a great testament to using a hybrid cloud for mission-critical workloads that need to run at scale.

Hybrid cloud is more popular than ever, and some of the reasons organizations are adopting it are as follows:

Time to market: With choices available to your IT teams to leverage appropriate resources as needed by the use case, new applications and services can be launched quickly.

Manage costs: Hybrid cloud helps you optimize and consume resources efficiently. Make use of your current investments in existing infrastructure and, when you need to scale, burst the workloads into the public cloud.

Reduced lock-in: Moving into the cloud may be appealing, but once in, when costs start to rise and eat into the organization's bottom line, it becomes another costly proposition to reverse-migrate some of your applications from the public cloud. A hybrid cloud allows you to run anywhere and reduces your lock-in.
Gaining a competitive advantage: In the competitive world of business, relying solely on public cloud technologies can put you at a disadvantage. To stay ahead of the competition, it's important to maintain control over and ownership of cutting-edge technologies. This way, you can build and grow your business in an increasingly competitive environment.

This content is from the book Achieving Digital Transformation Using Hybrid Cloud by Vikas Grover, Ishu Verma, Praveen Rajagopalan (July 2023). Start reading a free chapter or access the entire Packt digital library free for 7 days by signing up now. To learn more, click on the button below.

Read through Chapter 1, unlocked here...

🌟 Secret Knowledge: AI/LLM Resources

🔸 How to Enhance LLM Reasoning with System 2 Attention: Meta researchers introduce System 2 Attention (S2A), a technique that enhances Large Language Models (LLMs) by refining user prompts, drawing on psychological inspiration. S2A focuses on task-relevant data, boosting LLMs' accuracy in reasoning tasks by eliminating irrelevant information and instructing them to generate context effectively.

🔸 Unlocking AWS Wisdom with Amazon Q: A Guide for Optimal Decision-Making: Amazon Q, a robust chatbot trained on 17 years of AWS documentation, transforms AWS task execution. Explore its prowess in navigating the intricacies of AWS services, offering insights on serverless vs. containers and database choices. Enhance accuracy with expert guidance on the AWS Well-Architected Framework, troubleshooting, workload optimization, and content creation.

🔸 How to Optimize LLMs on Modest Hardware: Quantization, a key technique for running large language models on less powerful hardware, reduces the precision of model parameters. PyTorch offers dynamic, static, and quantization-aware training strategies, each balancing model size, computational demand, and accuracy. Choosing hardware involves understanding the critical role of VRAM, challenging the notion that newer GPUs are always superior. (A minimal dynamic quantization sketch appears after this issue's tool listings.)

🔸 How to Assess AI System Risks: A Comprehensive Guide: Explore the nuanced realm of AI risk assessment in this guide, covering model and enterprise risks for responsible AI development. Understand the importance of defining inherent and residual risks, utilizing the NIST Risk Management Framework, and involving diverse stakeholders. Learn to evaluate risks using likelihood and severity scales, employing a risk matrix.

🔸 The Effectiveness of Prompting Strategies for Domain-Specific Expertise in GPT-4: This study explores prompting strategies that elicit domain-specific expertise from the versatile GPT-4 model. It reveals GPT-4's exceptional performance as a medical specialist, surpassing finely-tuned medical models. Medprompt, a combination of prompting strategies, enables GPT-4 to achieve over 90% accuracy on the challenging MedQA dataset, challenging the conventional need for extensive fine-tuning and showcasing the broad applicability of generalist models across diverse domains.

🔛 Masterclass: AI/LLM Tutorials

🔸 Customizing Models in Amazon Bedrock: A Step-by-Step Guide: Embark on the journey of tailoring foundation models in Amazon Bedrock to align with your specific domain and organizational needs, enriching user experiences. This comprehensive guide introduces two customization options: fine-tuning and continued pre-training. Learn how to enhance model accuracy through fine-tuning with your task-specific labeled dataset, and explore the process of creating fine-tuning jobs via the Amazon Bedrock console or APIs.
Additionally, explore continued pre-training, available in public preview for Amazon Titan Text models, and understand its benefits in making models more domain-specific. The guide provides practical demos using the AWS SDK for Python (Boto3) and offers crucial insights on data privacy, network security, billing, and provisioned throughput.

🔸 Building an Infinite Chat Memory GPT Voice Assistant in Python: Learn to build a customizable GPT voice assistant with OpenAI's cloud assistant feature. This guide explores the Assistants API, providing auto-vectorization and intelligent context handling for extensive chat recall. Enjoy advantages like enhanced security, limitless memory, local message history retrieval, and flexible interfaces. Gain the essential tools and skills for implementation, including an OpenAI API key, an ffmpeg installation, and the required Python packages.

🔸 Generating High-Quality Computer Vision Datasets: This guide outlines the process of building a customized and diverse computer vision dataset. It covers generating realistic image prompts with ChatGPT, utilizing a vision image generation model, automating object detection, and labeling. Learn to enhance dataset quality for improved computer vision projects through prompt customization and model utilization.

🔸 Understanding LSTM in NLP: A Python Guide: This guide explores employing Long Short-Term Memory (LSTM) layers for natural language processing in Python. It covers theoretical aspects, details the coding of the layer's forward pass, and includes a practical implementation with a dataset, enhancing understanding and application of LSTM in NLP through text data preprocessing and sentiment encoding.

🚀 HackHub: Trending AI Tools

🔸 neurocult/agency: Explore the capabilities of LLMs and generative AI with this library designed with a clean, effective, and Go-idiomatic approach.

🔸 lunyiliu/coachlm: Code and data for an automatic instruction revision method tailored for LLM instruction tuning; implement CoachLM and enhance the precision of LLM instruction tuning effortlessly.

🔸 03axdov/muskie: Python-based ML library streamlining the creation of custom datasets and model usage with minimal code requirements.

🔸 robocorp/llmstatemachine: Python library to unlock GPT-powered agents with ease, incorporating state machine logic and chat history memory for seamless development.
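As promised in the "How to Optimize LLMs on Modest Hardware" resource above, here is a minimal sketch of dynamic quantization in PyTorch. The toy model is our own illustrative choice; a real LLM would quantize its Linear layers the same way:

```python
# Dynamic quantization: weights of the listed module types are stored in int8
# and dequantized on the fly at inference time.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
print(quantized(x).shape)  # same interface as the original model, smaller weights
```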
Deploying LLMs with Amazon SageMaker - Part 2

Joshua Arvin Lat
30 Nov 2023
19 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

Introduction

In the first part of this post, we showed how easy it is to deploy large language models (LLMs) in the cloud using a managed machine learning service called Amazon SageMaker. In just a few steps, we were able to deploy a MistralLite model in a SageMaker Inference Endpoint. If you've worked on real ML-powered projects in the past, you probably know that deploying a model is just the first step! There are definitely a few more steps before we can consider that our application is ready for use.

If you're looking for the link to the first part, here it is: Deploying LLMs with Amazon SageMaker - Part 1

In this post, we'll build on top of what we already have in Part 1 and prepare a demo user interface for our chatbot application. That said, we will tackle the following sections in this post:

- Section I: Preparing the SageMaker Notebook Instance (discussed in Part 1)
- Section II: Deploying an LLM using the SageMaker Python SDK to a SageMaker Inference Endpoint (discussed in Part 1)
- Section III: Enabling Data Capture with SageMaker Model Monitor
- Section IV: Invoking the SageMaker inference endpoint using the boto3 client
- Section V: Preparing a Demo UI for our chatbot application
- Section VI: Cleaning Up

Without further ado, let's begin!

Section III: Enabling Data Capture with SageMaker Model Monitor

In order to analyze our deployed LLM, it's essential that we're able to collect the requests and responses to a central storage location. Instead of building our own solution that collects the information we need, we can just utilize the built-in Model Monitor capability of SageMaker. Here, all we need to do is prepare the configuration details and run the update_data_capture_config() method of the inference endpoint object and we'll have the data capture setup enabled right away! That being said, let's proceed with the steps required to enable and test data capture for our SageMaker Inference endpoint:

STEP # 01: Continuing where we left off in Part 1 of this post, let's get the bucket name of the default bucket used by our session:

```python
s3_bucket_name = sagemaker_session.default_bucket()
s3_bucket_name
```

STEP # 02: In addition to this, let's prepare and define a few prerequisites as well:

```python
prefix = "llm-deployment"
base = f"s3://{s3_bucket_name}/{prefix}"
s3_capture_upload_path = f"{base}/model-monitor"
```

STEP # 03: Next, let's define the data capture config:

```python
from sagemaker.model_monitor import DataCaptureConfig

data_capture_config = DataCaptureConfig(
    enable_capture=True,
    sampling_percentage=100,
    destination_s3_uri=s3_capture_upload_path,
    kms_key_id=None,
    capture_options=["REQUEST", "RESPONSE"],
    csv_content_types=["text/csv"],
    json_content_types=["application/json"]
)
```

Here, we specify that we'll be collecting 100% of the requests and responses that pass through the deployed model.

STEP # 04: Let's enable data capture so that we're able to save in Amazon S3 the request and response data:

```python
predictor.update_data_capture_config(
    data_capture_config=data_capture_config
)
```

Note that this step may take about 8-10 minutes to complete.
Feel free to grab a cup of coffee or tea while waiting!

STEP # 05: Let's check if we are able to capture the input request and output response by performing another sample request:

```python
result = predictor.predict(input_data)[0]["generated_text"]
print(result)
```

This should yield the following output:

"The meaning of life is a philosophical question that has been debated by thinkers and philosophers for centuries. There is no single answer that can be definitively proven, as the meaning of life is subjective and can vary greatly from person to person.\n\nSome people believe that the meaning of life is to find happiness and fulfillment through personal growth, relationships, and experiences. Others believe that the meaning of life is to serve a greater purpose, such as through a religious or spiritual calling, or by making a positive impact on the world through their work or actions.\n\nUltimately, the meaning of life is a personal journey that each individual must discover for themselves. It may involve exploring different beliefs and perspectives, seeking out new experiences, and reflecting on what brings joy and purpose to one's life."

Note that it may take a minute or two before the .jsonl file(s) containing the request and response data appear in our S3 bucket.

STEP # 06: Let's prepare a few more examples:

```python
prompt_examples = [
    "What is the meaning of life?",
    "What is the color of love?",
    "How to deploy LLMs using SageMaker",
    "When do we use Bedrock and when do we use SageMaker?"
]
```

STEP # 07: Let's also define the perform_request() function which wraps the relevant lines of code for performing a request to our deployed LLM model:

```python
def perform_request(prompt, predictor):
    input_data = {
        "inputs": f"<|prompter|>{prompt}</s><|assistant|>",
        "parameters": {
            "do_sample": False,
            "max_new_tokens": 2000,
            "return_full_text": False,
        }
    }

    response = predictor.predict(input_data)
    return response[0]["generated_text"]
```

STEP # 08: Let's quickly test the perform_request() function:

```python
perform_request(prompt_examples[0], predictor=predictor)
```

STEP # 09: With everything ready, let's use the perform_request() function to perform requests using the examples we've prepared in an earlier step:

```python
from time import sleep

for example in prompt_examples:
    print("Input:", example)

    generated = perform_request(
        prompt=example,
        predictor=predictor
    )
    print("Output:", generated)
    print("-"*20)
    sleep(1)
```

This should return the following:

Input: What is the meaning of life?
...
--------------------
Input: What is the color of love?
Output: The color of love is often associated with red, which is a vibrant and passionate color that is often used to represent love and romance. Red is a warm and intense color that can evoke strong emotions, making it a popular choice for representing love. However, the color of love is not limited to red. Other colors that are often associated with love include pink, which is a softer and more feminine shade of red, and white, which is often used to represent purity and innocence. Ultimately, the color of love is subjective and can vary depending on personal preferences and cultural associations. Some people may associate love with other colors, such as green, which is often used to represent growth and renewal, or blue, which is often used to represent trust and loyalty.
...

Note that this is just a portion of the overall output, and you should get a relatively long response for each input prompt.

Section IV: Invoking the SageMaker inference endpoint using the boto3 client

While it's convenient to use the SageMaker Python SDK to invoke our inference endpoint, it's best that we also know how to use boto3 to invoke our deployed model. This will allow us to invoke the inference endpoint from an AWS Lambda function using boto3.

Image 10 — Utilizing API Gateway and AWS Lambda to invoke the deployed LLM

This Lambda function would then be triggered by an event from an API Gateway resource, similar to what we have in Image 10. Note that we're not planning to complete the entire setup in this post, but having a working example of how to use boto3 to invoke the SageMaker inference endpoint should easily allow you to build an entire working serverless application utilizing API Gateway and AWS Lambda. (A minimal Lambda sketch appears at the end of this section.)

STEP # 01: Let's quickly check the endpoint name of the SageMaker inference endpoint:

```python
predictor.endpoint_name
```

This should return the endpoint name with a format similar to what we have below:

'MistralLite-HKGKFRXURT'

STEP # 02: Let's prepare our boto3 client using the following lines of code:

```python
import boto3
import json

boto3_client = boto3.client('runtime.sagemaker')
```

STEP # 03: Now, let's invoke the endpoint:

```python
body = json.dumps(input_data).encode()

response = boto3_client.invoke_endpoint(
    EndpointName=predictor.endpoint_name,
    ContentType='application/json',
    Body=body
)

result = json.loads(response['Body'].read().decode())
```

STEP # 04: Let's quickly inspect the result:

```python
result
```

This should give us the following:

[{'generated_text': "The meaning of life is a philosophical question that has been debated by thinkers and philosophers for centuries. There is no single answer that can be definitively proven, as the meaning of life is subjective and can vary greatly from person to person..."}]

STEP # 05: Let's try that again and print the output text:

```python
result[0]['generated_text']
```

This should yield the following output:

"The meaning of life is a philosophical question that has been debated by thinkers and philosophers for centuries..."

STEP # 06: Now, let's define perform_request_2(), which uses the boto3 client to invoke our deployed LLM:

```python
def perform_request_2(prompt, boto3_client, predictor):
    input_data = {
        "inputs": f"<|prompter|>{prompt}</s><|assistant|>",
        "parameters": {
            "do_sample": False,
            "max_new_tokens": 2000,
            "return_full_text": False,
        }
    }

    body = json.dumps(input_data).encode()
    response = boto3_client.invoke_endpoint(
        EndpointName=predictor.endpoint_name,
        ContentType='application/json',
        Body=body
    )

    result = json.loads(response['Body'].read().decode())
    return result[0]["generated_text"]
```

STEP # 07: Next, let's run the following block of code to have our deployed LLM answer the same set of questions using the perform_request_2() function:

```python
for example in prompt_examples:
    print("Input:", example)

    generated = perform_request_2(
        prompt=example,
        boto3_client=boto3_client,
        predictor=predictor
    )
    print("Output:", generated)
    print("-"*20)
    sleep(1)
```

This will give us the following output:

Input: What is the meaning of life?
...
--------------------
Input: What is the color of love?
Output: The color of love is often associated with red, which is a vibrant and passionate color that is often used to represent love and romance.
Red is a warm and intense color that can evoke strong emotions, making it a popular choice for representing love. However, the color of love is not limited to red. Other colors that are often associated with love include pink, which is a softer and more feminine shade of red, and white, which is often used to represent purity and innocence. Ultimately, the color of love is subjective and can vary depending on personal preferences and cultural associations. Some people may associate love with other colors, such as green, which is often used to represent growth and renewal, or blue, which is often used to represent trust and loyalty.
...

Given that it may take a few minutes before the .jsonl files appear in our S3 bucket, let's wait for about 3-5 minutes before proceeding to the next steps. Feel free to grab a cup of coffee or tea while waiting!

STEP # 08: Let's run the following block of code to list the captured data files stored in our S3 bucket:

```python
results = !aws s3 ls {s3_capture_upload_path} --recursive
results
```

STEP # 09: In addition to this, let's store the list inside the processed variable:

```python
processed = []

for result in results:
    partial = result.split()[-1]
    path = f"s3://{s3_bucket_name}/{partial}"
    processed.append(path)

processed
```

STEP # 10: Let's create a new directory named captured_data using the mkdir command:

```python
!mkdir -p captured_data
```

STEP # 11: Now, let's download the .jsonl files from the S3 bucket to the captured_data directory in our SageMaker Notebook Instance:

```python
for index, path in enumerate(processed):
    print(index, path)
    !aws s3 cp {path} captured_data/{index}.jsonl
```

STEP # 12: Let's define the load_json_file() function, which will help us load files with JSON content:

```python
import json

def load_json_file(path):
    output = []

    with open(path) as f:
        output = [json.loads(line) for line in f]

    return output
```

STEP # 13: Using the load_json_file() function we defined in an earlier step, let's load the .jsonl files and store them inside the all variable for easier viewing:

```python
all = []

for i, _ in enumerate(processed):
    print(f">: {i}")
    new_records = load_json_file(f"captured_data/{i}.jsonl")
    all = all + new_records

all
```

Running this will yield the following response:

Image 11 — All captured data points inside the all variable

Feel free to analyze the nested structure stored in the all variable. In case you're interested in how this captured data can be analyzed and processed further, you may check Chapter 8, Model Monitoring and Management Solutions, of my 2nd book, "Machine Learning Engineering on AWS".
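Tying this section together: the perform_request_2() pattern above drops almost unchanged into the AWS Lambda function described at the start of this section. Below is a minimal sketch; the ENDPOINT_NAME environment variable and the API Gateway-style event shape are assumptions for illustration, not part of the original setup:

```python
# Sketch of a Lambda handler fronting the SageMaker endpoint via API Gateway.
import json
import os

import boto3

sagemaker_client = boto3.client("runtime.sagemaker")

def lambda_handler(event, context):
    # With an API Gateway proxy integration, the request payload arrives
    # as a JSON string in event["body"] (assumed here).
    body = json.loads(event["body"])
    prompt = body["prompt"]

    input_data = {
        "inputs": f"<|prompter|>{prompt}</s><|assistant|>",
        "parameters": {"do_sample": False, "max_new_tokens": 2000,
                       "return_full_text": False},
    }

    response = sagemaker_client.invoke_endpoint(
        EndpointName=os.environ["ENDPOINT_NAME"],  # assumed env var
        ContentType="application/json",
        Body=json.dumps(input_data).encode(),
    )
    result = json.loads(response["Body"].read().decode())

    return {"statusCode": 200,
            "body": json.dumps({"generated_text": result[0]["generated_text"]})}
```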
Section V: Preparing a Demo UI for our chatbot application

Years ago, we had to spend a few hours to a few days before we were able to prepare a user interface for a working demo. If you have not used Gradio before, you would be surprised that it only takes a few lines of code to set everything up. In the next set of steps, we'll do just that and utilize the model we've deployed in the previous parts of our demo application:

STEP # 01: Continuing where we left off in the previous part, let's install a specific version of gradio using the following command:

```python
!pip install gradio==3.49.0
```

STEP # 02: We'll also be using a specific version of fastapi as well:

```python
!pip uninstall -y fastapi
!pip install fastapi==0.103.1
```

STEP # 03: Let's prepare a few examples and store them in a list:

```python
prompt_examples = [
    "What is the meaning of life?",
    "What is the color of love?",
    "How to deploy LLMs using SageMaker",
    "When do we use Bedrock and when do we use SageMaker?",
    "Try again",
    "Provide 10 alternatives",
    "Summarize the previous answer into at most 2 sentences"
]
```

STEP # 04: In addition to this, let's define the parameters using the following block of code:

```python
parameters = {
    "do_sample": False,
    "max_new_tokens": 2000,
}
```

STEP # 05: Next, define the process_and_respond() function, which we'll use to invoke the inference endpoint:

```python
def process_and_respond(message, chat_history):
    processed_chat_history = ""
    if len(chat_history) > 0:
        for chat in chat_history:
            processed_chat_history += f"<|prompter|>{chat[0]}</s><|assistant|>{chat[1]}</s>"

    prompt = f"{processed_chat_history}<|prompter|>{message}</s><|assistant|>"
    response = predictor.predict({"inputs": prompt, "parameters": parameters})
    parsed_response = response[0]["generated_text"][len(prompt):]
    chat_history.append((message, parsed_response))
    return "", chat_history
```

STEP # 06: Now, let's set up and prepare the user interface we'll use to interact with our chatbot:

```python
import gradio as gr

with gr.Blocks(theme=gr.themes.Monochrome(spacing_size="sm")) as demo:
    with gr.Row():
        with gr.Column():
            message = gr.Textbox(label="Chat Message Box",
                                 placeholder="Input message here",
                                 show_label=True,
                                 lines=12)
            submit = gr.Button("Submit")

            examples = gr.Examples(examples=prompt_examples,
                                   inputs=message)
        with gr.Column():
            chatbot = gr.Chatbot(height=900)

    submit.click(process_and_respond,
                 [message, chatbot],
                 [message, chatbot],
                 queue=False)
```

Here, we can see the power of Gradio, as we only needed a few lines of code to prepare a demo app.

STEP # 07: Now, let's launch our demo application using the launch() method:

```python
demo.launch(share=True, auth=("admin", "replacethis1234!"))
```

This will yield the following logs:

```
Running on local URL:  http://127.0.0.1:7860
Running on public URL: https://123456789012345.gradio.live
```

STEP # 08: Open the public URL in a new browser tab. This will load a login page, which will require us to input the username and password before we are able to access the chatbot.

Image 12 — Login page

Specify admin and replacethis1234! in the login form to proceed.

STEP # 09: After signing in using the credentials, we'll be able to access a chat interface similar to what we have in Image 13. Here, we can try out various types of prompts.

Image 13 — The chatbot interface

Here, we have a Chat Message Box where we can input and run our different prompts on the left side of the screen. We would then see the current conversation on the right side.

STEP # 10: Click the first example, "What is the meaning of life?".
This will auto-populate the text area, similar to what we have in Image 14:

Image 14 — Using one of the examples to populate the Chat Message Box

STEP # 11: Click the Submit button afterwards. After a few seconds, we should get the following response in the chat box:

Image 15 — Response of the deployed model

Amazing, right? Here, we just asked the AI what the meaning of life is.

STEP # 12: Click the last example, "Summarize the previous answer into at most 2 sentences". This will auto-populate the text area with the said example. Click the Submit button afterward.

Image 16 — Summarizing the previous answer into at most 2 sentences

Feel free to try other prompts. Note that we are not limited to the prompts available in the list of examples in the interface.

Important Note: Like other similar AI/ML solutions, there's the risk of hallucinations or the generation of misleading information. That said, it's critical that we exercise caution and validate the outputs produced by any Generative AI-powered system to ensure the accuracy of the results.

Section VI: Cleaning Up

We're not done yet! Cleaning up the resources we've created and launched is a very important step, as this will help us ensure that we don't pay for resources we're not planning to use.

STEP # 01: Once you're done trying out various types of prompts, feel free to turn off and clean up the resources launched and created using the following lines of code:

```python
demo.close()
predictor.delete_endpoint()
```

STEP # 02: Make sure to turn off (or delete) the SageMaker Notebook instance as well. I'll leave this to you as an exercise!

Wasn't that easy?! As you can see, deploying LLMs with Amazon SageMaker is straightforward and easy. Given that Amazon SageMaker handles most of the heavy lifting to manage the infrastructure, we're able to focus more on the deployment of our machine learning model. We are just scratching the surface, as there is a long list of capabilities and features available in SageMaker. If you want to take things to the next level, feel free to read two of my books focusing heavily on SageMaker: "Machine Learning with Amazon SageMaker Cookbook" and "Machine Learning Engineering on AWS".

Author Bio

Joshua Arvin Lat is the Chief Technology Officer (CTO) of NuWorks Interactive Labs, Inc. He previously served as the CTO of three Australian-owned companies and also served as Director for Software Development and Engineering for multiple e-commerce startups in the past. Years ago, he and his team won 1st place in a global cybersecurity competition with their published research paper. He is also an AWS Machine Learning Hero, and he has shared his knowledge at several international conferences, discussing practical strategies on machine learning, engineering, security, and management. He is also the author of the books "Machine Learning with Amazon SageMaker Cookbook", "Machine Learning Engineering on AWS", and "Building and Automating Penetration Testing Labs in the Cloud". Due to his proven track record in leading digital transformation within organizations, he has been recognized as one of the prestigious Orange Boomerang: Digital Leader of the Year 2023 award winners.
Deploying LLMs with Amazon SageMaker - Part 1

Joshua Arvin Lat
29 Nov 2023
13 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

Introduction

Have you ever tried asking a Generative AI-powered chatbot the question: "What is the meaning of life?" In case you have not tried that yet, here's the response I got when I tried it myself using a custom chatbot app I built with a managed machine learning (ML) service called Amazon SageMaker.

Image 01 — Asking a chatbot the meaning of life

You would be surprised that I built this quick demo application myself in just a few hours! In this post, I will teach you how to deploy your own Large Language Models (LLMs) in a SageMaker Inference Endpoint (that is, a machine learning-powered server that responds to inputs) with just a few lines of code.

Image 02 — Deploying an LLM to a SageMaker Inference Endpoint

While most tutorials available teach us how to utilize existing Application Programming Interfaces (APIs) to prepare chatbot applications, it's best that we also know how to deploy LLMs on our own servers in order to guarantee data privacy and compliance. In addition to this, we'll be able to manage the long-term costs of our AI-powered systems as well. One of the most powerful solutions available for these types of requirements is Amazon SageMaker, which helps us focus on the work we need to do instead of worrying about cloud infrastructure management.

We'll divide the hands-on portion into the following sections:

- Section I: Preparing the SageMaker Notebook Instance
- Section II: Deploying an LLM using the SageMaker Python SDK to a SageMaker Inference Endpoint
- Section III: Enabling Data Capture with SageMaker Model Monitor (discussed in Part 2)
- Section IV: Invoking the SageMaker inference endpoint using the boto3 client (discussed in Part 2)
- Section V: Preparing a Demo UI for our chatbot application (discussed in Part 2)
- Section VI: Cleaning Up (discussed in Part 2)

Without further ado, let's begin!

Section I: Preparing the SageMaker Notebook Instance

Let's start by creating a SageMaker Notebook instance. Note that while we can also do this in SageMaker Studio, running the example in a SageMaker Notebook Instance should do the trick. If this is your first time launching a SageMaker Notebook instance, you can think of it as your local machine with several tools pre-installed already, where we can run our scripts.

STEP # 01: Sign in to your AWS account and navigate to the SageMaker console by typing sagemaker in the search box, similar to what we have in the following image:

Image 03 — Navigating to the SageMaker console

Choose Amazon SageMaker from the list of options available, as highlighted in Image 03.

STEP # 02: In the sidebar, locate and click Notebook instances under Notebook:

Image 04 — Locating Notebook instances in the sidebar

STEP # 03: Next, locate and click the Create notebook instance button.

STEP # 04: On the Create notebook instance page, you'll be asked to input a few configuration parameters before we're able to launch the notebook instance where we'll be running our code:

Image 05 — Creating a new SageMaker Notebook instance

Specify a Notebook instance name (for example, llm-demo) and select a Notebook instance type.
For best results, you may select a relatively powerful instance type such as ml.m4.xlarge, where we will run the scripts. However, you may decide to choose a smaller instance type such as ml.t3.medium (slower but less expensive). Note that we will not be deploying our LLM inside this notebook instance, as the model will be deployed in a separate inference endpoint (which will require a more powerful instance type such as an ml.g5.2xlarge).

STEP # 05: Create an IAM role by choosing Create a new role from the list of options available in the IAM role dropdown (under Permissions and encryption).

Image 06 — Creating a new IAM role

This will open the following popup window. Given that we're just working on a demo application, the default security configuration should do the trick. Click the Create role button.

Important Note: Make sure to use a more secure configuration when dealing with production (or staging) work environments.

We won't dive deep into how cloud security works in this post, so feel free to look for other resources and references to further improve the current security setup. In case you are interested in learning more about cloud security, feel free to check my third book, "Building and Automating Penetration Testing Labs in the Cloud". In the 7th chapter of the book (Setting Up an IAM Privilege Escalation Lab), you'll learn how misconfigured machine learning environments on AWS can easily be exploited with the right sequence of steps.

STEP # 06: Click the Create notebook instance button. Wait for about 5-10 minutes for the SageMaker Notebook instance to be ready.

Important Note: Given that this will launch a resource that will run until you turn it off (or delete it), make sure to complete all the steps in the second part of this post and clean up the created resources accordingly.

STEP # 07: Once the instance is ready, click Open Jupyter, similar to what we have in Image 07:

Image 07 — Opening the Jupyter app

This will open the Jupyter application in a browser tab. If this is your first time using this application, do not worry, as detailed instructions will be provided in the succeeding steps to help you get familiar with this tool.

STEP # 08: Create a new notebook by clicking New and selecting conda_python3 from the list of options available:

Image 08 — Creating a new notebook using the conda_python3 kernel

In case you are wondering what a kernel is, it is simply an "engine" or "environment" with pre-installed libraries and prerequisites that executes the code specified in the notebook cells. You'll see this in action in a bit.

STEP # 09: At this point, we should see the following interface, where we can run various types of scripts and blocks of code:

Image 09 — New Jupyter notebook

Feel free to rename the Jupyter Notebook before proceeding to the next step. If you have not used a Jupyter Notebook before, you may run the following line of code by typing it in the text field and pressing SHIFT + ENTER:

print('hello')

This should print the output hello right below the text field where we placed our code.

Section II: Deploying an LLM using the SageMaker Python SDK to a SageMaker Inference Endpoint

STEP # 01: With everything ready, let's start by installing a specific version of the SageMaker Python SDK:

!pip install sagemaker==2.192.1

Here, we'll be using v2.192.1. This will help us ensure that you won't encounter breaking changes even if you work on the hands-on solutions in this post at a later date.
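As a quick optional sanity check (my own addition, not part of the original steps), you can print the installed version to confirm that the notebook kernel picked up the pinned release:

import sagemaker

# Should print 2.192.1 if the pinned install succeeded.
# If an older version shows up, restart the notebook kernel and re-run this cell.
print(sagemaker.__version__)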
In case you are wondering what the SageMaker Python SDK is, it is simply a software development kit (SDK) with a set of tools and APIs to help developers interact with and utilize the different features and capabilities of Amazon SageMaker.

STEP # 02: Next, let's import and prepare a few prerequisites by running the following block of code:

import sagemaker
import time

sagemaker_session = sagemaker.Session()
region = sagemaker_session.boto_region_name
role = sagemaker.get_execution_role()

STEP # 03: Let's import HuggingFaceModel and get_huggingface_llm_image_uri as well:

from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

STEP # 04: Next, let's define the generate_random_label() function, which we'll use later when naming our resources:

from string import ascii_uppercase
from random import choice

def generate_random_label():
    letters = ascii_uppercase
    return ''.join(choice(letters) for i in range(10))

This will help us avoid naming conflicts when creating and configuring our resources.

STEP # 05: Use the get_huggingface_llm_image_uri function we imported in an earlier step to retrieve the container image URI for our LLM. In addition to this, let's define the model_name we'll use later when deploying our LLM to a SageMaker endpoint:

image_uri = get_huggingface_llm_image_uri(
    backend="huggingface",
    region=region,
    version="1.1.0"
)

model_name = "MistralLite-" + generate_random_label()

STEP # 06: Before we proceed with the actual deployment, let's quickly inspect what we have in the image_uri variable:

image_uri

This will output the following variable value:

'763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorgi-inference:2.0.1-tgi1.1.0-gpu-py39-cu118-ubuntu20.04'

STEP # 07: Similarly, let's check the variable value of model_name:

model_name

This will give us the following:

'MistralLite-HKGKFRXURT'

Note that you'll get a different model_name value since we're randomly generating a portion of the model name.

STEP # 08: Let's prepare the hub model configuration as well:

hub_env = {
    'HF_MODEL_ID': 'amazon/MistralLite',   # the Hugging Face Hub model to load
    'HF_TASK': 'text-generation',
    'SM_NUM_GPUS': '1',
    "MAX_INPUT_LENGTH": '16000',           # maximum prompt length in tokens
    "MAX_TOTAL_TOKENS": '16384',           # prompt tokens + generated tokens
    "MAX_BATCH_PREFILL_TOKENS": '16384',
    "MAX_BATCH_TOTAL_TOKENS": '16384',
}

Here, we specify that we'll be using the MistralLite model. If this is your first time hearing about MistralLite, it is a fine-tuned Mistral-7B-v0.1 language model that performs significantly better on several long-context retrieval and answering tasks. For more information, feel free to check: https://huggingface.co/amazon/MistralLite.
STEP # 09: Let's initialize the HuggingFaceModel object using some of the prerequisites and variables we've prepared in the earlier steps:

model = HuggingFaceModel(
    name=model_name,
    env=hub_env,
    role=role,
    image_uri=image_uri
)

STEP # 10: Now, let's proceed with the deployment of the model using the deploy() method:

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    endpoint_name=model_name,
)

Here, we're using an ml.g5.2xlarge instance for our inference endpoint. Given that this step may take about 10-15 minutes to complete, feel free to grab a cup of coffee or tea while waiting!

Important Note: Given that this will launch a resource that will run until you turn it off (or delete it), make sure to complete all the steps in the second part of this post and clean up the created resources accordingly.

STEP # 11: Now, let's prepare our first input data:

question = "What is the meaning of life?"

input_data = {
    "inputs": f"<|prompter|>{question}</s><|assistant|>",
    "parameters": {
        "do_sample": False,
        "max_new_tokens": 2000,
        "return_full_text": False,
    }
}

STEP # 12: With the prerequisites ready, let's have our deployed LLM process the input data we prepared in the previous step:

result = predictor.predict(input_data)[0]["generated_text"]
print(result)

This should yield the following output:

The meaning of life is a philosophical question that has been debated by thinkers and philosophers for centuries. There is no single answer that can be definitively proven, as the meaning of life is subjective and can vary greatly from person to person. ...

Looks like our SageMaker Inference endpoint (where the LLM is deployed) is working just fine!
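If you plan to send several test questions while experimenting, it can be convenient to wrap steps 11 and 12 in a small helper. The following is a minimal sketch (the ask() function and its defaults are my own additions, not part of the original recipe); it assumes the predictor object from STEP # 10 is still available in the notebook:

def ask(question, max_new_tokens=2000):
    # Reuse the same MistralLite prompt template from STEP # 11.
    input_data = {
        "inputs": f"<|prompter|>{question}</s><|assistant|>",
        "parameters": {
            "do_sample": False,
            "max_new_tokens": max_new_tokens,
            "return_full_text": False,
        },
    }
    # predict() returns a list of dicts; the generated text is in the first element.
    return predictor.predict(input_data)[0]["generated_text"]

print(ask("What are Large Language Models?"))

One reminder before moving on: the endpoint keeps running (and accruing charges) until it is deleted, so if you are pausing here rather than continuing straight to Part 2, you can already tear everything down with predictor.delete_model() followed by predictor.delete_endpoint().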
Conclusion

That wraps up the first part of this post. At this point, you should have a good idea of how to deploy LLMs using Amazon SageMaker. However, there's more in store for us in the second part, as we'll build on top of what we have already and enable data capture to help us collect and analyze the data (that is, the input requests and output responses) that pass through the inference endpoint. In addition to this, we'll prepare a demo user interface utilizing the ML model we deployed in this post. If you're looking for the link to the second part, here it is: Deploying LLMs with Amazon SageMaker - Part 2.

We are just scratching the surface, as there is a long list of capabilities and features available in SageMaker. If you want to take things to the next level, feel free to read two of my books focusing heavily on SageMaker: "Machine Learning with Amazon SageMaker Cookbook" and "Machine Learning Engineering on AWS".

Author Bio

Joshua Arvin Lat is the Chief Technology Officer (CTO) of NuWorks Interactive Labs, Inc. He previously served as the CTO of three Australian-owned companies and also served as the Director for Software Development and Engineering for multiple e-commerce startups in the past. Years ago, he and his team won 1st place in a global cybersecurity competition with their published research paper. He is also an AWS Machine Learning Hero, and he has been sharing his knowledge at several international conferences, discussing practical strategies on machine learning, engineering, security, and management. He is also the author of the books "Machine Learning with Amazon SageMaker Cookbook", "Machine Learning Engineering on AWS", and "Building and Automating Penetration Testing Labs in the Cloud". Due to his proven track record in leading digital transformation within organizations, he has been recognized as one of the prestigious Orange Boomerang: Digital Leader of the Year 2023 award winners.

AI_Distilled #27: AI Breakthroughs & Open-Source Pioneers

Merlyn Shelley
24 Nov 2023
13 min read
👋 Hello,

Welcome to another AI_Distilled! This edition brings you key stories on AI, ML, NLP, Gen AI, and more. Our mission is to keep you informed, empowering your skill advancement. Before we embark on the need-to-know updates, let's take a moment to observe an important perspective from an industry leader.

"We're now seeing a major second wave…let's acknowledge that without open source, how would AI have made the tremendous progress it has over the last decade" - Jensen Huang, NVIDIA CEO

Amidst the uncertainty surrounding Sam Altman's removal and reinstatement at OpenAI, the open-source community emerges as a potential beneficiary. Also, as OpenAI pauses new signups for ChatGPT Plus, enterprises are anticipated to seek stability and long-term impact by turning to open-source AI models such as Llama, Mistral, Falcon, and MPT for their AI application development needs. Both proprietary and open-source models will play roles, but the latter's contributions are crucial for advancing AI technology's impact on work and life.

In this week's edition, we'll talk about Google DeepMind unveiling an advanced AI music generation model and experiments, Meta releasing Emu Video and Emu Edit, major breakthroughs in generative AI research, Microsoft Ignite 2023 bringing new AI expansions and product announcements, and Galileo's Hallucination Index identifying GPT-4 as the best LLM for different use cases.

We've also got your fresh dose of AI secret knowledge and tutorials, including how to implement emerging practices for society-centered AI, how to speed up and improve LLM output with skeleton-of-thought, getting started with Llama 2 in 5 steps, and how to build an AI assistant with real-time web access in 100 lines of code using Python and GPT-4. Also, don't forget to check our expert insights column, which covers interesting concepts of data architecture from the book 'Modern Data Architecture on AWS'. It's a must-read!

Stay curious and gear up for an intellectually enriching experience!

📥 Feedback on the Weekly Edition

Hey folks! After the stunning OpenAI DevDay, many of us were eager to embark on creating our custom GPT magic. But let's chat about the recent hiccups: the pause on ChatGPT-4 new sign-ups and the shift or reformation in OpenAI's leadership. It's got us all wondering about the future of our handy tools.

Quick question: Ever tried ChatGPT's Advanced Data Analysis? Now that it's temporarily on hold for new users, it's got us thinking, right? Share your take on these changes in the comments. Your thoughts count! We're turning the spotlight on you – some of the best insights will be featured in our next issue for our 38K-strong AI-focused community. Don't miss out on the chance to share your views! 🗨️✨ As a big thanks, get our bestselling "The Applied Artificial Intelligence Workshop" in PDF.

Let's make AI_Distilled even more awesome! 🚀 Jump on in! Share your thoughts and opinions here!

Writer's Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week's newsletter content!

Cheers,
Merlyn Shelley
Editor-in-Chief, Packt

⚡ TechWave: AI/GPT News & Analysis

🔳 Sam Altman Is Reinstated as OpenAI's Chief Executive: OpenAI reinstated CEO Sam Altman, reversing his ouster amid a board shake-up.
The revamped board, led by Bret Taylor, includes Lawrence Summers and Adam D'Angelo, with Microsoft's support. Negotiations involved concessions, including an independent investigation into Altman's leadership. Some outgoing members sought to curb Altman's power. Altman's removal sparked a campaign by allies and employees for his return. The board initially stood by its decision but ultimately reinstated Altman for a fresh start.

🔳 Google DeepMind Unveils Advanced AI Music Generation Model and Experiments: Google DeepMind introduces Lyria, an advanced AI music generation model, and collaborates with YouTube on two experiments, "Dream Track" and "Music AI tools," revolutionizing music creation. Lyria excels in maintaining musical continuity, while the experiments support artists and producers in crafting unique soundtracks and enhancing the creative process.

🔳 Meta Unveils Emu Video and Emu Edit: Advancements in Generative AI Research: Meta has unveiled two major advancements in generative AI: Emu Video, a text-to-video platform using diffusion models for high-quality content generation, and Emu Edit, an image editing tool for precise control. Human evaluations favor Emu Video over previous models, showcasing substantial progress in creative and effective generative AI tools.

🔳 Google's AI Search Feature Expands to 120+ Countries: Google's Search Generative Experience (SGE) has expanded to 120+ countries, offering generative AI summaries and language support for Spanish, Portuguese, Korean, and Indonesian. Users can ask follow-up questions and get interactive definitions. The update will initially roll out in the US before expanding globally, enhancing natural language interactions in search results.

🔳 Microsoft Ignite 2023 Brings New AI Expansions and Product Announcements: Microsoft's Ignite 2023 highlighted the company's deepened AI commitment, featuring Bing Chat's rebranding to Copilot, custom AI chips, and new AI tools like Copilot for Azure. Microsoft Teams will offer AI-driven home decoration and voice isolation. The company consolidated planning tools, introduced generative AI copyright protection, Windows AI Studio for local AI deployment, and Azure AI Speech for text-to-speech avatars. The event underscored Microsoft's emphasis on AI integration across its products and services.

🔳 Microsoft Emerges as Ultimate Winner in OpenAI Power Struggle: Microsoft emerged victorious in the OpenAI power struggle by hiring ousted CEO Sam Altman and key staff, including Greg Brockman, to lead a new advanced AI team. This strategic move solidifies Microsoft's dominance in the industry, positioning it as a major player in AI without acquiring OpenAI, valued at $86 billion. The recent turmoil at OpenAI has led to employee threats of quitting and joining Altman at Microsoft, potentially granting Microsoft access to significant AI talent.

🔳 Galileo's Hallucination Index Identifies GPT-4 as the Best LLM for Different Use Cases: San Francisco-based Galileo has introduced a Hallucination Index to aid users in selecting the most reliable Large Language Models (LLMs) for specific tasks. Evaluating various LLMs, including Meta's Llama series, the index found GPT-4 excelled, and OpenAI's models consistently performed well, supporting trustworthy GenAI applications.

🔳 Microsoft Releases Orca 2: Small Language Models That Outperform Larger Ones: Orca 2, comprising 7 billion and 13 billion parameter models, excels in intricate reasoning tasks, surpassing larger counterparts.
Developed by fine-tuning LLAMA 2 base models on tailored synthetic data, Orca 2 showcases advancements in smaller language model research, demonstrating adaptability across tasks like reasoning, grounding, and safety through post-training with carefully filtered synthetic data.

🔳 NVIDIA CEO Predicts Major Second Wave of AI: Jensen Huang predicts a significant AI surge, citing breakthroughs in language replicated in biology, manufacturing, and robotics, offering substantial opportunities for Europe. Praising France's AI leadership, he emphasizes the importance of region-specific AI systems reflecting cultural nuances and highlights the crucial role of data in regional AI growth.

🔮 Expert Insights from Packt Community

Modern Data Architecture on AWS - By Behram Irani

Challenges with on-premises data systems

As data grew exponentially, so did the on-premises systems. However, visible cracks started to appear in the legacy way of architecting data and analytics use cases. The hardware that was used to process, store, and consume data had to be procured up-front, and then installed and configured before it was ready for use. So, there was operational overhead and risks associated with procuring the hardware, provisioning it, installing software, and maintaining the system all the time. Also, to accommodate future data growth, people had to estimate additional capacity way in advance. The concept of hardware elasticity didn't exist.

The lack of elasticity in hardware meant that there were scalability risks associated with the systems in place, and these risks would surface whenever there was a sudden growth in the volume of data or when there was a market expansion for the business. Buying all this extra hardware up-front also meant that a huge capital expenditure investment had to be made for the hardware, with all the extra capacity lying unused from time to time. Also, software licenses had to be paid for, and those were expensive, adding to the overall IT costs. Even after buying all the hardware up-front, it was difficult to maintain the data platform's high performance all the time. As data volumes grew, latency started creeping in, which adversely affected the performance of certain critical systems.

As data grew into big data, the type of data produced was not just structured data; a lot of business use cases required semi-structured data, such as JSON files, and even unstructured data, such as images and PDF files. In subsequent chapters, we will go through some use cases that specify different types of data. As the sources of data grew, so did the number of ETL pipelines. Managing these pipelines became cumbersome. And on top of that, with so much data movement, data started to duplicate in multiple places, which made it difficult to create a single source of truth for the data. On the flip side, with so many data sources and data owners within an organization, data became siloed, which made it difficult to share across different LOBs in the organization.

This content is from the book "Modern Data Architecture on AWS" written by Behram Irani (Aug 2023). Start reading a free chapter or access the entire Packt digital library free for 7 days by signing up now. To learn more, click on the button below.

Read through Chapter 1 unlocked here...
🌟 Secret Knowledge: AI/LLM Resources

🤖 How to Use Amazon CodeWhisperer for Command Line: Amazon introduces Amazon CodeWhisperer for the command line, enhancing developer productivity with contextual CLI completions and AI-driven natural language-to-bash translation. The tool provides CLI completions and translates natural language instructions into executable shell code snippets, modernizing the command line experience for over thirty million engineers.

🤖 How to Implement Emerging Practices for Society-Centered AI: The post underscores the importance of AI professionals addressing societal implications, advocating for multidisciplinary collaboration. It stresses the significance of measuring AI's impact on society to enhance effectiveness and identify areas for improvement in developing systems that benefit the broader community.

🤖 How to Speed Up and Improve LLM Output with Skeleton-of-Thought: The article introduces the Skeleton-of-Thought (SoT) approach, aiming to enhance the efficiency of Language Models (LLMs) by reducing generation latency and improving answer quality. SoT guides LLMs to generate answer skeletons first, then completes them in parallel, potentially accelerating open-source and API-based models for various question categories.

🤖 Understanding SuperNIC to Enhance AI Efficiency: The BlueField-3 SuperNIC is pivotal in AI-driven innovation, boosting workload efficiency and networking speed in AI cloud computing. With a 1:1 GPU to SuperNIC ratio, it enhances productivity. Integrated with NVIDIA Spectrum-4, it provides adaptive routing, out-of-order packet handling, and optimized congestion control for superior outcomes in enterprise data centers.

🤖 Step-by-step Guide to the Evolution of LLMs: The post explores the 12-month evolution of Large Language Models (LLMs), from text completion to dynamic chatbots with code execution and knowledge access. It emphasizes the frequent release of new features, models, and techniques, notably the November 2022 launch of ChatGPT, accelerating user adoption and triggering an AI arms race, while questioning if such rapid advancements are bringing us closer to practical AI agents.

🔛 Masterclass: AI/LLM Tutorials

👉 How to Get Started with Llama 2 in 5 Steps: Llama 2, an open-source large language model, is now free for research and commercial use. This blog outlines a five-step guide, covering prerequisites, model setup, fine-tuning, inference, and additional resources for users interested in utilizing Llama 2.

👉 How to Integrate GPT-4 with Python and Java: A Developer's Guide: The article explores integrating GPT-4 with Python and Java, emphasizing Python's compatibility and flexibility. It provides examples, discusses challenges like rate limits, and encourages collaboration for harnessing GPT-4's transformative potential, highlighting the importance of patience and debugging skills.

👉 How to Build an AI Assistant with Real-Time Web Access in 100 Lines of Code Using Python and GPT-4: This article guides readers in creating a Python-based AI assistant with real-time web access using GPT-4 in just 100 lines of code. The process involves initializing clients with API keys, creating the assistant using the OpenAI and Tavily libraries, and implementing a function for retrieving real-time information from the web. The author offers a detailed step-by-step guide with code snippets.
👉 Step-by-step Guide to Building a Real-Time Recommendation Engine with Amazon MSK and Rockset: This tutorial demonstrates building a real-time product recommendation engine using Amazon Managed Streaming for Apache Kafka (Amazon MSK) and Rockset. The architecture allows instant, personalized recommendations critical for e-commerce, utilizing Amazon MSK for capturing high-velocity user data and AWS managed services for scalability in handling customer requests, API invocations, and data ingestion.

🚀 HackHub: Trending AI Tools

💮 protectai/ai-exploits: Collection of real-world AI/ML exploits for responsibly disclosed vulnerabilities, aiming to raise awareness of the number of vulnerable components in the AI/ML ecosystem.

💮 nlmatics/llmsherpa: Provides strategic APIs to accelerate LLM use cases, includes a LayoutPDFReader that provides layout information for PDF to text parsers, and is tested on a wide variety of PDFs.

💮 QwenLM/Qwen-Audio: Large audio language model proposed by Alibaba Cloud; developers can use it for speech editing, sound understanding and reasoning, music appreciation, and multi-turn dialogues in diverse audio-oriented scenarios.

💮 langchain-ai/opengpts: Open-source effort to create a similar experience to OpenAI's GPTs and Assistants API. It builds upon LangChain, LangServe, and LangSmith.

Readers' Feedback! 💬

💭 Anish says, "The growing number of subscribers is really exciting. I particularly appreciate the transformation of 2D images into 3D models from Adobe and going through 'Tackling Hallucinations in LLMs' by Bijit Ghosh. These kinds of practical contexts are truly my preference for the upcoming newsletters."

💭 Tony says, "Very informative, far-reaching, and extremely timely. On point. Just keep it up, keep your eye on news and knowledge, and keep cluing us all once a week please, Merlyn. You're doing a fine job."

Share your thoughts here! Your opinions matter—let's make this space a reflection of diverse perspectives.

Debugging and Improving Code with ChatGPT

Dan MacLean
23 Nov 2023
8 min read
This article is an excerpt from the book R Bioinformatics Cookbook - Second Edition, by Dan MacLean. Discover over 80 recipes for modeling and handling real-life biological data using modern libraries from the R ecosystem.

Introduction

Embrace the power of streamlined debugging and code refinement with ChatGPT's expertise. Unravel the possibilities of effortlessly troubleshooting errors, optimizing performance, and refining code structures. This article explores how ChatGPT, armed with its extensive programming knowledge, assists in identifying and rectifying coding errors, offering tailored solutions, and fostering a deeper understanding of code logic. Dive into a journey of code enhancement, where ChatGPT becomes your indispensable companion in navigating complexities and elevating your coding experience.

Debugging and improving code with ChatGPT

ChatGPT is capable of debugging code due to its extensive training on programming-related text. It has acquired knowledge of common programming patterns and errors, allowing it to understand code snippets, identify potential issues, and provide suggestions or improvements. It can identify minor errors by highlighting syntax issues such as missing brackets, incorrect function calls, and invalid variable assignments. ChatGPT helps with troubleshooting by asking clarifying questions to better understand the problem and guide developers toward the root cause, and it can aid in code optimization by suggesting efficiency improvements, identifying performance bottlenecks, and proposing more optimized implementations. Lastly, the debugging process with ChatGPT can foster learning and exploration, enabling developers to discover new concepts, programming techniques, or functionalities of specific libraries or packages. In this recipe, we'll walk through an example debugging session with ChatGPT and look at how it helps us to reflect better on our own code.

Getting ready

We just need an account at ChatGPT: https://chat.openai.com/.

How to do it…

We can debug some code as follows:

1. Set up some code that we expect to work and try it:

data <- data.frame(
  predictor = c(1, 2, 3, 4, NA),
  response = c("A", "B", "C", "D", "E")
)
model <- lm(response ~ predictor, data = data)

2. Send the broken code to ChatGPT:

Help me fix the following code:
data <- data.frame(
  predictor = c(1, 2, 3, 4, NA),
  response = c("A", "B", "C", "D", "E")
)
model <- lm(response ~ predictor, data = data)

3. Read and run the resulting response in R:

To fix the code, you need to handle the missing value (NA) in the predictor column. Here's the corrected code:

# Fixing the code
data <- data.frame(
  predictor = c(1, 2, 3, 4, NA),
  response = c("A", "B", "C", "D", "E")
)

# Replace NA in the predictor column with the mean value
data$predictor[is.na(data$predictor)] <- mean(data$predictor, na.rm = TRUE)

# Convert the response variable to a factor
data$response <- as.factor(data$response)

# Perform linear regression
model <- lm(response ~ predictor, data = data)
summary(model)

In the fixed code, I replaced the NA value in the predictor column with the mean value of the remaining numeric values using the mean() function with the na.rm = TRUE argument. Then, I converted the response variable to a factor using the as.factor() function.
Finally, the linear regression model is computed with the corrected data, and summary(model) displays the model summary.

4. Tell ChatGPT the code still doesn't work:

It still doesn't work - Im getting thie error:
Error in as.character.factor(x) : malformed factor
In addition: Warning message:
In Ops.factor(r, 2) : '^' not meaningful for factors

5. Read the response and run the code:

Apologies for the oversight. The error is occurring because the ^ operator is not applicable to factors in the formula used in the linear regression. To fix this, we can convert the factor to a numeric representation using numeric codes or dummy variables. Here's the updated code:

# Fixing the code
data <- data.frame(
  predictor = c(1, 2, 3, 4, NA),
  response = c("A", "B", "C", "D", "E")
)

# Replace NA in the predictor column with the mean value
data$predictor[is.na(data$predictor)] <- mean(data$predictor, na.rm = TRUE)

# Convert the response variable to a factor
data$response <- as.factor(data$response)

# Convert the factor to numeric codes
data$response <- as.numeric(data$response)

# Perform linear regression
model <- lm(response ~ predictor, data = data)
summary(model)

In the updated code, I've used as.numeric() to convert the response factor variable to numeric codes. This allows the regression model to treat it as a numerical response. Now, the lm() function should run without the error, and summary(model) will display the model summary.

6. Ask it to check whether the original code was appropriately analyzed:

Were the original data suitable for a linear model?

7. Read the response:

No, the original data provided in the code were not suitable for a linear model. Here's why:

1. The `predictor` variable was numeric, which is suitable for a linear model.
2. The `response` variable, however, was categorical with values "A", "B", "C", "D", and "E". Categorical variables cannot be directly used as the response variable in a linear regression model because they do not represent continuous numeric values.

To use a categorical variable as the response variable in a linear model, you typically need to encode it into dummy variables or use techniques like multinomial logistic regression. Dummy variable encoding represents each category as a separate binary variable (0 or 1). Once the categorical variable is properly encoded, it can be used as the response variable in a linear regression model.

Again, that's how the process goes, though the output can differ from user to user; the meaning should be the same.

How it works…

In step 1, we set up a data frame of some data as a test set to use in a linear model; we wish to examine the relationship between the two variables, so we assume that a linear model will work. When we run the code in R, we get this:

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
  NA/NaN/Inf in 'y'
In addition: Warning message:
In storage.mode(v) <- "double" : NAs introduced by coercion

This is a typically confounding R error message. We want help, so in step 2, we ask ChatGPT to fix the code.

Step 3 shows us ChatGPT's response, which suggests fixing the NA values that are in the predictor column. That seems reasonable, and, as it explains, ChatGPT gives us some code that imputes a new value from the mean of all the other values – again, a reasonable value to impute. When we run the code, it still doesn't work and we get a new error, so in step 4, we tell ChatGPT about it and ask it to fix the new errors.

In step 5, we see an apologetic language model attempt to correct the error.
It gives us a confusing reason for doing some strange text/number conversion along with fixed code. When we run this new code in the console, we get output like this:

## Call:
## lm(formula = response ~ predictor, data = data)
##
## Residuals:
##    1    2    3    4    5
## -0.5 -0.5 -0.5 -0.5  2.0
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   0.5000     1.5546   0.322    0.769
## predictor     1.0000     0.5774   1.732    0.182
##
## Residual standard error: 1.291 on 3 degrees of freedom
## Multiple R-squared: 0.5, Adjusted R-squared: 0.3333
## F-statistic: 3 on 1 and 3 DF, p-value: 0.1817

This looks a bit strange – the residuals are weird and the rest of the values look poor. We start to question whether this was the right thing to do in the first place.

In step 6, we ask ChatGPT whether the linear model was the right sort of analysis. It responds as in step 7, telling us quite clearly that it was not appropriate.

This recipe highlights that we can use ChatGPT to fix code that doesn't work, but it also shows that ChatGPT will not reason without prompting. Here, it let us pursue a piece of code that wasn't right for the task. As a language model, it can't know that, even though we believe it would be evident from the question setup. It didn't try to correct our flawed assumptions or logic. We still need to be responsible for the logic and applicability of our code.

Conclusion

In conclusion, ChatGPT emerges as a valuable ally in code debugging, offering insightful solutions and guiding developers toward efficient, error-free code. While it excels in identifying issues and suggesting fixes, it's crucial to recognize that ChatGPT operates within the boundaries of the provided information. This journey underscores the importance of critical thinking in code analysis, reminding us that while ChatGPT empowers us with solutions, the responsibility for logical correctness and code applicability ultimately rests on the developer. Leveraging ChatGPT's expertise alongside mindful coding practices paves the way for seamless debugging and continual improvement in software development endeavors.

Author Bio

Professor Dan MacLean has a Ph.D. in molecular biology from the University of Cambridge and gained postdoctoral experience in genomics and bioinformatics at Stanford University in California. Dan is now Head of Bioinformatics at the world-leading Sainsbury Laboratory in Norwich, UK, where he works on bioinformatics, genomics, and machine learning. He teaches undergraduates, post-graduates, and post-doctoral students in data science and computational biology. His research group has developed numerous new methods and software in R, Python, and other languages, with over 100,000 downloads combined.