How-To Tutorials - LLM

79 Articles

How we are Thinking About Generative AI

Packt
18 Jul 2024
10 min read
How we are Thinking About Generative AI for Developers and Tech Learning

Packt is a global tech publisher serving developers and tech professionals (TechPros). Over the last 20 years, we have published over 8,000 books and videos, gaining deep insights into the evolving challenges tech professionals face. Recently, the rapid emergence of generative AI (GenAI) technologies like Copilot, ChatGPT, and Gemini has transformed the tech landscape, affecting everyone from software developers to business strategists. The tech industry is at a critical inflection point in how technology is used, developed, and taught. At Packt, we are actively exploring generative AI's impact on the industry and on TechPros' daily work and learning. Here, we outline our thoughts on how GenAI reshapes professional activities and tech learning, and our strategic responses to it. We would love to hear your feedback on this document and your thoughts on the issues raised within it. Please send any comments to GenAI_feedback@packt.com.

The Impact of GenAI on TechPro Work
The rapid pace of advancement in generative AI makes it difficult to predict, but we believe, on balance, that it is a force for good in software development. A core Packt value that we share with our TechPro users is a belief in and commitment to the power of technology for progress. Our default setting is to get on board with change. GenAI is already changing the nature of many development jobs, but it will not mean the end of software development. We are fundamentally optimistic about the future for TechPros powered by GenAI. It will mean more, faster, better work. This is how we at Packt see these changes:

Increased Software Production
Humanity continuously evolves, adapts, and advances, maintaining a need for more sophisticated software solutions – whether those are built on traditional software platforms or on top of AI models themselves. GenAI is already transforming the economics of supply by making engineers more productive and enabling more engineering tasks. The demand for more, better software will remain, leading to an increase in the number of professionals building, designing, adapting, and managing software.

Shifts in Software Development
Much of what engineers spend time doing can be quite generic. GenAI is beginning to automate these middle-tier, routine activities, allowing developers to focus on higher-value, more creative tasks. This shift redistributes work in three dimensions from the center of the development stack. Work moves 'up the stack' into architecture, domain expertise, and design; 'down the stack' into complex algorithm development, infrastructure, and tooling; and outwards to the edges with specific integrations and implementations. To meet the increased demand for software, there will be significantly more designers and implementers at those development edges, with increasing business and domain focus and specialization. There will be a continuously hard-to-meet need for deep tech engineers building the tools and infrastructure that enable this automation to operate efficiently at scale and speed. This will be seen at the hardware and firmware level as well as in operating systems, cloud platforms, and the models and algorithms that modern software is built upon.

Increased Domain and Business Specialization
As GenAI moves tasks from generic operations upwards and outwards to more specialized domains, engineers will increasingly make decisions that require greater judgment and domain expertise. This will lead to a greater focus on domain experience and knowledge, and a higher value on business relationships. GenAI also democratizes the development and management of systems, making these processes accessible to more users and transforming many jobs from direct task execution to overseeing AI agents that perform the work. This evolution could significantly expand the roles involving aspects of software design or delivery.

Impact on TechPro Learning
GenAI integrates automation and problem solving, leading to profound change in how TechPros learn and solve problems. We see the core changes as follows:

Shift Toward Just-In-Time (JIT) Continuous Learning
Developers have always preferred to learn by doing—starting work and solving problems on the fly. GenAI makes this the only viable approach. The ROI of upfront just-in-case (JIC) learning, where developers research technologies that might be useful in future, declines when co-pilots can accelerate initial builds and troubleshoot during development. GenAI tools can escalate to rapid just-in-time (JIT) learning sprints to backfill knowledge gaps as they are discovered. They can also help engineers rapidly understand and work on existing complex and often undocumented code bases, again backfilling knowledge gaps JIT.

Entry-Level Learning Moves to Simulated Environments
The JIT learning-by-doing model also applies to students and juniors, but the study work they do will be "as good as real." Traditional, linear courseware will be replaced by personalized, hands-on projects in rich simulated environments. These environments provide shorter, contextual learning experiences that effectively bridge the gap between theory and practice, reducing the training load on increasingly busy senior developers.

Growth in Demand for Real-World Experience and Peer Interaction
As development increasingly moves up the stack and routine tasks are automated, there is a growing need for TechPros to understand specific real-world applications of systems and solutions. Highly specific, detailed, and objective case studies with high relevance to a particular problem area and technical solution will become increasingly valuable. Demand for discussion and interaction with experienced fellow professionals to share knowledge and insights will also grow. Such authentic content not only aids learning but also enhances the training of AI models.

Authoritative and Expert Insight Remains Key
Despite the shift towards more automated and JIT learning approaches, a thorough understanding of core concepts remains crucial. Books will continue to be one of the most powerful and authoritative ways for technology originators to share their foundational knowledge. This will remain the key long-term use case for tech books.

Continuing Need for Creator Trust and Authenticity
GenAI enables the rapid creation of written work. In the tech publishing domain, we estimate that up to around 50% of titles in certain categories on Amazon might already be AI-generated or derived. This AI content meets certain user needs, and this proliferation will continue across store platforms. We believe that human-generated work fulfils a different user need and that there will always be value in authentic creator insight and expertise. We continue to build direct relationships with tech professionals and authors to create and publish this content.

The Future is Uncertain
How this evolves is hard to know. The pace of change both in the technology and in the landscape around it has surfaced issues with reliability, compliance, cost, and memory/reasoning limitations. GenAI technology is moving extremely fast but has serious technical challenges. These issues will be resolved over time, but they limit the pace of actual deployment.

A Cautious Approach to Change
The case for changing existing systems, practices, and organizational models should be approached with caution. Enterprises have a high bar for adopting core systems, and the deployment phase will be long and require detailed work.

Uncertainty in Computing Platforms
It remains uncertain whether GenAI might evolve into the dominant general-purpose computing platform or how it will evolve past the current transformer architecture. It may become a ubiquitous implementation layer for all services over time; we do not know. However, we share the view that this is a pivotal phase for technology and for humanity.

A Mixed Economy of the Old and the New
We see a long phase of a mixed economy of old methods and new GenAI tools. There will be pockets of rapid adoption of GenAI tooling, as we see in coding co-pilots and in application areas such as customer service agents. However, with every deployment there will be a lot of "old style" engineering: problem solving, integrations, QA, optimization. The shifts to higher-level working will be gradual and not immediately noticeable.

Friction in Human Systems
Human systems inherently resist change. Individuals stick with working and learning systems with which they are comfortable. Teaching methods evolve slowly, and we see different generations working and learning in different ways. While a shift toward just-in-time learning is underway, structured, long-form learning will continue to play a crucial role.

Rapid Adoption Among Developers
The pace at which individual developers have adopted co-pilots and are using GenAI for problem solving is striking. We expect these trends of grassroots, individual adoption to continue and accelerate.

How Packt is Responding
The insights gained from talking with TechPros, combined with our thinking about the impact of GenAI on TechPro work and learning, have resulted in these strategic initiatives:

Shift to the Edges of the Development Stack in Publishing
We are pioneering new approaches to developing and publishing real-world practical case studies to answer the crucial questions: "What are people actually building with this right now?" and "How are they actually doing it?" We will increase our focus on publishing specific, definitive, deep technical books from the creators and builders of new technology to help TechPros broaden their skills across the development stack. We will continue to build the tech book canon in the era of GenAI.

License for LLM Training Responsibly
The uniquely high-quality content tech authors create has immense value for LLM training. We want to support the evolution of this technology while developing model training as a potentially valuable new channel for published content. We want authors to get fair value and the recognition they are due, and we will pursue all agreements with partners in a pragmatic but principled way.

Use GenAI to Enable a Step Change in Content Engineering and Derived Works
GenAI tools and automations can reduce the cost and effort of keeping a title up to date as technology evolves, and of creating a rich portfolio of derived works from the initial content. We call this BODE: Build Once, Deploy Everywhere. We are exploring exciting use cases to increase the value of the original work and its reach into new platforms, formats, languages, and versions.

Build Packt Models and Explore JIT
We have already delivered experimental AI agents fine-tuned on specific Packt titles. We are expanding this to topic, role, and whole-library models. We are exploring integration of the Packt corpus into co-pilots and tools to deliver workflow-embedded JIT knowledge and learning escalation.

Build Professional Memberships
Recognizing the increased value of live interactions in a post-GenAI world, we are committed to enabling TechPros to engage in high-quality, trustworthy interactions with peers working in similar roles and on similar projects.

Thoughts? Feedback?
Please send any comments to GenAI_feedback@packt.com.


Elevate Your LLM Mastery

Merlyn Shelley
18 Apr 2024
13 min read
Subscribe to our Data Pro newsletter for the latest insights. Don't miss out – sign up today!

👋 Hello,

🚀 Welcome to DataPro Newsletter #84! Dive into the dynamic world of data science and AI, where breakthroughs and trends shape our future.

🔍 Highlights: Google's Genie, Meta AI's Priority Sampling, DeepMind's Hawk and Griffin, CMU's OmniACT, Qualcomm's GPTVQ, Azure PyRIT, and Microsoft's ChunkAttention.

✨ Data Community Blogs: ML Workflow with Scikit-learn Pipelines, Text Embeddings, AI System Design, Mixture of Thought LLM Cascades, GNN with PyTorch Implementation, and the Vertex AI MLOps Platform.

🏭 Industry Updates: Anthropic's Claude 3 Sonnet in Amazon Bedrock, Anthropic's Claude 3 models in Vertex AI, Microsoft's Orca-Math, Table Meets LLM, and OpenAI and Elon Musk.

📚 New in Packt Library: "Building AI Applications with ChatGPT APIs" by Martin Yanev.

DataPro Newsletter is not just a publication; it's a comprehensive toolkit for anyone serious about mastering the ever-changing landscape of data and AI. Grab your copy and start transforming your data expertise today!

📥 Feedback on the Weekly Edition
Take our weekly survey and get a free PDF copy of our best-selling book, "Interactive Data Visualization with Python - Second Edition." We appreciate your input and hope you enjoy the book!

Share your Feedback!

Cheers,
Merlyn Shelley
Editor-in-Chief, Packt

Sign Up | Advertise | Archives

🔰 GitHub Finds: Any of These Repos in Your Toolbox?

🛠️ VAST-AI-Research/TripoSR: TripoSR, developed by Tripo AI and Stability AI, is an open-source model for fast 3D reconstruction from a single image. It outperforms others in speed and quality, generating 3D models in under 0.5 seconds on NVIDIA A100 GPUs.

🛠️ facebookresearch/ViewDiff: ViewDiff creates consistent, high-quality images of 3D objects in real-world settings from multiple angles.

🛠️ YubiaoYue/MedMamba: MedMamba, inspired by visual state space models, sets a new baseline for medical image classification, excelling across diverse datasets.

🛠️ BAAI-Agents/Cradle: The Cradle framework pioneers General Computer Control, enhancing agent capabilities for any task through reasoning and self-improvement.

📚 Expert Insights from Packt Community
Building AI Applications with ChatGPT APIs - By Martin Yanev

Setting Up the Code Bug Fixer Project
1. Open PyCharm: double-click the PyCharm icon on your desktop or search for it in your applications folder.
2. On the PyCharm welcome screen, click Create New Project or go to File | New Project.
3. Choose the directory where you want to save your project. You can either create a new directory or select an existing one.
4. Select the Python interpreter: choose the version of Python you want to use for your project.
5. Configure project settings: give your project the name CodeBugFixer and choose a project location.
6. Once you've configured all the settings, click Create to create your new PyCharm project.

After creating a new PyCharm project, the next step is to create the necessary files and folders for the CodeBugFixer project. First, create two new Python files, called app.py and config.py, in the root directory of the project. The app.py file is where the main code for the CodeBugFixer app will be written, and the config.py file will contain any sensitive information such as API keys and passwords. Next, create a new folder called templates in the root directory of the project. This folder will contain the HTML templates that the Flask app will render. Inside the templates folder, create a new file called index.html. This file will contain the HTML code for the home page of the CodeBugFixer app. The project structure should look like the following:

CodeBugFixer/
├── config.py
├── app.py
└── templates/
    └── index.html

By following these steps, you have created the necessary files and folders for your CodeBugFixer project. You can now start writing the code for your Flask app in the app.py file and the HTML code in the index.html file. Once you have the correct interpreter, you can open the terminal within PyCharm by going to View | Tool Windows | Terminal. Check your terminal and ensure that you can see the (venv) indicator to confirm that you are working within your virtual environment. This is an essential step to prevent conflicting package installations between projects and guarantee that you are using the correct set of dependencies. In the terminal window, you can install the necessary libraries as follows:

(venv)$ pip install flask
(venv)$ pip install openai

Finally, to establish the foundation for utilizing the ChatGPT API in your CodeBugFixer app, add the following code to config.py and app.py:

config.py:

API_KEY = <Your API Key>

app.py:

from flask import Flask, request, render_template
import openai
import config

app = Flask(__name__)

# API Token
openai.api_key = config.API_KEY

@app.route("/")
def index():
    return render_template("index.html")

if __name__ == "__main__":
    app.run()

The config.py file will securely hold your OpenAI API key. Make sure to replace <Your API Key> with the actual API key that you obtained from OpenAI.
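The excerpt stops before the actual bug-fixing endpoint, so here is a minimal sketch of how app.py might grow a /fix route that calls the ChatGPT API (using the same pre-1.0 openai client as the excerpt). The route name, form field, and prompt wording are our own assumptions, not code from the book:

# Hypothetical next step for app.py; not code from the book. Assumes
# index.html renders a form that POSTs a "code" field to /fix and can
# display a "result" variable.
from flask import Flask, request, render_template
import openai
import config

app = Flask(__name__)
openai.api_key = config.API_KEY

@app.route("/")
def index():
    return render_template("index.html")

@app.route("/fix", methods=["POST"])
def fix_code():
    buggy_code = request.form["code"]  # assumed form field name
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": "You fix bugs in code and briefly explain the changes."},
            {"role": "user", "content": buggy_code},
        ],
    )
    fixed = response["choices"][0]["message"]["content"]
    return render_template("index.html", result=fixed)

if __name__ == "__main__":
    app.run()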
Discover more insights from 'Building AI Applications with ChatGPT APIs' by Martin Yanev. Unlock access to the full book and a wealth of other titles with a 7-day free trial in the Packt Library. Start exploring today! Read Here!

Message from our Partners!

👉 Octane AI Insights Analyst: Explore how Octane AI is revolutionizing ecommerce. Over 3,000 Shopify merchants have harnessed AI Quiz Funnels and Insights, generating over $500 million in revenue. It's more than growth; it's understanding and engaging customers on a new level. Join the community and see the difference.

👉 Cognism: Transform your sales strategy with Cognism. Experience a 3x boost in connect rate, gain access to verified B2B contacts, and enjoy seamless integration with your CRM tools. Expand globally with our comprehensive data coverage. Streamline your outreach for better conversions.

👉 Freshdesk: Revolutionize your customer service with Freshworks Smart Suite's focus on analytics. Unlock actionable insights, anticipate needs, and streamline support through an AI-driven dashboard. Empower your team with the tools to excel in efficiency and personalization. Start with a free trial and transform your service today!

👉 Murf AI: Enhance your projects with Murf's AI-powered voices, offering a range of realistic options for any use case. From corporate presentations to entertainment, find the perfect voice in over 20 languages. With Murf Studio, seamlessly integrate voice with your videos, music, or images, bringing your creative vision to life. Start your free trial and experience the difference.

Thanks for reading Packt DataPro! Subscribe for free to receive new posts and support my work.

⚡ Tech Tidbits: Stay Wired to the Latest Industry Buzz!

AWS ML Made Easy
🌀 Anthropic's Claude 3 Sonnet foundation model is now available in Amazon Bedrock: Amazon announced a collaboration with Anthropic to accelerate the development of Claude foundation models, making them accessible to AWS customers. Recently, Claude 3 was introduced, offering three models with varying levels of intelligence, speed, and cost. Claude 3 Sonnet is now available in Amazon Bedrock, providing faster speeds, increased steerability, and image-to-text vision capabilities.

Mastering ML with Google
🌀 Announcing Anthropic's Claude 3 models in Google Cloud Vertex AI: Google Cloud is enhancing customer choice and innovation in Vertex AI with the addition of Anthropic's Claude 3, a new family of state-of-the-art AI models. These models, optimized for various enterprise applications, include the highly capable Claude 3 Opus, the balanced Claude 3 Sonnet, and the fast, compact Claude 3 Haiku. Customers can soon access all three models via API in Vertex AI Model Garden, starting with private preview access to Claude 3 Sonnet. The Claude 3 models offer improved reasoning, content creation, language fluency, and vision capabilities, enabling customers to focus on applications while benefiting from flexible scaling, cost optimization, and Google Cloud's security and compliance.

Microsoft Research Insights
🌀 Orca-Math: Demonstrating the potential of SLMs with model specialization. The study on Orca and Orca 2 demonstrated how improved training methods can enhance the reasoning abilities of smaller language models, bringing them closer to larger models. Orca-Math, a 7 billion parameter model, specializes in solving math problems and outperforms larger models in this area. The research highlights the value of smaller models in specialized tasks and the potential of continual learning. The dataset and training procedure are available for further research.

🌀 Table Meets LLM: Improving LLM understanding of structured data and exploring advanced prompting methods. This paper explores how large language models (LLMs) understand structured table data. It investigates effective prompts, inherent structured data detection, leveraging existing knowledge, and trade-offs among input designs for better understanding and utilization of table-based data in LLMs.

OpenAI Updates
🌀 OpenAI and Elon Musk: In a recent blog post, OpenAI shared its mission to ensure AGI benefits all of humanity, emphasizing the need for substantial resources. The post recounts disagreements with Elon Musk over funding and control, leading to his departure. OpenAI highlights its efforts to create widely available beneficial tools, such as GPT-4, and addresses ongoing legal disputes with Musk while reaffirming its commitment to its mission.

Email Forwarded? Join DataPro Here!

🔍 From Bits to BERT: Keeping Up with LLMs & GPTs

🧞 Google's Genie: Generative Interactive Environments. Genie introduces a new generative AI paradigm for creating interactive, playable environments from a single image prompt. It can generate virtual worlds from unseen images, including real-world photos or sketches. Trained on a large dataset of Internet videos without action labels, Genie learns fine-grained controls, identifying controllable parts of an observation and inferring consistent latent actions across different environments.

🌀 Meta AI's Priority Sampling: Revolutionizing Machine Learning with Deterministic Code Generation.
This research introduces Priority Sampling, a deterministic sampling technique for large language models that generates unique and confident code samples. It aims to improve code generation and optimization by providing a more structured and controllable exploration process, outperforming traditional sampling methods and enhancing model performance.

🌀 Google DeepMind Launches Hawk and Griffin: Efficient Language Models with Advanced Attention Mechanisms. This paper introduces Hawk, an RNN with gated linear recurrences, and Griffin, a hybrid model combining gated linear recurrences and local attention. Hawk outperforms Mamba on downstream tasks, while Griffin matches Llama-2's performance with significantly less training data. Both models are hardware-efficient, with Griffin showing exceptional scalability and the ability to extrapolate on long sequences. The study also details efficient distributed training for large-scale models.

🌀 CMU Unveils OmniACT: Groundbreaking AI Dataset for Measuring Program Execution Skills. OmniACT is a new dataset and benchmark designed to test whether virtual agents can automate computer tasks by creating executable scripts. Initial tests show a significant gap between agent and human performance, highlighting the challenge and encouraging advancements in multimodal AI models.

🌀 Qualcomm's GPTVQ: Speeding Up Large AI Networks with Vector Quantization. GPTVQ is a fast new method for post-training vector quantization of Large Language Models (LLMs), improving size vs. accuracy trade-offs. It uses column-wise quantization and updates with Hessian information, efficient codebook initialization, and further compression techniques. GPTVQ sets new standards in LLM quantization efficiency and latency, even on mobile CPUs.

🌀 Azure PyRIT: Elevating ML Engineers with Python's Generative AI Risk Tool. PyRIT, a Python Risk Identification Tool for generative AI, automates AI Red Teaming tasks to assess the security of Language Model (LLM) endpoints. It employs proactive methods, categorizes risks, and offers detailed metrics, enabling researchers to mitigate potential risks in LLM deployment effectively.

🌀 Microsoft Introduces ChunkAttention: Accelerating Self-Attention for LLMs! This research introduces ChunkAttention, a novel self-attention module for large language models (LLMs) that optimizes compute and memory operations by detecting shared prefixes in LLM requests. It breaks key/value tensors into chunks and uses a prefix tree to share them, speeding up the self-attention kernel by 3.2-4.8×.

✨ On the Radar: Catch Up on What's Fresh

🌀 Streamline Your Machine Learning Workflow with Scikit-learn Pipelines: This blog explores the benefits of using Scikit-learn pipelines for simplifying machine learning workflows. It covers how pipelines can streamline preprocessing, modeling, hyperparameter tuning, and workflow organization, making code more efficient and maintaining consistency in data preprocessing. (A minimal sketch of this pattern appears at the end of this issue.)

🌀 Do text embeddings perfectly encode text? The rapid advancement of generative AI has led to the widespread adoption of Retrieval Augmented Generation (RAG) systems, where AI retrieves relevant documents from a database to generate responses. This has given rise to vector databases, designed to store and search through embeddings, vector representations of documents. The paper "Text Embeddings Reveal as Much as Text" explores the security of embedding vectors, questioning whether they can be inverted back to text, posing challenges for privacy and information security.
🌀 End-to-End AI Use Case-Driven System Design: This blog explores the complexities of AI system performance beyond TOPS (tera operations per second), focusing on real AI use cases. It dives into optimizing an AI system for an infinite-zoom feature, emphasizing power efficiency through model and memory optimizations, dynamic power scaling, and specialized hardware accelerators.

🌀 Navigating Cost-Complexity: Mixture of Thought LLM Cascades Illuminate a Path to Efficient Large Language Model Deployment: This post discusses how to significantly reduce costs while maintaining accuracy when using Large Language Models (LLMs), crucial for various applications. It introduces a novel approach called Mixture of Thought (MoT) cascades, employing a blend of weaker and stronger LLMs, along with innovative prompting techniques and consistency measurements.

🌀 Structure and Relationships: Graph Neural Networks and a PyTorch Implementation. This article introduces Graph Neural Networks (GNNs), a powerful method for modeling spatial and graphical structures in data, such as molecular structures, social networks, and city designs. It covers the mathematical description of GNNs, including graph convolution networks (GCNs) and graph attention networks (GATs), and provides a regression example using the PyTorch library. The article aims to make GNNs more accessible by explaining their principles and demonstrating their potential applications.

🌀 Extensible and Customisable Vertex AI MLOps Platform: The article describes the development of an MLOps platform for scalable machine learning models on Vertex AI using Kubeflow pipelines. It aims to provide a modular, flexible, and integrated solution for building operationalized ML models, serving as an educational resource and foundation for teams. The platform addresses common challenges and emphasizes testing, configuration, and CI/CD orchestration.

See you next time!

Affiliate Disclosure: This newsletter contains affiliate links. If you buy through them, we may earn a small commission at no extra cost to you. This supports our work and helps us keep providing useful content. We only recommend products and services we think will benefit our readers. Thanks for your support!
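As flagged under the Scikit-learn pipelines item above, here is a minimal sketch of the pattern that post describes, with preprocessing, modeling, and hyperparameter tuning bundled into a single estimator. This is our own illustrative example, not code from the post:

# Illustrative pipeline: scaler + classifier as one estimator, with a
# grid search over the classifier's hyperparameters.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),  # preprocessing step
    ("clf", SVC()),               # modeling step
])

# Pipeline parameters are addressed as <step name>__<parameter>.
grid = GridSearchCV(pipe, {"clf__C": [0.1, 1, 10]}, cv=5)
grid.fit(X_train, y_train)
print(grid.best_params_, grid.score(X_test, y_test))

Because the scaler lives inside the pipeline, it is refit on each cross-validation split, which is exactly the consistency benefit the post highlights.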


LLMOps in Action

Mostafa Ibrahim
16 Apr 2024
6 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

Introduction
In an era dominated by the rise of artificial intelligence, the power and promise of Large Language Models (LLMs) stand distinct. These colossal architectures, designed to understand and generate human-like text, have revolutionized the realm of natural language processing. However, with great power comes great responsibility – the onus of managing, deploying, and refining these models in real-world scenarios. This article delves into the world of Large Language Model Operations (LLMOps), an emerging field that bridges the gap between the potential of LLMs and their practical application.

Background
The last decade has seen a significant evolution in language models, with models growing in size and capability. Starting with smaller models like Word2Vec and LSTMs, we've advanced to behemoths like GPT-3, BERT, and T5. That said, as these models grew in size and complexity, so did their operational challenges. Deploying, maintaining, and updating these models requires substantial computational resources, expertise, and effective management strategies.

MLOps vs LLMOps
If you've ventured into the realm of machine learning, you've undoubtedly come across the term MLOps. MLOps, or Machine Learning Operations, encapsulates best practices and methodologies for deploying and maintaining machine learning models throughout their lifecycle. It caters to the wide spectrum of models that fall under the machine learning umbrella. On the other hand, with the growth of vast and intricate language models, a more specialized operational domain has emerged: LLMOps. While both MLOps and LLMOps share foundational principles, the latter specifically zeros in on the challenges and nuances of deploying and managing large-scale language models. Given the colossal size, data-intensive nature, and unique architecture of these models, LLMOps brings to the fore bespoke strategies and solutions, fine-tuned to ensure the efficiency, efficacy, and sustainability of such linguistic powerhouses in real-world scenarios.

Core Concepts of LLMOps
Large Language Model Operations (LLMOps) focuses on the management, deployment, and optimization of large language models (LLMs). One of its foundational concepts is model deployment, emphasizing scalability to handle varied loads, reducing latency for real-time responses, and maintaining version control. As these LLMs demand significant computational resources, efficient resource management becomes pivotal. This includes the use of optimized hardware like GPUs and TPUs, effective memory optimization strategies, and techniques to manage computational costs. Continuous learning and updating, another core concept, revolve around fine-tuning models with new data, avoiding the pitfall of 'catastrophic forgetting', and effectively managing data streams for updates. In parallel, LLMOps emphasizes the importance of continuous monitoring for performance, bias, and fairness, and of iterative feedback loops for model improvement. To cater to the vastness of LLMs, model compression techniques like pruning, quantization, and knowledge distillation become crucial.

How LLMOps Works

Pre-training Model Development
Large Language Models typically start their journey through a process known as pre-training. This involves training the model on vast amounts of text data. The objective during this phase is to capture a broad understanding of language, learning from billions of sentences and paragraphs. This foundational knowledge helps the model grasp grammar, vocabulary, factual information, and even some level of reasoning. This massive-scale training is what makes these models "large" and gives them a broad understanding of language.

Optimization & Compression
Models trained to this extent are often so large that they become impractical for daily tasks. To make them more manageable without compromising much on performance, techniques like model pruning, quantization, and knowledge distillation are employed.

Model pruning: After training, pruning is typically the first optimization step. This begins with trimming model weights and may advance to more intensive methods like neuron or channel pruning.

Quantization: Following pruning, the model's weights, and potentially its activations, are streamlined. Though weight quantization is generally a post-training process, deeper reductions, such as very low-bit quantization, may call for quantization-aware training from the beginning.

Additional recommendations: optimizing the model specifically for the intended hardware can elevate its performance; before initiating training, selecting inherently efficient architectures with fewer parameters is beneficial; approaches that adopt parameter sharing or tensor factorization prove advantageous; and for those planning to train a new model or fine-tune an existing one with an emphasis on sparsity, starting with sparse training is a prudent approach.
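To make the pruning and quantization steps above concrete, here is a minimal PyTorch sketch of both techniques on a toy network. It illustrates the general APIs, not a production-scale recipe for an LLM:

# Toy illustration of the two compression steps described above;
# a real LLM would need far more careful, layer-aware recipes.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

# Pruning: zero out the 30% smallest-magnitude weights in each Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # bake the sparsity into the weights

# Dynamic quantization: store Linear weights as int8 for faster CPU inference.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

print(quantized(torch.randn(1, 512)).shape)  # torch.Size([1, 10])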
Deployment Infrastructure
After training and compressing our LLM, we use technologies like Docker and Kubernetes to deploy models scalably and consistently. This approach allows us to scale flexibly, using as many pods as needed. Concluding the deployment process, we implement edge deployment strategies. This positions our models nearer to the end devices, which is crucial for applications that demand real-time responses.

Continuous Monitoring & Feedback
The process starts with the active model in production. As it interacts with users and as language evolves, it can become less accurate, leading to the phase where the model becomes stale over time. To address this, feedback and interactions from users are captured, forming a vast range of new data. Using this data, adjustments are made, resulting in a new fine-tuned model. As user interactions continue and the language landscape shifts, the current model is replaced with the new one. This iterative cycle of deployment, feedback, refinement, and replacement ensures the model always stays relevant and effective.

Importance and Benefits of LLMOps
Much like the operational paradigms of AIOps and MLOps, LLMOps brings a wealth of benefits to the table when managing Large Language Models.

Maintenance
As LLMs are computationally intensive, LLMOps streamlines their deployment, ensuring they run smoothly and responsively in real-time applications. This involves optimizing infrastructure, managing resources effectively, and ensuring that models can handle a wide variety of queries without hiccups. Consider the significant investment of effort, time, and resources required to maintain Large Language Models like ChatGPT, especially given its vast user base.

Continuous Improvement
LLMOps emphasizes continuous learning, allowing LLMs to be updated with fresh data. This ensures that models remain relevant, accurate, and effective, adapting to the evolving nature of language and user needs. Building on the foundation of GPT-3, the newer GPT-4 model brings enhanced capabilities. Furthermore, while ChatGPT was previously trained on data up to 2021, it has now been updated to encompass information through 2022. It's important to recognize that constructing and sustaining large language models is an intricate endeavor, necessitating meticulous attention and planning.

Conclusion
The ascent of Large Language Models marks a transformative phase in the evolution of machine learning. But it's not just about building them; it's about harnessing their power efficiently, ethically, and sustainably. LLMOps emerges as the linchpin, ensuring that these models not only serve their purpose but also evolve with the ever-changing dynamics of language and user needs. As we continue to innovate, the principles of LLMOps will undoubtedly play a pivotal role in shaping the future of language models and their place in our digital world.

Author Bio
Mostafa Ibrahim is a dedicated software engineer based in London, where he works in the dynamic field of fintech. His professional journey is driven by a passion for cutting-edge technologies, particularly in the realms of machine learning and bioinformatics. When he's not immersed in coding or data analysis, Mostafa loves to travel.


Databricks' DBRX, Stability AI's Stable Code Instruct 3B, SambaNova's Samba CoE v0.2, FrugalGPT, Advanced RAG Patterns on Amazon SageMaker

Merlyn Shelley
02 Apr 2024
10 min read
Subscribe to our Data Pro newsletter for the latest insights. Don't miss out – sign up today!

👋 Hello,

Welcome to DataPro #87 – your gateway to the cutting edge of data science and machine learning! 🚀 Dive into this edition to explore:

⚙️ LLMs & GPTs Unleashed: Samba CoE v0.2, SambaNova's speedy AI models; efficient training of language models with OpenAI; AI21's revolutionary SSM-Transformer model, Jamba; Databricks' DBRX, the new open LLM benchmark; Stable Code Instruct 3B, Stability AI's latest offering; and HyperLLaVA, boosting multimodal language models.

✨ What's Fresh & Exciting: FrugalGPT, cutting LLM operating costs; building a reliable AI agent from scratch with OpenAI tool calling; fine-tuning instruct models over raw text data; and crafting an OpenAI-compatible API.

⚡ Industry Pulse: Deciphering advanced RAG patterns on Amazon SageMaker; AutoBNN for probabilistic time series forecasting; and learning from interaction with Microsoft Copilot (web).

📚 Packt's Latest Gem: "Principles of Data Science - Third Edition" by Sinan Ozdemir.

DataPro Newsletter is not just a publication; it's a comprehensive toolkit for anyone serious about mastering the ever-changing landscape of data and AI. Grab your copy and start transforming your data expertise today!

📥 Feedback on the Weekly Edition
Take our weekly survey and get a free PDF copy of our best-selling book, "Interactive Data Visualization with Python - Second Edition." We appreciate your input and hope you enjoy the book!

Share your Feedback!

Cheers,
Merlyn Shelley
Editor-in-Chief, Packt

Sign Up | Advertise | Archives

🔰 GitHub Finds: Any of These Repos in Your Toolbox?

🛠️ Zejun-Yang/AniPortrait: AniPortrait is a new framework for creating high-quality animations using audio input and a reference portrait image, with face reenactment capabilities.

🛠️ agiresearch/AIOS: AIOS embeds large language models into operating systems, enabling smarter resource allocation, context switching, and concurrent agent execution, advancing AGI.

🛠️ lichao-sun/Mora: Mora is a multi-agent framework for video generation, enhancing OpenAI's Sora capabilities through collaborative visual agents for diverse tasks.

🛠️ jasonppy/VoiceCraft: VoiceCraft is a high-performing neural codec language model for speech editing and zero-shot text-to-speech, excelling with diverse real-world data.

🛠️ dvlab-research/MiniGemini: Mini-Gemini enhances large language models from 2B to 34B parameters, integrating image understanding, reasoning, and generation, inspired by LLaVA.

🛠️ Picsart-AI-Research/StreamingT2V: StreamingT2V is a technique for creating long videos with rich motion dynamics, ensuring temporal consistency and high image quality.

📚 Expert Insights from Packt Community
"Principles of Data Science - Third Edition" by Sinan Ozdemir

The Five Steps of Data Science
A question I've gotten at least once a month for the past decade is: What's the difference between data science and data analytics? One could argue that there is no difference between the two; others will argue that there are hundreds of differences! I believe that, regardless of how many differences there are between the two terms, the following applies: data science follows a structured, step-by-step process that, when followed, preserves the integrity of the results and leads to a deeper understanding of the data and the environment the data comes from. As with any other scientific endeavor, this process must be adhered to, or else the analysis and the results are in danger of scrutiny. On a simpler level, following a strict process can make it much easier for any data scientist, hobbyist, or professional to obtain results faster than if they were exploring data with no clear vision. While these steps are a guiding lesson for amateur analysts, they also provide the foundation for all data scientists, even those in the highest levels of business and academia. Every data scientist recognizes the value of these steps and follows them in some way or another.

Overview of the five steps
The process of data science involves a series of steps that are essential for effectively extracting insights and knowledge from data:

1. Asking an interesting question: The first step in any data science project is to identify a question or challenge that you want to address with your analysis. This involves finding a topic that is relevant, important, and that can be addressed with data.
2. Obtaining the data: Once you have identified your question, the next step is to collect the data that you will need to answer it. This can involve sourcing data from a variety of sources, such as databases, online platforms, or through data scraping or data collection methods.
3. Exploring the data: After you have collected your data, the next step is to explore it and get a better understanding of its characteristics and patterns. This might involve examining summary statistics, visualizing the data, or applying statistical or machine learning (ML) techniques to identify trends or relationships.
4. Modeling the data: Once you have explored your data, the next step is to build models that can be used to make predictions or inform decision-making. This might involve applying ML algorithms, building statistical models, or using other techniques to find patterns in the data.
5. Communicating and visualizing the results: Finally, it's important to communicate your findings to others in a clear and effective way. This might involve creating reports, presentations, or visualizations that help to explain your results and their implications.

By following these five essential steps, you can effectively use data science to solve real-world problems and extract valuable insights from data. It's important to note that different data scientists may have different approaches to the data science process, and the steps outlined previously are just one way of organizing it. Some data scientists might group the steps differently or include additional steps such as feature engineering or model evaluation. Despite these differences, most data scientists agree that the steps listed previously are essential to the data science process. Whether they are organized in this specific way or not, these steps are all crucial for effectively using data to solve problems and extract valuable insights. Let's dive into these steps one by one.
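As a toy illustration of how steps 2 to 5 map onto code (our own example, not the book's), take the question from step 1: can flower measurements predict iris species?

# Illustrative only: steps 2-5 of the five-step process on a stand-in
# dataset; step 1 is the question itself.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Step 2: obtain the data.
data = load_iris(as_frame=True)
df = data.frame

# Step 3: explore it with summary statistics and class balance.
print(df.describe())
print(df["target"].value_counts())

# Step 4: model it with a simple classifier.
X_train, X_test, y_train, y_test = train_test_split(
    df[data.feature_names], df["target"], random_state=0
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Step 5: communicate a headline result.
print(f"Held-out accuracy: {model.score(X_test, y_test):.2%}")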
Discover more insights from "Principles of Data Science - Third Edition" by Sinan Ozdemir. Unlock access to the full book and a wealth of other titles with a 7-day free trial in the Packt Library. Start exploring today! Read Here!

⚡ Tech Tidbits: Stay Wired to the Latest Industry Buzz!

AWS ML Made Easy
🌀 Advanced RAG patterns on Amazon SageMaker: This post discusses how customers across various industries are utilizing large language models (LLMs) like Mixtral-8x7B Instruct to build generative AI applications such as QnA chatbots and search engines. It highlights the challenges and solutions in improving the accuracy and performance of these applications, focusing on Retrieval Augmented Generation (RAG) patterns implemented with LangChain.

Google Research
🌀 AutoBNN: Probabilistic time series forecasting with compositional Bayesian neural networks. This research introduces AutoBNN, an open-source package for automated, interpretable time series forecasting using Bayesian neural networks (BNNs). It addresses limitations of traditional methods like Gaussian processes (GPs) and structural time series by combining the interpretability of GPs with the scalability and flexibility of neural networks. AutoBNN automates model discovery, provides high-quality uncertainty estimates, and scales effectively for large datasets.

Microsoft Research
🌀 Learning from interaction with Microsoft Copilot (web): This research focuses on how AI systems like Bing and Microsoft Copilot learn and improve from user interactions, particularly through reinforcement learning from human feedback (RLHF). It also explores how Bing has evolved its search capabilities and how Copilot is making user interactions more conversational and workflow-oriented. The research introduces frameworks like TnT-LLM and SPUR to improve taxonomy generation and user satisfaction estimation in AI interactions.

Email Forwarded? Join DataPro Here!

🔍 From Bits to BERT: Keeping Up with LLMs & GPTs

🌀 Samba CoE v0.2 from SambaNova delivers accurate AI models at blazing speeds: This blog post highlights SambaNova's advancements in AI architecture, specifically the introduction of Samba-1, a CoE architecture for enterprise AI. It discusses the features and benefits of Samba-1, its performance benchmarks, and plans for future releases, emphasizing the role of RDUs in driving efficiency and speed in AI models.

🌀 OpenAI's Efficient Training of Language Models to Fill in the Middle: OpenAI demonstrates that autoregressive language models can effectively learn to infill text by moving a span of text from the middle of a document to its end, without harming generative capability. They propose training models with this method by default and provide benchmarks and best practices.

🌀 Jamba: AI21's Groundbreaking SSM-Transformer Model. Jamba is a groundbreaking model that merges Mamba SSM with Transformer elements, offering a 256K context window and outperforming similar models. Released under Apache 2.0, it will be available in the NVIDIA API catalog. Jamba optimizes memory, throughput, and performance, delivering remarkable efficiency.

🌀 Databricks' DBRX: A New State-of-the-Art Open LLM. Databricks introduces DBRX, an open LLM setting new benchmarks in language understanding, programming, and math. It outperforms GPT-3.5 and competes with Gemini 1.0 Pro. DBRX is 40% smaller than Grok-1, offering 2x faster inference than LLaMA2-70B.

🌀 Introducing Stable Code Instruct 3B — Stability AI: Stable Code Instruct 3B, built on Stable Code 3B, offers state-of-the-art performance in code completion and natural language interactions for programming tasks. It outperforms Codellama 7B Instruct and matches StarChat 15B, with a focus on popular languages like Python and Java. Available for commercial use with a Stability AI Membership, the model is accessible on Hugging Face.

🌀 HyperLLaVA: Enhancing Multimodal Language Models with Dynamic Visual and Language Experts.
This blog explores the advancements in Multimodal Large Language Models (MLLMs) and introduces HyperLLaVA, a dynamic model that improves performance by adaptively tuning parameters to handle diverse multimodal tasks, surpassing existing benchmarks and opening new avenues for multimodal learning systems.

✨ On the Radar: Catch Up on What's Fresh

🌀 FrugalGPT and Reducing LLM Operating Costs: The blog discusses the high cost of running Large Language Models (LLMs) and introduces the "FrugalGPT" framework, which reduces operating costs significantly while maintaining quality. It explains how different models cost different amounts and proposes using a cascade of LLMs to minimize costs while maximizing answer quality. (A rough sketch of the cascade idea appears at the end of this issue.)

🌀 Leverage OpenAI Tool Calling: Building a Reliable AI Agent from Scratch. The blog discusses the future role of AI in everyday tasks, focusing on text creation, correction, and brainstorming. It highlights the importance of Retrieval-Augmented Generation (RAG) pipelines and aims to provide large language models with better context to generate more valuable content.

🌀 Fine-tune an Instruct model over raw text data: The blog explores the challenges of integrating modern chatbots with large datasets, focusing on context window sizes and the use of Retrieval-Augmented Generation (RAG) techniques. It proposes a lighter approach to fine-tuning chatbots on smaller datasets, aiming to bridge the gap between the constraints of a 128K context window and the complexities of models fine-tuned on billions of tokens. The experiment involves fine-tuning a model on The Guardian's dataset and aims to provide reproducible instructions for cost-effective model training using accessible hardware.

🌀 How to build an OpenAI-compatible API: The blog discusses the dominance of OpenAI in the Gen AI market and the reasons developers might choose alternative LLM providers. It explores implementing a Python FastAPI server compatible with the OpenAI API specs to wrap any LLM, aiming for flexibility and cost-effectiveness.

See you next time!
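As promised under the FrugalGPT item above, here is a rough sketch of the cascade idea: query a cheap model first and escalate only while a quality scorer is unconvinced. The model list and scorer below are illustrative placeholders, not the paper's actual components:

# Hypothetical LLM cascade in the spirit of FrugalGPT: cheapest model first,
# escalating while an answer-quality scorer stays below threshold.
# Assumes openai.api_key is configured; the scorer is a naive stand-in.
import openai

MODELS = ["gpt-3.5-turbo", "gpt-4"]  # ordered cheap to expensive (illustrative)

def score(answer: str) -> float:
    # Placeholder for the paper's learned quality scorer.
    return 0.9 if len(answer.strip()) > 20 else 0.0

def cascade(question: str, threshold: float = 0.8) -> str:
    answer = ""
    for model in MODELS:
        reply = openai.ChatCompletion.create(
            model=model,
            messages=[{"role": "user", "content": question}],
        )
        answer = reply["choices"][0]["message"]["content"]
        if score(answer) >= threshold:
            break  # good enough; skip the more expensive models
    return answer

print(cascade("Why is the sky blue?"))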


AI_Distilled #39: Unpacking Mistral Large, Google's Gemini Challenges, and Copilot Enterprise

Kartikey Pandey
21 Mar 2024
9 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

Print to Pixel: Optimize your learning experience with Packt
Several research studies have shown that printed books enhance comprehension, with the tactile experience of flipping pages and annotating the margins adding depth to the learning experience. However, developers can't overlook the practical benefits of eBooks, such as quickly finding relevant information or carrying an entire library on a single device. Acknowledging the unique benefits of both formats, Packt is offering a 40% discount on all print books, plus a free eBook version of each purchase, from February 26th to February 29th. Here's what's included:

A Vast Library: Enjoy 40% off on over 5,000 titles spanning topics from cybersecurity to generative AI.
Complimentary eBook: Each print book purchase includes a free eBook.
AI Assistant: The top 500 books come with a personalized AI that can simplify complex topics to suit your learning style, offering an interactive learning experience.

Start Building Your Tech Library Today!

👋 Hello,

"No AI is perfect, especially at this emerging stage of the industry's development, but we know the bar is high for us and we will keep at it for however long it takes."
- Sundar Pichai, Google CEO

Pichai acknowledges problems with Gemini AI, stressing the importance of unbiased information for users, and outlines steps to address issues and improve products. In a rapidly progressing industry, AI development is a tricky game to master, with numerous pitfalls along the way.

Greetings readers! Our mission is to help you stay on top of the ever-changing AI landscape so you can advance your skills. Let's get started with the latest news and developments across the AI field:

Microsoft provides the new LLM Mistral Large on Azure with Mistral AI
Google accepts that some responses from its Gemini model were unacceptable and biased
GitHub has launched the Copilot Enterprise coding assistant, integrated throughout the software development process
Researchers developed MobileLLM, new optimized language models for mobile devices with under a billion parameters
Researchers at Microsoft have developed new techniques to improve visual language models

We've also got your fresh dose of GPT and LLM secret knowledge and tutorials:

Mastering the Art of Prompt Crafting
Breaking Down How Large Language Models Learn
Using AI to Level Up Live Games
Monitoring Large Language Models on AWS

Last but not least, don't miss out on the hands-on strategies and tips straight from the AI community for you to use on your own projects:

Fine-Tuning Models for Speech Recognition Made Simple
Make Conversation Come Alive - Deploying Your Own AI Chat Partner
Combining Geospatial and Semantic Data to Build Powerful Search Tools
Leveraging Notion, Supabase and AI for Knowledge Retrieval

Writer's Credit: Special shout-out to Vidhu Jain for her valuable contribution to this week's issue.

Cheers,
Kartikey Pandey
Editor-in-Chief, Packt

Unleash Your Data Potential with Packt's Latest Titles and Platform Enhancements!
In a world that's always changing, learning is key to success. At Packt, we've updated our learning platform to help you stay ahead in the fast-moving tech world. Our platform makes learning easier and more effective, helping you overcome challenges and achieve your goals.

Boost Your Data Skills with Packt's DataPro Library:

On-Demand Learning: Access a wide range of books, video courses, research papers, and articles to help you grow.
AI Assistance: Get help from AI to understand complex concepts easily, all within the same learning environment.
Personalized Dashboard: Enjoy a tailored learning experience with recommendations and insights just for you.
Advanced Self-Assessment: Use the latest tools to identify what you need to learn and track your progress accurately.
Vibrant Community: Join a community of data and AI enthusiasts on Discord for collaboration and knowledge sharing.
Exclusive Access: Be part of the DataPro beta program for a chance to win Amazon gift cards and get early access to new features.
Value for Money: Get all these benefits for just $7.99 per month, a small investment for big gains in your career.

Enhance Your Data Skills Today

⚡ TechWave: AI/GPT News & Analysis

Microsoft has partnered with Mistral AI to provide their new LLM Mistral Large on Azure cloud services. This state-of-the-art AI model offers advanced NLP capabilities. Several companies have praised Mistral Large's performance in increasing productivity and aiding innovation.

Google's CEO recently said some responses from their AI model Gemini were unacceptable and biased. The company has been working to address these issues and sees improvements, but will review what happened. They plan to relaunch Gemini in the coming weeks after fixing it.

GitHub has launched Copilot Enterprise, an AI coding assistant that integrates throughout the software development process. It provides customized code suggestions based on an organization's codebase, answers questions about internal systems, and generates summaries of code changes. Early testing found massive productivity gains from such AI tools.

Researchers have developed new optimized language models for mobile devices with under a billion parameters. Called MobileLLM, the models achieve higher accuracy than previous smaller models through innovative architecture and weight-sharing techniques. MobileLLM shows significant gains on conversation tasks and competes with much larger models for common on-device uses.

Researchers at Microsoft have developed new techniques to improve visual language models using structured knowledge graphs. By incorporating relationship maps between image elements like objects and attributes, models can generate richer images from text descriptions. Hierarchical prompting and dual-path encoding methods were also introduced to help models better understand complex language.

🌟 Secret Knowledge: AI/LLM Resources

🌀 Mastering the Art of Prompt Crafting: Got a new NLP project that needs prompting? This guide covers the basics of effective prompt engineering for AI models like ChatGPT. Learn how clarity, conciseness, and context can improve responses. Also explore techniques like zero-shot learning and dynamic few-shot examples, plus how temperature, top-p, and other settings can refine your model's "personality". From system messages to tailoring examples, these tips will help you leverage your LLMs' full potential.
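To ground those settings, here is a small example showing a system message, a one-shot example, and the temperature and top_p knobs in a ChatGPT API call (the older openai<1.0 Python client). The prompts and values are our own, chosen for illustration:

# Illustration of common prompt-engineering knobs; model, messages, and
# sampling values are arbitrary examples. Assumes openai.api_key is set.
import openai

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        # System message fixes the assistant's role and tone.
        {"role": "system", "content": "You are a terse release-notes writer."},
        # One-shot example demonstrating the desired output style.
        {"role": "user", "content": "Commit: fix null check in parser"},
        {"role": "assistant", "content": "- Parser: guard against null input"},
        # The actual request.
        {"role": "user", "content": "Commit: add retry logic to HTTP client"},
    ],
    temperature=0.2,  # lower = more deterministic phrasing
    top_p=0.9,        # nucleus sampling cutoff
)
print(response["choices"][0]["message"]["content"])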
The article then demonstrates the process in code, manually computing the loss for an LLM and matching the framework's automatic calculation. This gives developers valuable insights into how state-of-the-art models learn.

🌀 Using AI to Level Up Live Games: This article discusses how generative AI can enhance live service games. Techniques like adaptive gameplay, personalized ads, and faster asset creation are described. The authors provide a framework for developing games using tools like Unity, GKE, and Vertex AI. They demonstrate how ML models can dynamically generate images, code, and dialogue to customize the player experience. Whether deploying models on GKE or Vertex, cloud-based AI brings the benefits of lower costs and easier maintenance than self-hosted options.

🌀 Monitoring Large Language Models on AWS: As AI language models grow more advanced, ensuring they behave properly becomes more important. This article discusses techniques for monitoring LLMs deployed on AWS. Key metrics covered include semantic similarity of responses, sentiment analysis, refusal rates, and more. The proposed architecture takes in model outputs, runs metrics modules, and reports results to CloudWatch for aggregation and alerts. With the right monitoring in place, you can help keep your conversational AI acting as intended.

🔛 Masterclass: AI/LLM Tutorials

🌀 Fine-Tuning Models for Speech Recognition Made Simple: This article discusses how to fine-tune LLMs for automatic speech recognition tasks using Amazon SageMaker. It explains language models and ASR, as well as the basic steps for fine-tuning a pre-trained model, which include preparing data, choosing a model, training, evaluating, and deploying. SageMaker is highlighted as a powerful yet easy-to-use platform for this process due to its scalability, integration with AWS services, and pay-as-you-go pricing.

🌀 Make Conversation Come Alive - Deploying Your Own AI Chat Partner: Tired of boring chatbots? This guide shows you how to bring the amazing Qwen AI model to your own server so you can have engaging discussions on any topic. The steps cover setting up your environment, installing dependencies, initializing the tokenizer and model, and using history to keep conversations flowing naturally. Once complete, you'll have a powerful AI assistant right at your fingertips. Best of all, it's completely open source.

🌀 Combining Geospatial and Semantic Data to Build Powerful Search Tools: This guide shows developers how to create an interactive campground search map using vector databases, NLP models, and geospatial data. Technologies like Qdrant, Llama2, and Streamlit allow embedding text and locations to enable semantic queries. The page explains setting up Qdrant cloud, loading campground CSV data, and parsing text into nodes. Developers can then embed nodes with HuggingFace and query the vector store to retrieve similar results. By leveraging tools that understand both spatial and semantic context, you can build customized applications to help users explore outdoor destinations.

🌀 Leveraging Notion, Supabase, and AI for Knowledge Retrieval: This tutorial shows how you can build a knowledge base by extracting data from Notion databases and storing it in a vector format in Supabase. It then demonstrates retrieving relevant information from the knowledge base using an AI model from OpenAI. By combining these tools, developers can query custom datasets and generate responses based on retrieved documents.
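Under the hood, the retrieval step of such a pipeline boils down to embedding a query and ranking stored vectors by similarity. A minimal sketch with NumPy (our generic illustration, independent of the Notion and Supabase specifics):

import numpy as np

# Toy "knowledge base": each row is the embedding of one document chunk.
kb = np.random.rand(100, 384)
query = np.random.rand(384)

# Cosine similarity of the query against every stored chunk.
scores = kb @ query / (np.linalg.norm(kb, axis=1) * np.linalg.norm(query))
top_k = np.argsort(scores)[::-1][:5]   # indices of the 5 most similar chunks
print(top_k)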
The process involves loading Notion documents, storing embeddings in Supabase, and setting up a retrieval pipeline. With some enhancements, this could be a powerful way to access organizational information.

🚀 HackHub: Trending AI Tools

🌀 lucky-lance/expert_sparsity: Implements efficient expert pruning and dynamic skipping techniques for mixture-of-experts large language models to improve their efficiency and speed while maintaining strong performance.
🌀 facebookresearch/pearl: This open-source library provides a modular reinforcement learning framework for building and training production-ready AI agents, empowering developers with state-of-the-art techniques.
🌀 zhen-tan-dmml/llm4annotation: Curates papers on using LLMs for data annotation, which developers could reference to apply these techniques or learn about the current state of the art.
🌀 google/gemma.cpp: Provides a lightweight C++ library for running Google's Gemma models that developers can easily integrate into their own projects for experimenting with and deploying LLMs.

AI_Distilled 38: Latest in AI: Sora, Gemini 1.5, and More

Merlyn Shelley
01 Mar 2024
9 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

👋 Hello,

"People say AI is overhyped, but I think it's not hyped enough. The next generation who will use this in the next few years will have a much higher bar on what technology can do for them. So how you build it for that generation, how you build it for that future will be really interesting to see."
-Puneet Chandok, Microsoft India and South Asia president

Speaking at a panel discussion on AI at the Mumbai Tech Week, Chandok argued that AI is not hyped enough considering its potential for disruptive transformation. He encourages more training on AI to realize its full potential.

Welcome back to a new issue of AI Distilled - your one-stop destination for all things AI, ML, NLP, and Gen AI. Let's get started with the latest news and developments across the AI sector:

OpenAI unveils Sora, an AI model generating videos from text
Google's latest conversational AI model Gemini 1.5 has a million-token context window
New AI news reader app tackles clickbait headlines, provides summaries
Slack is rolling out new AI features for enterprise users, including thread summaries
LangChain announced raising $25 million to launch a new platform for building LLM apps
AI helps improve medical imaging to benefit patients globally
Researchers develop AI model that determines a person's sex from brain scans

We've also curated the latest GPT and LLM resources, tutorials, and secret knowledge:
Giving AI Models a Better Memory: How Google DeepMind Expanded Context Windows
Advanced Techniques For More Relevant AI Responses
Reinforcement Learning Explained
Bridging the Gap Between AI and App Development

Finally, don't forget to check out our hands-on tips and strategies from the AI community for you to use on your own projects:
Creating Custom Models Without the Hassle of Data Collection
Code Your Own AI Coding Buddy
Evaluating Code Quality with AI Assistants
Easily Deploy Language Models Locally

Looking for some inspiration? Here are some GitHub repositories to get your projects going!
gptscript-ai/gptscript
karpathy/minbpe
AAAI-DISIM-UnivAQ/DALI
QwenLM/Qwen

Writer's Credit: Special shout-out to Vidhu Jain for her valuable contribution to this week's issue.

Cheers,
Kartikey Pandey
Editor-in-Chief, Packt

⚡ TechWave: AI/GPT News & Analysis

OpenAI unveiled Sora, an AI model that generates videos from text at up to a minute in length. Sora demonstrates an understanding of language and the physical world, with photorealism across styles, though human subjects can appear game-like.

Google's latest conversational AI model Gemini 1.5 analyzes more information than before, thanks to a million-token context window. This allows for summarizing the Apollo 11 mission transcript or analyzing a 44-minute silent film in full. Early results show the system maintains performance as context grows into the millions.

Bulletin, a new AI-powered news reader app, tackles clickbait headlines and provides summaries of news articles with customizable news sources.

Slack is rolling out new AI features for enterprise users, including thread summaries, channel recaps, and answering workplace questions. The tools provide highlights from missed messages and help users catch up.

LangChain announced raising $25 million to launch their new platform LangSmith for building and monitoring LLM apps. LangSmith allows developers to accelerate workflows across development, testing, deployment, and monitoring.
It has already seen significant adoption, with over 70,000 signups and 5,000 monthly active companies.

(Image courtesy of Bulletin/Shihab Mehboob)

AI is helping improve medical imaging to benefit patients globally. ML can quickly analyze large datasets to find issues doctors may miss and flag urgent cases. Cloud solutions also enable sharing scans and remote expert assistance anywhere. Companies are applying these methods to speed diagnoses, reduce wait times, and bring ultrasounds directly to homes.

Researchers have also developed an AI model that can determine a person's sex from brain scans with over 90% accuracy. The model analyzed dynamic MRI scans and identified the default mode, striatum, and limbic networks as key in distinguishing male and female brains. This breakthrough furthers our understanding of brain organization and could help address sex-specific health issues.

🔮 Expert Insights from Packt Community

Generative AI with LangChain - By Dr. Ben Auffarth

ChatGPT and the GPT models by OpenAI have brought about a revolution not only in how we write and research but also in how we can process information.

This book discusses the functioning, capabilities, and limitations of LLMs underlying chat systems, including ChatGPT and Bard. It also demonstrates, in a series of practical examples, how to use the LangChain framework to build production-ready and responsive LLM applications for tasks ranging from customer support to software development assistance and data analysis.

Key Takeaways:
Explore the expansive utility of LLMs in real-world applications.
Guidance on fine-tuning, prompt engineering, and best practices.
Learn how to use the LangChain framework to build production-ready LLM applications.

By the end of this book, you'll be equipped with the practical knowledge and skills to leverage the transformative power of generative AI with confidence and creativity.

Read More

🌟 Secret Knowledge: AI/LLM Resources

🌀 Giving AI Models a Better Memory: How Google DeepMind Expanded Context Windows: Google DeepMind's latest AI model Gemini 1.5 has significantly improved how much information it can process at once, thanks to advances in "long context windows." The team discovered their model could understand over 1 million pieces of information in a single sitting, far surpassing earlier limits. This opens up new possibilities for tasks like summarizing lengthy documents, analyzing large codebases, and even comprehending full movies. Developers are excited to explore creative uses of this expanded recall.

🌀 Advanced Techniques For More Relevant AI Responses: This article discusses how to improve AI conversation models like RAG by enhancing how information is stored, found, and used. Methods covered include indexing sentences individually while keeping their surrounding context, combining keyword search with semantic search, and re-scoring results based on the question. The author demonstrates implementing these "advanced RAG" techniques in Python using tools like LlamaIndex and Weaviate. With these optimizations, AI systems can provide more helpful responses by accessing knowledge in a targeted manner.

🌀 Reinforcement Learning Explained: This article breaks down the key concepts of reinforcement learning in an easy-to-understand way. It covers states, actions, rewards, and how agents interact with environments to learn policies. RL agents try different strategies to maximize long-term rewards through trial and error. Episodes provide a framework to evaluate policies.
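As a toy illustration of those pieces (our own sketch, not code from the article), here is an agent interacting with a trivial environment for one episode:

import random

# A trivial environment: the state is a number, actions are +1/-1,
# and the agent is rewarded for reaching 5.
def step(state, action):
    next_state = state + action
    reward = 1.0 if next_state == 5 else 0.0
    done = next_state == 5 or abs(next_state) > 10
    return next_state, reward, done

def run_episode(policy):
    state, total_reward, done = 0, 0.0, False
    while not done:
        action = policy(state)
        state, reward, done = step(state, action)
        total_reward += reward
    return total_reward

# A stochastic policy samples actions at random; a deterministic one
# would always map the same state to the same action.
def stochastic_policy(state):
    return random.choice([-1, 1])

print(run_episode(stochastic_policy))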
The article also contrasts deterministic policies, which always pick a set action for a given state, with stochastic policies, which sample actions from probabilities. Whether you're new to RL or a veteran, this primer is worth a read to get acquainted with the basics.

🌀 Bridging the Gap Between AI and App Development: As AI becomes more advanced, developers need easier ways to integrate cutting-edge features into their work. However, directly using AI code frameworks can be challenging and limit scalability. The solution? AI gateways. By handling tasks like routing, caching, and monitoring behind the scenes, gateways act as a bridge between complex AI systems and traditional development workflows. They streamline the integration process while ensuring high performance. Are gateways the future of intelligent applications?

Partnering with Notion

Ever tried Notion? It's a workspace that helps you do things better and faster. You get AI for notes and teamwork, easy drag-and-drop for content, and cool new features to help manage projects and share knowledge. Give it a try!

🔛 Masterclass: AI/LLM Tutorials

🌀 Creating Custom Models Without the Hassle of Data Collection: Tired of spending big bucks to use proprietary AI APIs or going through the tedious process of collecting your own training data? This page shows how you can train customized models more efficiently. By using an open-source LLM to generate synthetic annotations for a small sample of your data, you can then fine-tune a smaller model tailored exactly to your needs. The process takes just a few steps and allows you to analyze large datasets for a fraction of the cost. Best of all, you avoid sending sensitive data to third parties.

🌀 Code Your Own AI Coding Buddy: This guide shows you how to build an AI assistant that lives right on your computer. Using tools like HuggingFace and Streamlit, you can create a chatbot trained on Code Llama. Simply ask it questions and it will respond with examples in languages like Python, Java, and C++. Better yet, the models are free and open source. This is a neural net sidekick to help automate repetitive tasks and speed up your workflow.

🌀 Evaluating Code Quality with AI Assistants: This article explores using AI to improve code quality by testing Python scripts with SonarQube and getting feedback from LLMs. The author ran tests on ChatGPT and open-source models like Code Llama to see if they could identify issues flagged by SonarQube. While the models struggled to pinpoint errors solely from descriptions, some provided insightful summaries. Continued development of coding-focused LLMs may help automate part of the review process.

🌀 Easily Deploy Language Models Locally: With a simple four-step process, you can get powerful language models like ChatGPT running on your own hardware. First, choose a model from HuggingFace and quantize it for faster performance. Then build an Ollama image to serve the model. For a slick interface, deploy a ChatGPT-style React app talking to Ollama via Docker. The whole setup only takes around 15 minutes.
Now you've got a custom language assistant without internet dependence.

🚀 HackHub: Trending AI Tools

🌀 gptscript-ai/gptscript: Open-source NLP tool that allows developers to automate tasks by writing scripts in plain English.
🌀 karpathy/minbpe: Minimal and clean Python code for the byte pair encoding algorithm commonly used in NLP and language model tokenization (a toy sketch of the core merge step follows this list).
🌀 AAAI-DISIM-UnivAQ/DALI: Framework allowing developers to build multi-agent systems in Prolog for applications like robotics, event processing, and more.
🌀 QwenLM/Qwen: Open-source code, models, and documentation for the Qwen series of LLMs, including Qwen, Qwen-Chat, and their various sizes.
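For readers curious what byte pair encoding actually does, here is a minimal sketch of one training step (our own illustration of the general algorithm, not minbpe's actual code):

from collections import Counter

def most_frequent_pair(ids):
    # Count adjacent token pairs and return the most common one.
    return Counter(zip(ids, ids[1:])).most_common(1)[0][0]

def merge(ids, pair, new_id):
    # Replace every occurrence of `pair` in `ids` with `new_id`.
    out, i = [], 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

# One training step on a toy byte sequence:
ids = list(b"aaabdaaabac")
pair = most_frequent_pair(ids)   # (97, 97), i.e. "aa"
ids = merge(ids, pair, 256)      # mint a new token id beyond the byte range

Repeating this loop until a target vocabulary size is reached is the core of BPE training.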

Building an LLM-powered App using Snowflake and Streamlit

Ryan Goodman
30 Jan 2024
11 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

Introduction

For years, self-service analytics apps have enabled both information consumers (business users) and information workers (analysts) to meet their need for data assets that aid analysis and problem-solving. These data assets can include ready-made insights and analysis in the form of statistics, visual stories, or formatted data for further discovery. Historically, for an enterprise to embark on creating analytics apps, it required a specialized skillset, technology tools, and a steep learning curve to deliver value.

Three significant trends have shifted how we view analytics apps today:

● No-code and low-code data acquisition, along with cloud data/warehouse platforms, have helped democratize the data platform.
● Data platforms like Snowflake are designed to bring analytics computing into a single platform where data no longer needs to be copied and moved.
● The democratization of machine learning and the widespread availability of powerful generative AI models have changed the entire user experience and expectations for information discovery and natural language exploration.

The result of these trends has been accelerated technology cycles and a rate of innovation that is unprecedented. Prudent technology and business leaders are strained with more requests and fewer resources to use data to build information-focused businesses.

Currently, we have AI app and analytics waves breaking at the same time, with different use cases in mind but the same objective. For this article, we wanted to explore the basics of building a simple analytics app inside of Snowflake, allowing an OpenAI interface to execute code without ever accessing any of the resulting data.

Modern Data Cloud and Analytics Technology Tools

Let us explore the process and benefits of building an LLM-powered application using a cloud-based data warehousing platform like Snowflake and an open-source Python library for creating web applications like Streamlit. Ref: https://www.snowflake.com/blog/building-python-data-apps-streamlit/

Understanding Snowflake Data Warehousing

Snowflake is a leading cloud data platform offering secure and scalable solutions for processing and storing data. Its architecture integrates easily with programming languages, making it well suited to data-intensive applications. To work with Snowflake, you must create a Snowflake account and set up a database for data storage.

LLM-Powered Inputs and Translation

Every large language model, including GPT-4, is capable of understanding and generating human-like text based on the prompts and inputs it receives. These models are trained on vast datasets, enabling them to comprehend large and complex language patterns and generate contextually relevant responses. An incredible aspect of large language models, particularly GPT-4, is their ability to effectively translate natural language into code, including SQL and Python.

Large language models are not designed for computational procedures like statistics and analytics, but with the right prompting and, most importantly, context, you can streamline many common tasks.

Integration of Snowflake with Python, Streamlit, and Snowpark

In data analysis and machine learning (ML), Python is the most versatile programming language. Snowflake offers a Python connector that enables seamless communication between Snowflake databases and Python scripts.
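For orientation, connecting and running a query from Python looks roughly like this (a minimal sketch assuming the snowflake-connector-python package and placeholder credentials):

import snowflake.connector

# Placeholder credentials - replace with your own account details.
conn = snowflake.connector.connect(
    user="YOUR_USER",
    password="YOUR_PASSWORD",
    account="YOUR_ACCOUNT_IDENTIFIER",
    warehouse="COMPUTE_WH",
    database="DEMO_DB",
    schema="PUBLIC",
)

cur = conn.cursor()
try:
    cur.execute("SELECT CURRENT_VERSION()")
    print(cur.fetchone())
finally:
    cur.close()
    conn.close()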
In this article, we are not using Snowpark.

Storyboarding our App

The difference between a good app and a great app lies in the value you create for your user. The secret to building a great app is empowering users to solve problems that would otherwise be painful or impossible due to a lack of skills. The app we are building here demonstrates how to fit technology components together.

Minimum Viable Product Storyboard:
● End user: Analytics app developer
● Intent: Demonstrate core tech components
● Outcome: Have a working, end-to-end example app
● Value: Quickly understand a functional code example without having to research

We will build a native Streamlit app inside of Snowflake:
● The app will feature a chat interface powered by ChatGPT.
● The chat history will be written to a Snowflake table.
● The GPT model will read the results of a simple query, interpret the results, and summarize them in plain English.

Bringing Technology Components Together

For this article, we decided to build a simple end-to-end demonstration of how a native Snowflake app built with Python and Streamlit can utilize a chatbot interface that uses ChatGPT-4 to generate SQL code that can be executed natively in Snowflake with the context of the schema.

Snowflake Integration of ChatGPT Large Language Model API

To receive responses with the help of a large language model, leverage the OpenAI Documentation and Playground. Obtain the OpenAI GPT key, and then use the following code to interact with the model.

-- Step 1 - Create a secret for the OpenAI key
CREATE OR REPLACE SECRET open_ai_api_key
  TYPE = GENERIC_STRING
  SECRET_STRING = '<OPEN_AI_KEY>';

-- Step 2 - Create a network rule in Snowflake
CREATE OR REPLACE NETWORK RULE openai_network_rule
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = ('api.openai.com');

-- Step 3 - Create an EXTERNAL ACCESS INTEGRATION in Snowflake
CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION external_access_int
  ALLOWED_NETWORK_RULES = (openai_network_rule)
  ALLOWED_AUTHENTICATION_SECRETS = (open_ai_api_key)
  ENABLED = true;

-- Step 4 - Create a UDF using the openai package. Here we use the "gpt-3.5-turbo" model
CREATE OR REPLACE FUNCTION CHATGPTv1(query varchar)
RETURNS STRING
LANGUAGE PYTHON
RUNTIME_VERSION = 3.9
HANDLER = 'runner'
EXTERNAL_ACCESS_INTEGRATIONS = (external_access_int)
SECRETS = ('openai_key' = open_ai_api_key)
PACKAGES = ('openai')
AS
$$
import _snowflake
import openai

def runner(QUERY):
    openai.api_key = _snowflake.get_generic_secret_string('openai_key')
    messages = [{"role": "user", "content": QUERY}]
    model = "gpt-3.5-turbo"
    response = openai.ChatCompletion.create(model=model, messages=messages, temperature=0)
    return response.choices[0].message["content"]
$$;

-- Test your UDF
SELECT CHATGPTv1('Hi');

Creation of Streamlit User Experience Interface

To create the Streamlit user experience, the following code was used to build a very basic functional prototype with GPT-3.5 Turbo.

1. Installation:

pip install streamlit
2. Creation:

import streamlit as st  # required for the st.* calls below
from snowflake.snowpark.context import get_active_session

st.set_page_config(layout="wide")
st.title("OPEN AI IN SIS - GPT-3.5-turbo(MODEL)")
st.write("##")
st.write("##")

# Get the current credentials
session = get_active_session()

if 'request_response' not in st.session_state:
    st.session_state['request_response'] = {}

if st.session_state['request_response']:
    for itr in st.session_state['request_response'].keys():
        request_col, request_col1 = st.columns(2)
        response_col1, response_col = st.columns(2)
        with request_col:
            st.write(f":bust_in_silhouette:  :blue[{itr}]")
        st.write("##")
        with response_col:
            st.write(f":speech_balloon:  :red[{st.session_state['request_response'][itr][0]}]")

col1, col2 = st.columns(2)
with col1:
    search_text = st.text_input("Send a message")
    search_button = st.button("Send")

if search_text and search_button:
    search_result = session.sql(f"SELECT CHATGPTv1('{search_text}')").collect()
    if search_result:
        st.session_state['request_response'][search_text] = [search_result[0][0]]
        st.experimental_rerun()

3. Run:

streamlit run app.py

Moving from MVP to Real-World Application

Real-world analytics apps are designed with a narrow scope, outcome, and value in mind. Let's expand on the same technology components and formulate a real-world use case that will be more impactful to an enterprise. When evaluating real-world business cases to apply Streamlit and OpenAI, focus on use cases that deliver value frequently, to many (or important) people in your organization, and are tied to high-impact business processes.

Data Tape Co-pilot Tool:
● End user: Financial Analysts, Business Analysts, Data Analysts.
● Intent: Deliver a data tape with the ability to constrain data to business needs and provide a basic summary.
● Outcome: End users can download the data tape and receive a plain-English summary of key stats (record count, distinct key, constraints in the query contained in the WHERE clause).
● Value: Provide natural language access to a single, widely used data tape with a clear, plain-English explanation of the dataset.

Streamlit Analytics Improves User Adoption and Success with Snowflake

With a better understanding of Streamlit as a driver for the adoption of Snowflake and of data assets generally, let's dig deeper into Streamlit as the conduit for adoption. While Snowflake may be a known entity within your enterprise, few business-facing professionals will ever know they are interfacing with Snowflake, and that is okay. Rather than requiring more technology tools and platforms, Streamlit opens the doors to Snowflake and, most importantly, eliminates other tools, platforms, and an additional layer of services to manage. Instead, you can leverage the skills already on hand within most data and analytics teams. Here are some additional features that make Streamlit quite compelling:

● Simplicity and Ease of Use: Streamlit provides an intuitive API that allows developers to create interactive UI elements with minimal code. Its straightforward syntax enables both beginners and experienced developers to quickly prototype and deploy applications without a steep learning curve.
● Rapid Prototyping: Streamlit excels at rapid prototyping, enabling developers to iterate quickly on their ideas. With its live reloading feature, developers can see changes in real time as they modify the code.
This development speed is crucial for experimenting with different UI layouts and functionalities.
● Data Exploration and Visualization: Streamlit integrates seamlessly with popular data science libraries such as Pandas, Matplotlib, and Plotly. This integration allows developers to create dynamic and interactive charts, graphs, and dashboards with minimal effort. Data scientists and analysts can effectively showcase their findings, making it an excellent choice for data exploration and visualization tasks.
● Customization and Theming: While Streamlit provides a simple interface, it also offers customization options for developers who want to create visually appealing applications. Developers can customize the appearance of their apps, including layout, colors, and themes, to match their brand or specific design preferences.
● Seamless Integration with Machine Learning and AI Models: Streamlit makes integrating machine learning models, natural language processing tools, and other AI technologies into applications easy. Developers can create interactive interfaces for AI-powered applications, enabling users to interact with complex algorithms and models without understanding the underlying complexities.
● Sharing and Deployment: Streamlit apps can be easily shared and deployed on various platforms. Whether it's sharing within a team, showcasing a prototype to stakeholders, or deploying a full-fledged application for public use, Streamlit simplifies the process. Streamlit Sharing, Streamlit's deployment platform, allows developers to deploy apps with minimal configuration, making them accessible to a broader audience.
● Active Community and Documentation: Streamlit has a vibrant and active community of developers. The availability of numerous examples, tutorials, and community-contributed components enhances the development experience. Streamlit's comprehensive documentation provides detailed guidance on various aspects of building interactive applications, making it easier for developers to find solutions to their queries.
● Flexibility and Extensibility: While Streamlit is easy for beginners, it also offers flexibility and extensibility for advanced users. Developers can create custom components and integrate JavaScript functionality when needed, allowing them to extend Streamlit's capabilities based on their requirements.

Conclusion

The integration of Snowflake and Streamlit offers a powerful combination for building analytics and data delivery apps. A single, blended data warehousing solution with intuitive application development can democratize data access, enabling users across an organization to transform complex datasets into palatable, prepared information assets. Though the Snowflake modern data cloud app store is in its infancy, you can jump in today and seize a great opportunity to build powerful data apps. While this article explained a simple GPT API interface, the recent introduction of the GPT Assistants API expands the possibilities for even more intelligent, contextual agents running securely right where you work. I look forward to expanding this basic prototype into a more intelligent co-pilot experience soon.

Author Bio

Ryan Goodman has dedicated 20 years to the business of data and analytics, working as a practitioner, executive, and entrepreneur. He recently founded DataTools Pro after 4 years at Reliant Funding, where he served as the VP of Analytics and BI.
There, he implemented a modern data stack, utilized data science, integrated cloud analytics, and established a governance structure. Drawing from his experiences as a customer, Ryan is now collaborating with his team to develop rapid-deployment industry solutions. These solutions utilize machine learning, LLMs, and modern data platforms to significantly reduce the time to value for data and analytics teams.

AI Distilled 34: Empowering Education Through AI

Merlyn Shelley
29 Jan 2024
13 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

👋 Hello,

"The real power that AI brings to education is connecting our learning intelligently to make us smarter in the way we understand ourselves, the world and how we teach and learn."
- Rose Luckin, UCL professor, Co-founder, Institute for Ethical AI in Education

AI makes learning more inclusive and personalized than ever before. Recent advancements, including the launch of Microsoft's AI-powered Reading Coach and OpenAI's first-of-its-kind partnership with Arizona State University, will ensure the future of learning is bright.

Welcome back to a new issue of AI Distilled - your one-stop destination for all things AI, ML, NLP, and Gen AI. Let's get started with the latest news and developments across sectors:

To begin with, 💎 explore Packt's New Year, New Data Upskilling program and meet the DataPro Mini Library: an essential, user-friendly platform you can't afford to overlook.

AI Launches & Industry Updates:
AI Will Not Displace Humans Anytime Soon, Says MIT Study
Voice Cloning Startup ElevenLabs Raises $80 Million, Achieves Unicorn Status
Samsung Introduces New AI Features in Galaxy Phones
AI Graphic Design Startup Recraft Raises $12 Million
OpenAI CEO Looking to Establish Own AI Chip Factories
Meta CEO Mark Zuckerberg Enters Race to Build AGI

AI in Education:
OpenAI Signs Deal with Arizona State University
Microsoft Makes AI-Powered Reading Coach Freely Available

AI in Healthcare:
AI to Save Asia-Pacific Healthcare $100 Billion Annually by 2025
WHO Releases Guidance on Ensuring Ethics of Powerful AI Models

AI in Finance:
Survey Finds Majority of Finance Leaders Believe AI Will Boost Productivity
Singapore Fintech Startup Secures Series A Funding to Automate Accounting

AI in Supply Chain Management:
AI and Supply Chain Changes Top Priorities for Apparel Brands in 2024

We've also curated the latest GPT and LLM resources, tutorials, and secret knowledge:
Discover New Methods for Aligning Chatbots
New Framework Helps AI Systems Evaluate Their Own Answers
Making Sense of Time: Understanding the Mathematical Underpinnings of Recurrent Neural Networks
Detecting Deception: New Methods to Uncover AI Untruths

Finally, don't forget to check out our hands-on tips and strategies from the AI community for you to use on your own projects:
How to Use RAGxplorer to Help Make Sense of AI Data
How to Create a Multi-Modal Nutrition Tool
How to Combine Language Models
Using Raspberry Pi with Offline Speech and Language Models

Looking for some inspiration? Here are some GitHub repositories to get your projects going!
huggingface/nanotron
tencentarc/visft
linkdd/aitoolkit
FlagOpen/TACO

📥 Feedback on the Weekly Edition
Take our weekly survey and get a free PDF copy of our best-selling book, "Interactive Data Visualization with Python - Second Edition."

📣 And here's the twist - we're tuning into YOUR frequency! Inspired by a reader's request, we're launching a column just for you. Got a burning question or a topic you're itching to dive into? Drop your suggestions in our content box - because your journey of discovery is our blueprint.

We appreciate your input and hope you enjoy the book! Share your thoughts and opinions here!

Writer's Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week's newsletter content!
Cheers,
Merlyn Shelley
Editor-in-Chief, Packt

SignUp | Advertise | Archives

✨ Packt's 2024 Specials ✨

Discover Packt's New Year, New Data Upskilling program, designed for data professionals. Gain a competitive edge in data science and analytics with expert-curated resources. Our goal? To help you seamlessly upgrade your skills in the most efficient way possible, enabling you to switch between topics without losing your stride. Introducing the DataPro Mini Library: a smooth, user-friendly platform that you simply can't afford to miss. Here's what our DataPro platform offers:

On-Demand Learning: Immerse yourself in Packt's comprehensive data-based knowledge base, featuring hundreds of books, video courses, research papers, and articles.
Expert Problem Solving: Get bespoke solutions to your most challenging problems, directly from our vast network of data experts and authors.
Advanced Self-Assessment: Utilize our tools for skill gap analysis and progress tracking, pinpointing areas for improvement and tracking your learning journey.
Personalized DataPro Dashboard: Keep tabs on your activities, revisit recent learning sections, and receive tailored recommendations to align with your learning objectives.
Skill Gap Analysis: Deep dive into your SQL, R, Python, and other skills with detailed quizzes and personalized feedback.

The icing on the cake? Join the thriving community of more than 150 data/AI professionals in our Discord channel. Get exclusive access to our DataPro beta program, and even have a chance to win Amazon gift cards! All this is available for just $7.99 per month.

Remember Benjamin Franklin's words, "An investment in knowledge pays the best interest." There's no better time to invest in your professional growth than now. Don't miss this opportunity to power up your data journey. Subscribe now and take the first step towards becoming a data mastermind!

Sign Up Here

⚡ TechWave: AI/GPT News & Analysis

AI Launches & Industry Updates

💎 AI Will Not Displace Humans Anytime Soon, Says MIT Study: An MIT study explored the potential impact of AI, particularly computer vision, on jobs involving visual analysis. The findings suggest that only 23% of wages in these jobs are cost-effective to automate with current AI. Job displacement is expected to be gradual, taking decades to significantly affect employment levels, contrary to some earlier predictions.

💎 Voice Cloning Startup ElevenLabs Raises $80 Million, Achieves Unicorn Status: ElevenLabs, a voice AI startup, secured $80 million in Series B funding, reaching a $1 billion valuation. Their tech creates realistic voices from text or samples, targeting audiobooks, dubbing, and gaming. While investors highlight its potential, ethical and legal concerns persist regarding voice cloning.

💎 Samsung Introduces New AI Features in Galaxy Phones: Samsung's Galaxy S24 smartphones now offer AI translation features covering up to 13 languages. Users can call, text, and translate live audio and text using Google's Gemini AI model, ensuring private and secure on-device translations, aiding international communication and travelers.

💎 AI Graphic Design Startup Recraft Raises $12 Million: London's Recraft, an AI graphic design startup, secures $12 million in Series A funding, led by Khosla Ventures and Nat Friedman. Their platform helps brands create visuals from text prompts.
With 300,000 users, Recraft aims to develop its own graphic design foundation model, potentially reducing the need for designers as the global design AI market is expected to reach $7.75 billion by 2032.

💎 OpenAI CEO Looking to Establish Own AI Chip Factories: OpenAI CEO Sam Altman is seeking billions in investment, including $8 billion from G42, to establish his own AI-specific ASIC factories due to concerns about semiconductor foundries' ability to meet future AI chip demand. This move aims to secure OpenAI's access to specialized AI processors and promote industry self-reliance in chip design and manufacturing.

💎 Meta CEO Mark Zuckerberg Enters Race to Build AGI: Meta CEO Mark Zuckerberg aims to develop artificial general intelligence (AGI), bolstered by 600,000 GPUs by 2024. He plans to integrate AGI into Meta apps and share models openly, though closure is an option if safety or strategic concerns arise in the pursuit of superhuman intelligence.

AI in Education

💎 OpenAI Signs Deal with Arizona State University: OpenAI signed a deal with Arizona State University to bring its ChatGPT AI chatbot to ASU researchers, staff, and faculty. This indicates shifting views on using AI in education as the technology advances. AI has potential benefits for helping students, but concerns about plagiarism linger largely unaddressed.

💎 Microsoft Makes AI-Powered Reading Coach Freely Available: Microsoft offers free access to its AI-based Reading Coach for users with Microsoft accounts. The tool offers personalized reading practice with features like text-to-speech, but experts emphasize the irreplaceable role of teachers in assessing comprehension.

AI in Healthcare

💎 AI to Save Asia-Pacific Healthcare $100 Billion Annually by 2025: IDC predicts generative AI will save 10% of clinician time in Asia-Pacific (excluding Japan) by 2025, leading to $100 billion in healthcare savings. By 2027, half of healthcare organizations will double AI investments for personalized care. Other forecasts include 30% adopting virtualized work models by 2025 and 60% emphasizing "techquity" partnerships to bridge digital divides. IDC anticipates the next five years shaping a patient-centric, AI-driven healthcare future in the region.

💎 WHO Releases Guidance on Ensuring Ethics of Powerful AI Models: The WHO released guidelines for Large Multi-Modal Models (LMMs) in healthcare, highlighting their potential and risks. Over 40 recommendations address responsible development, oversight, and equitable use, emphasizing diversity and safety to protect users and promote health equity.

AI in Finance

💎 Survey Finds Majority of Finance Leaders Believe AI Will Boost Productivity: A survey by OneStream found 80% of financial decision-makers believe AI will increase productivity in finance departments within five years. AI streamlines data management and improves forecasting, despite challenges like training and data privacy. Finance leaders see AI as a key part of their operations.

💎 Singapore Fintech Startup Secures Series A Funding to Automate Accounting: Singapore-based AI accounting startup Bluesheets secured $6.5 million in a Series A round led by Illuminate Financial Management, with support from Antler. Bluesheets, founded in 2020, uses ML to simplify financial workflows for businesses, serving 10,000+ customers globally. Despite generating $180,000 in revenue last year, the company incurred $2.39 million in losses while expanding its platform.
AI in Supply Chain Management

💎 AI and Supply Chain Changes Top Priorities for Apparel Brands in 2024: A survey of 250 apparel and fashion executives reveals that top tech priorities include using AI for marketing and financial forecasting. Many plan to increase onshoring and invest in automation, while also opening and closing stores to focus on smaller formats.

🔮 Expert Insights from Packt Community

💎 Unlocking the Secrets of Prompt Engineering - By Gilbert Mizrahi
"Unlocking the Secrets of Prompt Engineering" is your go-to guide to mastering AI-driven writing with large language models (LLMs). Learn prompt fundamentals, apply LLMs for content creation, chatbots, and coding. Explore practical use cases, from product descriptions to creative writing. Dive into advanced applications, ethics, and best practices. Unlock AI's full potential in writing and boost productivity. Get your copy now and transform your writing skills with AI.

💎 Building LLM Apps - By Valentina Alto
This is your comprehensive guide to Large Language Models (LLMs). It covers LLM fundamentals, architectural frameworks like GPT 3.5/4 and Falcon LLM, and introduces LangChain. Learn to create intelligent agents, retrieve unstructured data, and engage with structured data using LLMs. Explore the future of Large Foundation Models (LFMs) extending AI capabilities beyond language. Whether you're an AI expert or newcomer, this book is your roadmap to unleash the power of LLMs. Access the book now and shape the future of intelligent machines.

💎 Machine Learning for Time Series - Second Edition - By Ben Auffarth
This latest book offers a detailed guide to Python time-series packages, aiding in the creation of predictive systems. Covering traditional autoregressive models to modern non-parametric ones, this edition explains loading time-series data, deep learning, convolutional networks, and gradient boosting. New additions include financial market forecasting and case studies. Master time-series analysis with machine learning. Take the first step towards mastering time series analysis - get your copy now.

🌟 Secret Knowledge: AI/LLM Resources

💎 Discover New Methods for Aligning Chatbots: Hugging Face researchers tested three methods to enhance conversational AI assistants without reinforcement learning: Direct Preference Optimization, Identity Preference Optimization, and Kahneman-Tversky Optimization. Tuning hyperparameters, especially beta, proved crucial for better performance in multi-turn conversations.

💎 New Framework Helps AI Systems Evaluate Their Own Answers: Google researchers created ASPIRE to enhance LLMs' self-confidence assessment. It fine-tunes models and trains them to self-evaluate. Test results show ASPIRE improves error identification, and smaller models using it outperform larger ones. It's a step toward more trustworthy AI in decision-making.

💎 Making Sense of Time: Understanding the Mathematical Underpinnings of Recurrent Neural Networks: Discover the math behind Recurrent Neural Networks (RNNs), which excel in analyzing sequences like time series. The author explains RNN equations, shows how to build one from scratch in Python, and demonstrates their use in predicting stock prices, revealing their ability to capture time-based patterns.

💎 Detecting Deception: New Methods to Uncover AI Untruths: Researchers at Kolena used various methods to spot inaccuracies in LLM-generated responses. They achieved over 90% accuracy in detecting errors with context.
Additional techniques, like self-consistency testing and involving a second AI, improved accuracy further.

🔛 Masterclass: AI/LLM Tutorials

💎 How to Use RAGxplorer to Help Make Sense of AI Data: Discover RAGxplorer, a web app for understanding AI data. Upload documents to see how they're analyzed in chunks and their connections to questions. It unveils insights into retrieval-augmented generation (RAG) and is a promising tool for exploring AI training datasets.

💎 How to Create a Multi-Modal Nutrition Tool: Learn how to develop a smart food journal to help track nutrition and diet goals. The journal allows users to take pictures of meals, which are then analyzed using GPT-4 Vision to provide nutritional information. Autogen helps rapidly build the application by leveraging LLMs. A user-friendly interface was created with Gradio.

💎 How to Combine Language Models: Combine ML models to create a versatile AI. The article explains techniques like weighted averaging and dealing with parameter conflicts, and walks through merging Mistral, WizardMath, and CodeLlama using the mergekit toolkit (a toy sketch of weighted averaging appears after this list).

💎 Raspberry Pi with Offline Speech and Language Models: Discover how to enable AI on a Raspberry Pi without internet. Learn to make the tiny device understand and respond to speech using locally stored LLMs. The article guides setting up Whisper and fine-tuning GPT-2 on the Pi, showing an affordable offline AI solution.

🚀 HackHub: Trending AI Tools

💎 huggingface/nanotron: Tools for efficiently distributing LLM training across multiple processors via 3D parallelism techniques.
💎 tencentarc/visft: Two-stage training technique called ViSFT to improve large foundation models on visual tasks.
💎 linkdd/aitoolkit: The AI Toolkit library provides C++ tools like finite state machines, behavior trees, utility AI, and goal-oriented action planning to help developers create intelligent non-player characters for their games.
💎 FlagOpen/TACO: Topics in Algorithmic COde generation is a dataset containing over 25,000 programming problems to evaluate state-of-the-art models.
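For intuition on the weighted-averaging idea mentioned above, here is a minimal sketch in plain PyTorch (our own illustration; mergekit automates this and much more for real LLMs, and it assumes both models share an identical architecture):

import torch

def merge_state_dicts(sd_a, sd_b, alpha=0.5):
    # Weighted average of two compatible model state dicts.
    return {name: alpha * sd_a[name] + (1 - alpha) * sd_b[name] for name in sd_a}

# Hypothetical usage with two same-architecture models:
# merged = merge_state_dicts(model_a.state_dict(), model_b.state_dict(), alpha=0.5)
# model_a.load_state_dict(merged)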

AI Distilled 33: Tech Revolution 2024: AI's Impact Across Industries

Merlyn Shelley
22 Jan 2024
13 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

👋 Hello,

"This year, every industry will become a technology industry. You can now recognize and learn the language of almost anything with structure, and you can translate it to anything with structure — so text-protein, protein-text. This is the generative AI revolution."
-Jensen Huang, NVIDIA founder and CEO

AI is revolutionizing drug development and reshaping medical tech with cutting-edge algorithms. Dive into the latest AI_Distilled edition for sharp insights on AI's impact across industries, including breakthroughs in machine learning, NLP, and more.

AI Launches & Industry Updates:
OpenAI Revises Policy, Opening Doors to Military Applications
Google Cloud Introduces Advanced Generative AI Tools for Retail Enhancement
Google Confirms Significant Layoffs Across Core Teams
OpenAI Launches ChatGPT Team for Collaborative Workspaces
Microsoft Launches Copilot Pro Plan and Expands Business Availability
Vodafone and Microsoft Forge 10-Year Partnership for Digital Transformation

AI in Healthcare:
MIT Researchers Harness AI to Uncover New Antibiotic Candidates
Google Research Unveils AMIE: AI System for Diagnostic Medical Conversations
NVIDIA CEO Foresees Tech Transformation Across All Industries in 2024

AI in Finance:
AI Reshapes Financial Industry: 2024 Trends Unveiled in Survey
JPMorgan Seeks AI Strategist to Monitor London Startups
AI in Fintech Market to Surpass $222.49 Billion by 2030

AI in Business:
AI to Impact 40% Jobs Globally, Balanced Policies Needed, Says IMF
Deloitte's Quarterly Survey Reveals Business Leaders' Concerns About Gen AI's Societal Impact and Talent Shortage

AI in Science & Technology:
NASA Boosts Scientific Discovery with Generative AI-Powered Search
Swarovski Unveils World's First AI Binoculars

AI in Supply Chain Management:
AI Proves Crucial in Securing Healthcare Supply Chains: Economist Impact Study
Unlocking Supply Chain Potential: Generative AI Transforms Operations

We've also got your fresh dose of LLM, GPT, and Gen AI secret knowledge and tutorials:
How to Craft Effective AI Prompts
Understanding and Managing KV Caching for LLM Inference
Understanding and Enhancing Chain-of-Thought (CoT) Reasoning with Graphs
Unlocking the Power of Hybrid Deep Neural Networks

We know how much you love hands-on tips and strategies from the community, so here they are:
Building a Local Chatbot with Next.js, Llama.cpp, and ModelFusion
How to Build an Anomaly Detector with OpenAI
Building Multilingual Financial Search Applications with Cohere Embedding Models in Amazon Bedrock
Maximizing GPU Utilization with AWS ParallelCluster and EC2 Capacity Blocks

Don't forget to review these GitHub repositories that have been doing the rounds:
vanna-ai/vanna
dvmazur/mixtral-offloading
pootiet/explain-then-translate
genezc/minima

📥 Feedback on the Weekly Edition
Take our weekly survey and get a free PDF copy of our best-selling book, "Interactive Data Visualization with Python - Second Edition." We appreciate your input and hope you enjoy the book! Share your thoughts and opinions here!

Writer's Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week's newsletter content!
Cheers,
Merlyn Shelley
Editor-in-Chief, Packt

SignUp | Advertise | Archives

⚡ TechWave: AI/GPT News & Analysis

AI Launches & Industry Updates:

💎 OpenAI Revises Policy, Opening Doors to Military Applications: OpenAI updated its policy, lifting the ban on using its tech for military purposes, aiming for clarity and national security discussions. However, it maintains a strict prohibition against developing and using weapons.

💎 Google Cloud Introduces Advanced Generative AI Tools for Retail Enhancement: Google Cloud has released new AI tools to improve online shopping and help retail businesses. These include a smart chatbot for websites and apps to help customers, a feature to make product searches better, and tools to improve customer service and speed up listing products.

💎 Google Confirms Significant Layoffs Across Core Teams: Google announced major job cuts affecting its Hardware, core engineering, and Google Assistant teams, totaling around a thousand layoffs in a day. The exact number might be higher, but no total count was provided.

💎 OpenAI Launches ChatGPT Team for Collaborative Workspaces: ChatGPT Team is a plan for teams offering a secure space with advanced models like GPT-4 and DALL·E 3. It includes tools for data analysis and lets users create custom GPTs, ensuring business data remains private.

💎 Microsoft Launches Copilot Pro Plan and Expands Business Availability: Copilot Pro, at $20/month per user, offers enhanced text, command, and image features in Microsoft 365 apps, plus early access to new GenAI models. It's also available for businesses on various Microsoft 365 and Office 365 plans.

💎 Vodafone and Microsoft Forge 10-Year Partnership for Digital Transformation: Vodafone and Microsoft have formed a 10-year partnership to serve over 300 million people in Europe and Africa, using Microsoft's AI to improve customer experiences, IoT, digital services for small businesses, and global data center strategies.

AI in Healthcare:

💎 MIT Researchers Harness AI to Uncover New Antibiotic Candidates: MIT researchers have employed deep learning to identify a new class of antibiotic compounds capable of combating the drug-resistant bacterium methicillin-resistant Staphylococcus aureus (MRSA). Published in Nature, the study underscores researchers' ability to unveil the deep-learning model's criteria for antibiotic predictions, paving the way for enhanced drug design.

💎 Google Research Unveils AMIE: AI System for Diagnostic Medical Conversations: Google Research introduces the Articulate Medical Intelligence Explorer (AMIE), an AI system tailored for diagnostic reasoning and conversations in the medical field. AMIE, based on LLMs, focuses on replicating the nuanced and skilled dialogues between clinicians and patients, addressing diagnostic challenges. The system employs a unique self-play simulated learning environment, refining its diagnostic capabilities across various medical conditions.

💎 NVIDIA CEO Foresees Tech Transformation Across All Industries in 2024: Jensen Huang predicts a tech revolution in all industries in 2024, focusing on generative AI's impact. At a healthcare conference, he highlighted AI's role in language and translation, and NVIDIA's shift from aiding drug discovery to designing drugs with computers.

AI in Finance:

💎 AI Reshapes Financial Industry: 2024 Trends Unveiled in Survey: NVIDIA's survey reveals 91% of financial companies are adopting or planning to use AI. 55% are interested in generative AI and LLMs, mainly to enhance operations, risk, and marketing.
97% intend to increase AI investments for new uses and workflow optimization.

💎 JPMorgan Seeks AI Strategist to Monitor London Startups: JPMorgan is hiring an 'AI Strategy Consultant' in London to identify and assess startups using generative AI and LLMs, reporting to the Chief Data and Analytics Officer. This aligns with financial trends like HSBC's launch of Zing, a money transfer app.

💎 AI in Fintech Market to Surpass $222.49 Billion by 2030: The AI in fintech market, valued at $13.23 billion in 2022, is growing fast. It's improving financial services with data analytics and machine learning, enhancing decision-making and security. It's projected to reach $222.49 billion by 2030, growing at 42.3% annually.

AI in Business:

💎 AI to Impact 40% Jobs Globally, Balanced Policies Needed, Says IMF: The IMF warns that AI affects 40% of global jobs, posing more risks and opportunities in advanced economies than emerging ones. It may increase income inequality, calling for social safety nets, retraining, and AI-focused policies to ensure inclusivity.

💎 Deloitte's Quarterly Survey Reveals Business Leaders' Concerns About Gen AI's Societal Impact and Talent Shortage: Deloitte's new quarterly survey, based on input from 2,800 professionals globally, shows 79% are optimistic about gen AI's impact on their businesses within three years. However, over 50% fear it may centralize global economic power and worsen economic inequality.

AI in Science & Technology:

💎 NASA Boosts Scientific Discovery with Generative AI-Powered Search: NASA introduces the Science Discovery Engine, powered by generative AI, simplifying access to its extensive data. Developed by the Open Source Science Initiative (OSSI) and Sinequa, it comprehends 9,000 scientific terms, offers contextual search, and enables natural language queries for 88,000 datasets and 715,000 documents from 128 sources.

💎 Swarovski Unveils World's First AI Binoculars: Swarovski Optik and designer Marc Newson launch AX VISIO, the first AI binoculars. They merge analog optics with AI, instantly identifying 9,000+ species, boasting a camera-like design, and enabling quick photo and video capture through a neural processing unit.

AI in Supply Chain Management:

💎 AI Proves Crucial in Securing Healthcare Supply Chains: Economist Impact Study: A study by Economist Impact, with DP World's support, finds 46% of healthcare firms use AI to predict supply chain issues. Amid geopolitical uncertainties, 39% use "friendshoring" for trade and 23% optimize suppliers, showcasing industry adaptability.

💎 Unlocking Supply Chain Potential: Generative AI Transforms Operations: About 40% of supply chains invest in Gen AI for knowledge management. It is widely adopted (62%) for sustainability tracking and helps with forecasting, production, risk management, manufacturing design, predictive maintenance, and logistics efficiency.

🔮 Expert Insights from Packt Community

Generative AI with LangChain - By Ben Auffarth

How do GPT models work? Generative pre-training has been around for a while, employing methods such as Markov models or other techniques. However, language models such as BERT and GPT were made possible by the transformer deep neural network architecture (Vaswani and others, Attention Is All You Need, 2017), which has been a game-changer for NLP. Designed to avoid recursion to allow parallel computation, the Transformer architecture, in different variations, continues to push the boundaries of what's possible within the field of NLP and generative AI.
Transformers have pushed the envelope in NLP, especially in translation and language understanding. Neural Machine Translation (NMT) is a mainstream approach to machine translation that uses DL to capture long-range dependencies in a sentence. Models based on transformers outperformed previous approaches, such as recurrent neural networks, particularly Long Short-Term Memory (LSTM) networks.

The transformer model architecture has an encoder-decoder structure, where the encoder maps an input sequence to a sequence of hidden states, and the decoder maps the hidden states to an output sequence. The hidden state representations consider not only the inherent meaning of the words (their semantic value) but also their context in the sequence. The encoder is made up of identical layers, each with two sub-layers. The input embedding is passed through an attention mechanism, and the second sub-layer is a fully connected feed-forward network. Each sub-layer is followed by a residual connection and layer normalization. The output of each sub-layer is the sum of the input and the output of the sub-layer, which is then normalized.

The architectural features that have contributed to the success of transformers are:

Positional encoding: Since the transformer doesn't process words sequentially but instead processes all words simultaneously, it lacks any notion of the order of words. To remedy this, information about the position of words in the sequence is injected into the model using positional encodings. These encodings are added to the input embeddings representing each word, thus allowing the model to consider the order of words in a sequence.

Layer normalization: To stabilize the network's learning, the transformer uses a technique called layer normalization. This technique normalizes the model's inputs across the features dimension (instead of the batch dimension as in batch normalization), thus improving the overall speed and stability of learning.

Multi-head attention: Instead of applying attention once, the transformer applies it multiple times in parallel, improving the model's ability to focus on different types of information and thus capturing a richer combination of features.

This is an excerpt from the book Generative AI with LangChain - By Ben Auffarth, published in Dec '23. To see what's inside the book, read the entire chapter here or try a 7-day free trial to access the full Packt digital library. To discover more, click the button below.

Read Chapter 1, unlocked here...

🌟 Secret Knowledge: AI/LLM Resources

💎 How to Craft Effective AI Prompts: Embark on a journey to understand the intricacies of AI prompts and how they can revolutionize creative content generation. Delve into the workings of AI prompts, powered by NLP algorithms, and uncover the steps involved in their implementation.

💎 Understanding and Managing KV Caching for LLM Inference: Explore the intricacies of KV caching in the inference process of LLMs in this post. The KV cache, storing key and value tensors during token generation, poses challenges due to its linear growth with batch size and sequence length. The post delves into the memory constraints, presenting calculations for popular MHA models. (For a rough sense of the numbers, see the worked estimate at the end of this issue's listings.)

💎 Understanding and Enhancing Chain-of-Thought (CoT) Reasoning with Graphs: Explore using graphs to advance Chain-of-Thought (CoT) prompting and boost reasoning in models like GPT-4. CoT enables multi-step problem-solving, spanning math to puzzles, and is vital for enhancing language models.
🌟 Secret Knowledge: AI/LLM Resources

💎 How to Craft Effective AI Prompts: Embark on a journey to understand the intricacies of AI prompts and how they can revolutionize creative content generation. Delve into the workings of AI prompts, powered by NLP algorithms, and uncover the steps involved in their implementation.

💎 Understanding and Managing KV Caching for LLM Inference: Explore the intricacies of KV caching in the inference process of LLMs in this post. The KV cache, storing key and value tensors during token generation, poses challenges due to its linear growth with batch size and sequence length. The post delves into the memory constraints, presenting calculations for popular MHA models.

💎 Understanding and Enhancing Chain-of-Thought (CoT) Reasoning with Graphs: Explore using graphs to advance Chain-of-Thought (CoT) prompting, boosting reasoning in GPT-4. CoT enables multi-step problem-solving, spanning math to puzzles, vital for enhancing language models.

💎 Unlocking the Power of Hybrid Deep Neural Networks: This article explains Hybrid Deep Neural Networks (HDNNs), advanced ML models changing AI. It covers HDNN architecture, uses, benefits, and future trends, including how they combine various neural networks like CNNs, RNNs, and GANs.

🔛 Masterclass: AI/LLM Tutorials

💎 Building a Local Chatbot with Next.js, Llama.cpp, and ModelFusion: Discover how to build a chatbot with Next.js, Llama.cpp, and ModelFusion. This tutorial covers setup, using Llama.cpp for LLM inference in C++, and creating a chatbot base with Next.js, TypeScript, ESLint, and Tailwind CSS.

💎 How to Build an Anomaly Detector with OpenAI: Learn to build an anomaly detector for different data types, including text and numbers, that fits into your data pipeline. The guide starts with the importance of anomaly detection and OpenAI's LLM role, using OpenAI and BigQuery.

💎 Building Multilingual Financial Search Applications with Cohere Embedding Models in Amazon Bedrock: Learn to use Cohere's multilingual model on Amazon Bedrock for advanced financial search tools. Unlike traditional keyword-based methods, Cohere uses machine learning for semantic searches in over 100 languages, improving document analysis and information retrieval.

💎 Maximizing GPU Utilization with AWS ParallelCluster and EC2 Capacity Blocks: Discover how to tackle GPU shortages in machine learning with AWS ParallelCluster and EC2 Capacity Blocks. This guide outlines a three-step method: reserve a Capacity Block, configure your cluster, and run jobs effectively, including GPU failure management and multi-queue optimization.

🚀 HackHub: Trending AI Tools

💎 vanna-ai/vanna: Toolkit for accurate Text-to-SQL generation via LLMs, using RAG to interact with SQL databases through chat.

💎 dvmazur/mixtral-offloading: Achieve efficient inference for Mixtral-8x7B models, utilizing mixed quantization with HQQ for attention layers and experts, along with a MoE offloading strategy.

💎 pootiet/explain-then-translate: 2-stage Chain-of-Thought (CoT) prompting technique for program translation to improve translation across various Python-to-X and X-to-X directions.

💎 genezc/minima: Addresses the challenge of distilling knowledge from large teacher LMs to smaller student ones, optimizing the capacity gap for effective LM distillation and achieving competitive performance with resource-efficient models.
AI_Distilled #32: Navigating Industry Updates and Innovations

Merlyn Shelley
12 Jan 2024
13 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

👋 Hello ,

"There is not going to be one model to rule them all. You need to be trying out different models, you need a real choice of model providers." - Adam Selipsky, CEO, AWS.

There's no one-size-fits-all approach in AI development. When you embrace diversity in AI, that's when it truly shines. There's also a different side to the coin — the infinitely scalable adaptability of AI to revolutionize field after field, such as when it can help discover promising new sustainable battery materials to potentially reduce reliance on lithium.

Welcome back to a new issue of AI Distilled - your one-stop destination for all things AI, ML, NLP, and Gen AI. Let's get started with the latest news and developments across different industries and sectors:

AI Launches & Industry Updates:
Explore the GPT Marketplace
NVIDIA Unveils Innovations in Gaming, AI, and Robotics at CES 2024
Perplexity AI Secures $73.6M Funding Led by NVIDIA and Jeff Bezos
OpenAI Set to Launch GPT Store for AI Models and Apps
Google Faces Multibillion-Dollar Patent Trial Over AI Technology in U.S.
Google's DeepMind Unveils Advances in Robotic Training with Video and Language Models

AI in Healthcare:
Isomorphic Labs Secures $3 Billion AI-Driven Drug Discovery Deals with Eli Lilly and Novartis
Nabla Secures $24 Million in Series B Funding for AI-Powered Medical Assistant

AI in Business:
Deloitte Introduces PairD AI Chatbot for 75,000 Staff in Big Four's Latest Automation Move
Walmart Revolutionizes Shopping with Generative AI Innovations

AI in Science & Technology:
Microsoft and PNNL Harness AI to Discover Promising Battery Material
German Automakers Pioneer AI Integration in Cars, Elevating Driving Experience

AI in Finance:
Rising Concerns as Generative AI Use Grows in Finance, Amplifying Misinformation Risks

AI in Supply Chain Management:
Warehousing Industry Leverages Machine Learning to Tackle Disruptions

We've also curated the latest GPT and LLM resources, tutorials, and secret knowledge:
Explore the Future of AI: A Guide to the Top 9 AI APIs of 2024
Optimizing LLM Inference with Splitwise: Achieving Efficiency in GPU Usage
A Comprehensive Guide to Merging LLMs
AI Drift in Retrieval Augmented Generation

Finally, don't forget to check out our hands-on tips and strategies from the AI community for you to use on your own projects:
Creating Your Own AI Image Generator App with Generative AI
Optimizing Code Output with CodeWhisperer
Mastering Knowledge Graph Construction with KeyBERT, HDBSCAN, and Zephyr-7B-Beta
How to Craft an Open Source Multi-Modal RAG System

Looking for some inspiration? Here are some GitHub repositories to get your projects going!
gxnu-zhonglab/odtrack
DLYuanGod/TinyGPT-V
intel/intel-extension-for-transformers
CambioML/pykoi

📥 Feedback on the Weekly Edition
Take our weekly survey and get a free PDF copy of our best-selling book, "Interactive Data Visualization with Python - Second Edition." We appreciate your input and hope you enjoy the book!
Share your thoughts and opinions here!

Writer's Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week's newsletter content!

Cheers,
Merlyn Shelley
Editor-in-Chief, Packt

SignUp | Advertise | Archives

⚡ TechWave: AI/GPT News & Analysis

AI Launches & Industry Updates:

⭐ Explore the GPT Marketplace: Just two months in, 3 million custom ChatGPTs are already out there!
The GPT Store is now open to ChatGPT Plus, Team, and Enterprise users, offering a variety of handy GPTs. Get in on the action at chat.openai.com/gpts!

⭐ NVIDIA Unveils Innovations in Gaming, AI, and Robotics at CES 2024: NVIDIA unveiled impressive CES 2024 innovations: GeForce RTX 40 SUPER GPUs, AI laptops, and generative AI tools. They highlighted RTX GPUs' influence on generative AI, introduced TensorRT acceleration for Stable Diffusion XL and SDXL Turbo, and NVIDIA Avatar Cloud Engine (ACE) Microservices for digital avatars. Getty Images and NVIDIA introduced Generative AI by iStock, a text-to-image platform for customized stock photos.

⭐ Perplexity AI Secures $73.6M Funding Led by NVIDIA and Jeff Bezos: San Francisco's Perplexity AI secures $73.6 million in funding led by IVP, with NVIDIA and Jeff Bezos participating, valuing the company at $520 million. Despite serving 500 million queries in 2023, profitability remains elusive as it competes with Google in the search market. The funds will be used for hiring and product development.

⭐ OpenAI Set to Launch GPT Store for AI Models and Apps: OpenAI is set to launch the GPT Store, where developers can present custom GPT model applications, following updated policies. The launch, previously delayed, offers diverse, code-free applications. Revenue-sharing details await clarification.

⭐ Google Faces Multibillion-Dollar Patent Trial Over AI Technology in U.S.: Google is facing a federal jury trial in Boston as Singular Computing alleges patent infringement in its AI processors. Singular seeks up to $7 billion in damages, while Google argues independent development. The trial may last two to three weeks.

⭐ Google's DeepMind Unveils Advances in Robotic Training with Video and Language Models: DeepMind Robotics unveils AutoRT, a system enhancing robot understanding of human intentions using Visual Language Models. It orchestrates 20 robots, suggesting tasks via LLMs, and introduces RT-Trajectory with 63% success in 41 tasks using video input.

AI in Healthcare:

⭐ Isomorphic Labs Secures $3 Billion AI-Driven Drug Discovery Deals with Eli Lilly and Novartis: London-based Isomorphic, a DeepMind spin-out, forms strategic alliances with Eli Lilly and Novartis, valued at $3 billion. Utilizing AlphaFold 2 AI technology, Isomorphic focuses on accurate protein predictions for innovative drug discovery.

⭐ Nabla Secures $24 Million in Series B Funding for AI-Powered Medical Assistant: Paris startup Nabla secures $24 million in a Series B funding round led by Cathay Innovation and ZEBOX Ventures. Nabla develops an AI copilot for doctors, streamlining administrative tasks while collaborating with physicians.

AI in Business:

⭐ Deloitte Introduces PairD AI Chatbot for 75,000 Staff in Big Four's Latest Automation Move: Deloitte is using a chatbot called PairD to help 75,000 employees in Europe and the Middle East with everyday tasks. While it's convenient, there are concerns about its accuracy, so employees still check its work. Deloitte is also sharing PairD with 800 workers at the charity Scope as part of its AI strategy.

⭐ Walmart Revolutionizes Shopping with Generative AI Innovations: Walmart introduces generative AI-powered features on iOS, Android, and its website to improve the digital shopping experience. These features provide personalized responses and recommendations, shifting from scrolling to goal-oriented searching for a smoother shopping journey.
AI in Science & Technology:

⭐ Microsoft and PNNL Harness AI to Discover Promising Battery Material: Microsoft and PNNL used AI and cloud computing to speed up battery innovation, identifying a safer, more efficient solid-state electrolyte with less lithium. The Azure Quantum Elements platform screened 32 million candidates in 80 hours, highlighting a material with potential for a 70% reduction in lithium use, advancing sustainable energy solutions.

⭐ German Automakers Pioneer AI Integration in Cars, Elevating Driving Experience: Leading German automakers like Volkswagen and Mercedes-Benz are revolutionizing the automotive industry with advanced AI integration. Volkswagen unveiled ChatGPT technology, enhancing the driving experience with AI-powered chatbots and IDA voice assistants, while Mercedes-Benz introduced a sophisticated virtual assistant for context-based suggestions, marking a significant leap in interactive AI utilization at CES 2024.

AI in Finance:

⭐ Rising Concerns as Generative AI Use Grows in Finance, Amplifying Misinformation Risks: The finance sector's growing use of generative AI is transforming services but raises concerns about misinformation. A study by PYMNTS Intelligence and AI-ID shows 80% of consumers worry about generative AI's misinformation risk. Regulatory guidelines, model explainability tools, and industry cooperation are essential for responsible AI adoption in finance.

AI in Supply Chain Management:

⭐ Warehousing Industry Leverages Machine Learning to Tackle Disruptions: Zebra Technologies Corporation's research highlights the warehousing industry's adoption of AI, particularly machine learning (ML), amid challenges like inflation and labor shortages. The report predicts ML, predictive analytics, and mobile dimensioning will dominate by 2028, aiding historical analysis, demand prediction, and automation. Decision-makers aim to boost resilience, with 94% planning ML integration within five years.

🔮 Expert Insights from Packt Community

The Handbook of NLP with Gensim - By Chris Kuo

Gensim and its NLP modeling techniques

Gensim is actively maintained and supported by a community of developers and is widely used in academic research and industry applications. It covers many important NLP techniques that make up the workforce of today's NLP.

Last year, I was at a company's year-end party. The ballroom was filled with people standing in groups with their drinks. I walked around and listened for conversation topics where I could chime in. I heard one group talking about the FIFA World Cup 2022 and another group talking about stock markets. I joined the stock markets conversation. In that short moment, my mind had performed "word extractions," "text summarization," and "topic classifications." These tasks are the core tasks of NLP and what Gensim is designed to do.

We perform serious text analyses in professional fields including legal, medical, and business. We organize similar documents into topics. Such work also demands "word extractions," "text summarization," and "topic classifications."

In the following sections, I will give you a brief introduction to the key models that Gensim offers so you will have a good overview. These models include the following:

BoW and TF-IDF
Latent semantic analysis/indexing (LSA/LSI)
Word2Vec
Doc2Vec
Text summarization
LDA
Ensemble LDA

BoW and TF-IDF

Texts can be represented as a bag of words, which is the count frequency of a word. BoW uses the word count to reflect the significance of a word. However, this is not very intuitive.
Frequent words may not carry special meanings depending on the type of document.

LSA/LSI

Latent semantic analysis (LSA) was developed in the 1990s. It's an NLP solution that far surpasses naïve keyword matching and has become an important search engine algorithm. Prior to that, in 1988, an LSA-based information retrieval system was patented (US Patent #4839853, now expired) and named "latent semantic indexing," so the technique is also called latent semantic indexing (LSI). Gensim and many other reports name LSA as LSI so as not to confuse LSA with LDA.

This is an excerpt from the book The Handbook of NLP with Gensim - By Chris Kuo and published in Oct '23. To see what's inside the book, read the entire chapter here or try a 7-day free trial to access the full Packt digital library. To discover more, click the button below.

Read through the Chapter 1 unlocked here...
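To ground the BoW and TF-IDF ideas in code, here is a minimal sketch using Gensim's Dictionary and TfidfModel APIs. The toy documents are invented for illustration and are not taken from the book:

from gensim import corpora, models

# Three tiny pre-tokenized "documents"
docs = [
    ["stock", "markets", "rally"],
    ["world", "cup", "final"],
    ["stock", "prices", "fall"],
]

dictionary = corpora.Dictionary(docs)                   # token <-> id mapping
bow_corpus = [dictionary.doc2bow(doc) for doc in docs]  # BoW: (token_id, count) pairs

tfidf = models.TfidfModel(bow_corpus)                   # reweight raw counts by rarity
for doc in tfidf[bow_corpus]:
    # Tokens that appear in many documents, such as "stock", receive
    # lower TF-IDF weights than tokens unique to one document.
    print([(dictionary[token_id], round(weight, 3)) for token_id, weight in doc])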
🌟 Secret Knowledge: AI/LLM Resources

⭐ Explore the Future of AI: A Guide to the Top 9 AI APIs of 2024: In this guide, you'll learn how to navigate the dynamic realm of AI APIs, uncovering the capabilities of the top 9 for 2024. Discover Google Cloud Vision AI, an unparalleled eye for accurate image analysis; IBM Watson Assistant, a conversational genius transforming virtual assistance; Amazon Lex, empowering apps with voice commands effortlessly; Azure Cognitive Services, the Swiss Army knife of AI, offering diverse tools; DeepAI, simplifying deep learning for innovation; and MonkeyLearn, a text analysis guru, among others. Read the post to explore how these APIs can shape your tech ventures and redefine the future of AI.

⭐ Optimizing LLM Inference with Splitwise: Achieving Efficiency in GPU Usage: Discover how Splitwise, a technique from Azure Research - Systems, boosts LLM inference efficiency. It separates prompt computation and token-generation phases, optimizing hardware use. This method enhances GPU cluster design, achieving higher throughput, lower costs, and reduced power for efficient LLM deployment.

⭐ A Comprehensive Guide to Merging LLMs: This comprehensive guide explores merging LLMs using the mergekit library without requiring a GPU. It covers four merging techniques: SLERP, TIES, DARE, and passthrough, with configuration examples. The result is Marcoro14–7B-slerp, a high-performing model featured on the Open LLM Leaderboard.

⭐ AI Drift in Retrieval Augmented Generation (RAG): This guide delves into AI drift within RAG pipelines, drawing from a real case where a customer faced declining AI responses. It covers the causes (content drift, LLM drift, pipeline algorithm changes) and strategies (content management, API upgrades, internal metrics) to control AI drift.

🔛 Masterclass: AI/LLM Tutorials

⭐ Creating Your Own AI Image Generator App with Generative AI: Discover how to build a powerful Generative AI Text-to-Image application in this detailed guide. The author shares their journey of seamlessly integrating AI-generated images into a React app, using third-party APIs like SegMind. With a step-by-step walkthrough, you'll explore the code behind the app on GitHub and learn how to choose the right API, integrate it into React, and unleash AI capabilities in web development. Read on to bring dynamic, AI-generated content to your React projects and stay at the forefront of web development innovation.

⭐ Optimizing Code Output with CodeWhisperer: Unlock the full potential of Amazon CodeWhisperer with this in-depth guide on prompt engineering. Learn how CodeWhisperer accelerates software development by offering code recommendations based on natural language comments. The post provides step-by-step insights on effective prompt engineering in Python, emphasizing best practices such as crafting specific and concise prompts, incorporating additional context, utilizing multiple comments strategically, and understanding CodeWhisperer's capacity for cross-file context.

⭐ Mastering Knowledge Graph Construction with KeyBERT, HDBSCAN, and Zephyr-7B-Beta: Discover how to leverage LLMs with traditional NLP and ML methods to create knowledge graphs from unstructured text. The author showcases the synergy of KeyBERT, HDBSCAN, and Zephyr-7B-Beta for improved keyword extraction, clustering, and refinement. The guide covers dataset prep, keyword extraction, and LLM integration.

⭐ How to Craft an Open Source Multi-Modal RAG System: Discover how to build a Retrieval-Augmented Generation (RAG) system with an open source Large Language Multi-Modal model (LLMM). Learn the integration of ChromaDB and Hugging Face, covering CLIP, data storage, and MLLMs for user chat sessions in a detailed, dependency-free guide.

🚀 HackHub: Trending AI Tools

⭐ gxnu-zhonglab/odtrack: Efficient video-level tracking pipeline utilizing online token propagation to densely capture contextual relationships and spatio-temporal trajectories across frames.

⭐ DLYuanGod/TinyGPT-V: Features an efficient Multimodal Large Language Model using small backbones for efficiently incorporating multimodal capabilities into language models.

⭐ intel/intel-extension-for-transformers: Toolkit to accelerate GenAI/LLM performance on Intel platforms, including Gaudi2, CPU, and GPU, seamlessly compressing Transformer-based models, accessing optimized model packages, and using NeuralChat.

⭐ CambioML/pykoi: An open-source Python library for LLMs, enhancing them with RLHF, collecting user feedback, fine-tuning with reinforcement learning, comparing models, and creating RAG chatbots efficiently.
AI_Distilled #31: Evolving Boundaries and Opportunities

Merlyn Shelley
08 Jan 2024
14 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

👋 Hello ,

🎉 Joyous 2024! Wishing you a year as delightful as your dreams!

Dive into the new year with our outstanding edition, filled with essential features to boost your AI practice.

"The speed at which people will be able to come up with an idea, to test the idea, to make something, it's going to be so accelerated… You don't need to have a degree in computer science to do that." - Matthew Candy, IBM's global managing partner for generative AI

Coding without coding is a revolutionary idea indeed. What might have been previously perceived as unbelievable is a living reality, and new features like GitHub's Copilot Chat make it all the more seamless.

The real possibilities of AI expand far beyond computing, with the technology making waves in healthcare, finance, and supply chain management. Starting from this edition, we'll bring you fresh updates from each of these sectors, so stay tuned! Let's kick things off by tapping into the latest news and developments.

AI Launches & Industry Updates:
Microsoft Copilot Integrates with GenAI Music App Suno for Song Composition
Google Plans Potential Layoffs Amidst AI Integration in Ad Sales
GitHub Expands Copilot Chat Availability for Developers

AI in Healthcare:
AI Streamlining Health Insurance Shopping Process
Revolutionizing Healthcare with AI Stethoscope on Smartphones
Generative AI's Impact on Mental Health Counseling

AI in Finance:
Next-Gen Banks to Leverage AI for Financial Influence and Support
Invest Qatar Introduces Cutting-Edge Azure OpenAI GPT-Powered Chatbot

AI in Supply Chain Management:
AI Safeguards Supply Chains Amidst Holiday Challenges
AI-SaaS Integration Revolutionizes E-commerce Analytics

Here are some handpicked GPT and LLM resources, tutorials, and secret knowledge that'll come in handy for your next project:
Understanding the Prompt Development Life Cycle
Building Platforms with LLMs: Overcoming Challenges in Summarization as a Service
Understanding the Risks of Prompt Injection in LLM Applications
Creating an Open Source LLM Recommender System: Mastering Prompt Iteration and Optimization

Looking for hands-on tips and strategies straight from the developer community? We've got you covered:
Exploring Google's Gemini Pro Vision LLM with Javascript: A Practical Guide
Accelerating AI Application Productionization: A Guide with SageMaker JumpStart, Amazon Bedrock, and TruEra
Quantizing LLMs with Activation-aware Weight Quantization (AWQ)
Unlocking Your MacBook's AI Potential: Running 70B LLM Models Without Quantization

Check out our curated list of smoking hot GitHub repositories:
Giskard-AI/giskard
CopilotKit/CopilotKit
chengzeyi/stable-fast
ml-explore/mlx

📥 Feedback on the Weekly Edition
Q: How can we foster effective collaboration between humans and AI systems, ensuring that AI complements human skills and enhances productivity without causing job displacement or widening societal gaps?
Share your valued opinions discreetly! Your insights could shine in our next issue for the 39K-strong AI community. Join the conversation! 🗨️✨
As a big thanks, get our bestselling "Interactive Data Visualization with Python - Second Edition" in PDF. Let's make AI_Distilled even more awesome! 🚀 Jump on in!
Share your thoughts and opinions here!

Writer's Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week's newsletter content!
Cheers,
Merlyn Shelley
Editor-in-Chief, Packt

SignUp | Advertise | Archives

⚡ TechWave: AI/GPT News & Analysis

New Launches & Industry Updates:

⭐ Microsoft Copilot Integrates with GenAI Music App Suno for Song Composition: Microsoft Copilot has partnered with GenAI music app Suno, enabling users to create complete songs including lyrics, instrumentals, and singing voices. Accessible via Microsoft Edge, the integration aims to make music creation inclusive and enjoyable. However, ethical and legal concerns persist, with some artists uncomfortable with AI algorithms learning from their work without consent or compensation. Suno attempts to address such issues by blocking certain prompts and preventing the generation of covers using existing lyrics. Read Microsoft's official blog here.

⭐ Google Plans Potential Layoffs Amidst AI Integration in Ad Sales: Google is reportedly considering laying off around 30,000 employees within its ad sales division due to the implementation of internal AI, aiming for improved operational efficiency. The restructuring primarily targets the ad sales team, reflecting Google's exploration of AI benefits in operational processes. Earlier in 2023, Google had already laid off 12,000 employees, emphasizing the need for organizational adaptation amidst evolving global dynamics. Read about other significant 2023 layoffs here.

⭐ GitHub Expands Copilot Chat Availability for Developers: GitHub is extending the availability of Copilot Chat, a programming-centric chatbot powered by GPT-4, to all users. The tool was initially launched for Copilot for Business subscribers and later in beta for $10 per month users. Integrated into Microsoft's IDEs, Visual Studio Code and Visual Studio, it's included in GitHub Copilot's paid tiers and free for verified teachers, students, and maintainers of specific open-source projects. Developers can prompt Copilot Chat in natural language, seeking real-time guidance on code-related tasks. Know more about Copilot Chat here.

AI in Healthcare:

⭐ AI Streamlining Health Insurance Shopping Process: Companies are utilizing AI to simplify the often complex and tedious task of shopping for health insurance, aiming to guide consumers to better and more affordable options. With many Americans sticking to their health plans due to the difficulty of predicting their future healthcare needs, AI-powered tools gather individual information and predict the most suitable health plans. Alight, a cloud-based HR services provider, reports that 95% of its served employers use AI technology, including a virtual assistant, for employee health benefits selection.

⭐ Revolutionizing Healthcare with AI Stethoscope on Smartphones: A startup, AI Health Highway, is addressing the challenge of limited access to specialists in healthcare by introducing an innovative solution, AI Steth, which combines traditional stethoscope use with cutting-edge signal processing and AI. Targeting the early detection and prediction of heart and lung disorders, the device transforms sound patterns into visual representations on smartphones, allowing non-specialists like family physicians and nurses to examine patients effectively. AI Steth has shown exceptional accuracy in murmur detection, paving the way for more objective and efficient diagnoses. Discover AI Health Highway's work here.

⭐ Generative AI's Impact on Mental Health Counseling: Generative AI is finding use in mental health counseling, sparking discussions about its potential to assist or even replace human therapists.
Recent research testing ChatGPT on mental health counseling questions has raised questions about the technology's role in therapy. AI therapy has evolved from basic chatbots to sophisticated entities capable of nuanced emotional responses, offering accessible mental health support 24/7. While the benefits are evident, challenges such as risk, coverage, and ethical considerations must be addressed for responsible implementation.

AI in Finance:

⭐ Next-Gen Banks to Leverage AI for Financial Influence and Support: Experts predict that next-generation banks will harness generative AI to impact various aspects of financial services, ranging from influencing customer decisions to identifying vulnerable clients. Tom Merry, Head of Banking Strategy at Accenture, suggests that generative AI could significantly influence banking operations, touching nearly every aspect. While the UK banking industry has been utilizing AI for fraud detection and risk analysis, the introduction of generative AI, capable of creating novel solutions based on extensive data, is gaining traction.

⭐ Invest Qatar Introduces Cutting-Edge Azure OpenAI GPT-Powered Chatbot: Invest Qatar, in collaboration with Microsoft, has launched Ai.SHA, an innovative AI-powered chatbot utilizing GPT capabilities through the Azure OpenAI service. This move positions Invest Qatar as a pioneer among investment promotion agencies globally, embracing advanced technology to transform interactions between investors and businesses in Qatar. Ai.SHA acts as a comprehensive resource, providing information on business opportunities, the investment ecosystem, and business setup in Qatar.

AI in Supply Chain Management:

⭐ AI Safeguards Supply Chains Amidst Holiday Challenges: Businesses face unique challenges in managing complex supply chains amid the holiday season, from counterfeit airplane parts to recalls affecting festive foods. The reliance on suppliers underscores the need for transparency and visibility to prevent disruptions caused by supplier misconduct. Leveraging AI in contracts offers a solution, allowing businesses to streamline due diligence, enhance visibility, conduct predictive analytics, and align with environmental, social, and governance (ESG) regulations. AI-powered contracts emerge as vital tools to proactively address supply chain challenges and ensure customer trust during the holiday season and beyond.

⭐ AI-SaaS Integration Revolutionizes E-commerce Analytics: In the logistics sector, where precision and speed are critical, SaaS coupled with AI is transforming traditional approaches. This integration allows for real-time data processing and learning from it, offering unprecedented insights and optimization capabilities. Learn how AI-SaaS integration streamlines inventory, boosts operational efficiency, and fortifies against fraud, becoming the recipe for e-commerce success in a hypercompetitive landscape.

🔮 Expert Insights from Packt Community

Architectural Patterns and Techniques for Developing IoT Solutions - By Jasbir Singh Dhaliwal

Unique requirements of IoT use cases

IoT use cases tend to have very unique requirements concerning power consumption, bandwidth, analytics, and more. Additionally, the inherent complexity of IoT implementations (computationally challenged field devices on one end of the spectrum vis-à-vis the almost infinite capacity of the cloud on the other) forces architects to make difficult architectural decisions and implementation choices.
Before presenting the various IoT patterns, it is worth mentioning the unique expectations from IoT architectures that are different from non-IoT architectures:

Sensing events and actuation commands have a wide range of latency expectations – from real-time to fire and forget.

Data analysis results need to be reported/visualized/consumed on a variety of consumer devices – mobiles, desktops, tablets, and more. Similarly, data consumers have diverse backgrounds, data needs, and application roles (personas).

One is often forced to integrate with legacy as well as cutting-edge devices and/or external systems – very few trivial use cases have isolated/standalone architectures.

There is a considerable difference in the way the data is extracted from legacy versus non-legacy systems – legacy systems may internally collate the data and then push it to the external port (file transfer), whereas newer systems may push the data in a continuous stream (time-series data). This variability is one of the critical considerations when choosing a particular IoT architectural pattern.

Varied deployment requirements – edge, on-premise, hybrid, the cloud, and more.

Adherence to strict regulatory compliances, especially in medical and aeronautical domains.

There are expectations considering immediate payback, return on investment (ROI), business outcomes, and new service business models.

Continuous innovation, which results in new services or offerings (especially by cloud vendors), forcing IoT architectures to be in continuous sync mode with these new offerings or services.

This is an excerpt from the book Architectural Patterns and Techniques for Developing IoT Solutions written by Jasbir Singh Dhaliwal and published in Sep '23. To see what's inside the book, read the entire chapter here or try a 7-day free trial to access the full Packt digital library. To discover more, click the button below.

Read through the Chapter 1 unlocked here...

🌟 Secret Knowledge: AI/LLM Resources

⭐ Understanding the Prompt Development Life Cycle: Explore the PDLC and gain insights into how prompt engineering mirrors software development. The primer unfolds a step-by-step guide, beginning with the Initial Build phase, where an imperfect prompt is crafted, incorporating techniques like zero-shot and few-shot. The Optimization stage strategically refines prompts based on historical data. Finally, the Fine-tune phase demonstrates the refinement of models, emphasizing the importance of continuous tracking.

⭐ Building Platforms with LLMs: Overcoming Challenges in Summarization as a Service: Get to know more about Summarization as a Service, a platform designed by a Microsoft team for Viva Engage. Learn about the complexities of prompt design, ensuring accuracy and grounding, addressing privacy and compliance concerns, managing performance, cost, and availability of LLM services, and integrating outputs seamlessly with Copilot and other Viva Engage features.

⭐ Understanding the Risks of Prompt Injection in LLM Applications: Explore the intricacies of prompt injection in LLM applications. The author emphasizes the critical security implications and potential impacts, citing the OWASP Top 10 for LLM Applications. Drawing parallels to injection vulnerabilities like A03 in traditional security, the article illustrates potential risks through a thought experiment involving a robotic server.
⭐ Creating an Open Source LLM Recommender System: Mastering Prompt Iteration and Optimization: Open Recommender is an open-source YouTube video recommendation system adept at tailoring content to your interests based on Twitter feed analysis. Discover its data pipeline, utilizing GPT-4, and the transition towards cost-effective open-source models using OpenPipe. Explore the challenges faced during prompt iteration, with a focus on better prompt engineering tools, including the introduction of a TypeScript library, Prompt Iteration Assistant.

🔛 Masterclass: AI/LLM Tutorials

⭐ Exploring Google's Gemini Pro Vision LLM with Javascript: A Practical Guide: The blog introduces the concept of multi-modal LLMs capable of interpreting various data modes, including images. Learn how to utilize Google's multi-modal Gemini Pro Vision LLM with Javascript. The tutorial guides you through creating an AI-powered nutrition-fact explainer app using the newly released LLM. The tutorial covers prerequisites, such as installing node.js and obtaining a Gemini LLM API key.

⭐ Accelerating AI Application Productionization: A Guide with SageMaker JumpStart, Amazon Bedrock, and TruEra: The post emphasizes the importance of observability in LLM applications and provides insights into evaluating responses for honesty, harmlessness, and helpfulness. You'll learn how to deploy, fine-tune, and iterate on foundation models for LLM applications using Amazon SageMaker JumpStart, Amazon Bedrock, and TruEra.

⭐ Quantizing LLMs with Activation-aware Weight Quantization (AWQ): Explore the application of Activation-aware Weight Quantization (AWQ) to democratize LLMs like Llama-2, making them more accessible for deployment on regular CPUs or less powerful GPUs. The process involves setting up a GPU instance, installing necessary packages like AutoAWQ and transformers, and saving the quantized model. The tutorial further covers the model upload to the Hugging Face Model Hub and concludes with the successful reduction of the Llama-2 model from ~27GB to ~4GB, enhancing its efficiency for diverse applications. A hedged code sketch of this workflow follows this section.

⭐ Unlocking Your MacBook's AI Potential: Running 70B LLM Models Without Quantization: Discover how to unleash the hidden AI power of your 8GB MacBook as this post explores the latest 2.8 version of AirLLM. Without the need for quantization or model compression, an ordinary MacBook can now efficiently run top-tier 70 billion parameter models. Explore the MacBook's AI capabilities, understanding Apple's role in AI evolution through its M1, M2, and M3 series GPUs, which offer competitive performance in the era of generative AI. Gain insights into GPU capabilities, memory advantages, and the open-source MLX platform.
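For a sense of what the AWQ workflow in the tutorial above looks like in code, here is a minimal sketch based on the AutoAWQ library's documented usage. The model path and quantization settings are illustrative, and the exact API may differ between library versions:

# Sketch of 4-bit AWQ quantization with AutoAWQ (API per its docs at the
# time of writing; details may vary across versions).
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "meta-llama/Llama-2-7b-hf"   # illustrative; any supported causal LM
quant_path = "llama-2-7b-awq"
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

model.quantize(tokenizer, quant_config=quant_config)  # activation-aware calibration
model.save_quantized(quant_path)                      # writes the 4-bit weights
tokenizer.save_pretrained(quant_path)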
🚀 HackHub: Trending AI Tools

⭐ Giskard-AI/giskard: Specialized testing framework for ML models, covering a range from tabular models to LLMs. Developers can efficiently scan AI models using just four lines of code.

⭐ CopilotKit/CopilotKit: Build in-app AI chatbots that seamlessly interact with the app state, execute actions within the app, and communicate with frontend, backend, and third-party services via plugins, serving as an AI "second brain" for users.

⭐ chengzeyi/stable-fast: Leverage stable-fast for efficient and high-performance inference on various diffuser models while enjoying fast model compilation and out-of-the-box support for dynamic shape, LoRA, and ControlNet.

⭐ ml-explore/mlx: Array framework for machine learning on Apple silicon by Apple's ML research team, offering familiar Python and C++ APIs closely aligned with NumPy and PyTorch.
Detecting & Addressing LLM 'Hallucinations' in Finance

James Bryant, Alok Mukherjee
04 Jan 2024
9 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

This article is an excerpt from the book, The Future of Finance with ChatGPT and Power BI, by James Bryant, Alok Mukherjee. Enhance decision-making, transform your market approach, and find investment opportunities by exploring AI, finance, and data visualization with ChatGPT's analytics and Power BI's visuals.

Introduction

LLMs, such as OpenAI's GPT series, can sometimes generate responses that are referred to as "hallucinations." These are instances where the output from the model is factually incorrect, presents information that the model could not possibly know (given that it doesn't have access to real-time or personalized data), or is nonsensical or highly improbable.

Let's take a deeper look at what hallucinations are, how to identify them, and what steps can be taken to mitigate their impact, especially in contexts where accurate and reliable information is crucial, such as financial analysis, trading, or visual data presentations.

Understanding hallucinations

Let's look at some examples:

Factual inaccuracies: Suppose an LLM provides information stating that Apple Inc. was founded in 1985. This is a clear factual inaccuracy because Apple was founded in 1976.

Speculative statements: If an LLM were to suggest that "As of 2023, Tesla's share price has hit $3,000," this is a hallucination. The model doesn't know real-time data, and any post-2021 prediction or speculation it makes about specific stock prices is unfounded.

Confident misinformation: For instance, if an LLM confidently states that "Amazon declared bankruptcy in late 2022," this is a hallucination and can have serious consequences if it's acted upon without verification.

How can we spot hallucinations?

Here are some useful ways to spot hallucinations:

Cross-verification: If an LLM suggests an unusual trading strategy, such as shorting a typically stable blue-chip stock based on some supposed insider information, always cross-verify this advice with other reliable sources or consult a financial advisor.

Questioning the source: If an LLM claims that "our internal data shows a bullish trend for cryptocurrency X," this is likely a hallucination. The model doesn't have access to proprietary internal data.

Time awareness: If the model provides information or trends post-September 2021 without the user explicitly asking for a hypothetical or simulated scenario, consider this a red flag. For example, GPT-4 giving specific "real-time" market cap values for companies in 2023 would be a hallucination.

What can we do about hallucinations?

Here are some ideas:

Promote awareness: If you are developing an AI-assisted trading app that uses an LLM, ensure users are aware of potential hallucinations, perhaps with a disclaimer or notification upon usage.

Implement checks: You might integrate a news API that could help validate major financial events or claims made by the model.

Minimizing hallucinations in the future

There are various ways we can minimize hallucinations. Here are some examples:

Training improvements: Imagine developing a better model that understands context and sticks to the known data more closely, avoiding speculative or incorrect financial statements. Future versions of the model could be specifically trained on financial data, news, and reports to understand the context and semantics of financial trading and investment better.
We could do this to ensure that it understands a short squeeze scenario accurately, or is aware that penny stocks typically come with higher risks.

Better evaluation metrics: For instance, develop a specific metric that calculates the percentage of the model's outputs that were flagged as hallucinations during testing. In the development phase, the models could be evaluated on more focused tasks such as generating valid trading strategies or predicting the impact of certain macroeconomic events on stock prices. The better the model performs on these tasks, the lower the chance of hallucinations occurring.

Post-processing methods: Develop an algorithm that cross-references model outputs against reliable financial data sources and flags potential inaccuracies. After the model generates a potential trading strategy or investment suggestion, this output could be cross-verified using a rules-based system. For instance, if the model suggests shorting a stock that has consistently performed well without any recent negative news or poor earnings reports, the system might flag this as a potential hallucination.

As an example, you can use libraries such as yfinance or pandas_datareader to access real-time or historical financial data:

!pip install yfinance pandas_datareader

import yfinance as yf

def get_stock_data(ticker, start, end):
    # Pull historical prices for one ticker over a date range
    stock = yf.Ticker(ticker)
    data = stock.history(start=start, end=end)
    return data

# Example usage:
data = get_stock_data("AAPL", "2021-01-01", "2023-01-01")

You could also develop a cross-verification algorithm and compare the model's outputs with the collected financial data to flag potential inaccuracies.

Integration with real-time data: While creating Power BI visualizations, data that's been pulled from the LLM could be cross-verified with real-time data from financial databases or APIs. Any discrepancies, such as inconsistent market share percentages or revenue growth rates, could be flagged. This reduces the risk of presenting hallucinated data in visualizations. Let's look at some examples:

Extracting real-time data: You can continue to use yfinance or pandas_datareader to extract real-time data.

Cross-verifying with real-time data: You can compare the model's output with real-time data to identify discrepancies:

def real_time_cross_verify(output, real_time_data):
    # 'output' is the model's claim: a dict with keys 'market_share',
    # 'revenue_growth', and 'ticker'. 'real_time_data' holds the verified
    # figures for the same ticker, fetched, for example, by a helper such
    # as get_real_time_data(ticker) built on yfinance or another data API.
    if abs(output['market_share'] - real_time_data['market_share']) > 0.05 or \
       abs(output['revenue_growth'] - real_time_data['revenue_growth']) > 0.05:
        return True  # Flagged as a potential hallucination
    return False  # Not flagged

# Example usage:
output = {'market_share': 0.25, 'revenue_growth': 0.08, 'ticker': 'AAPL'}
real_time_data = {'market_share': 0.24, 'revenue_growth': 0.07, 'ticker': 'AAPL'}
flagged = real_time_cross_verify(output, real_time_data)  # False: both gaps fall within the 0.05 tolerance

User feedback loop: A mechanism can be incorporated to allow users to report potential hallucinations. For instance, if a user spots an error in the LLM's output during a Power BI data analysis session, they can report this. Over time, these reports can be used to further train the model and reduce hallucinations.
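To make the feedback loop concrete, here is a minimal sketch of how flagged outputs could be captured for later review or fine-tuning. The file name and record fields are illustrative choices, not code from the book:

import json
from datetime import datetime, timezone

def report_hallucination(prompt, model_output, user_note,
                         log_path="flagged_outputs.jsonl"):
    # Append each user report as one JSON line; the log can later be
    # reviewed and folded into evaluation or fine-tuning datasets.
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prompt": prompt,
        "model_output": model_output,
        "user_note": user_note,
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")

# Example usage:
report_hallucination(
    prompt="Summarize Tesla's latest earnings.",
    model_output="Tesla declared bankruptcy in late 2022.",
    user_note="Factually wrong; no such filing exists.",
)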
OpenAI is on the case

To tackle the chatbot's missteps, OpenAI engineers are working on ways for its AI models to reward themselves for outputting correct data when moving toward an answer, instead of rewarding themselves only at the point of conclusion. The system could lead to better outcomes as it incorporates more of a human-like chain-of-thought procedure, according to the engineers.

These examples should help in illustrating the concept and risks of LLM hallucinations, particularly in high-stakes contexts such as finance. As always, these models should be seen as powerful tools for assistance, but not as a final authority.

Trading examples

Hallucination scenario: Let's assume you've asked an LLM for a prediction on the future performance of a specific stock, let's say Tesla. The LLM might generate a response that appears confident and factual, such as "Based on the latest earnings report, Tesla has declared bankruptcy." If you acted on this hallucinated information, you might rush to sell Tesla shares only to find out that Tesla is not bankrupt at all. This is an example of a potentially disastrous hallucination.

Action: Before making any trading decision based on the LLM's output, always cross-verify the information from a reliable financial news source or the company's official communications.

Power BI visualization examples

Hallucination scenario: Suppose you're using an LLM to generate text descriptions for a Power BI dashboard that tracks the market share of different automakers in the EV market. The LLM might hallucinate and produce a statement such as "Rivian has surpassed Tesla in terms of global EV market share." This statement might be completely inaccurate, as Tesla had a significantly larger market share than Rivian.

Action: When using LLMs to generate text descriptions or insights for your Power BI dashboards, it's crucial to cross-verify any assertions that are made by the model. You can do this by cross-referencing the underlying data in your Power BI dashboard or by referring to reliable external sources of information.

To minimize hallucinations in the future, the model can be fine-tuned with a dataset that's been specifically curated to cover the relevant domain. The use of a structured validation set can help spot and rectify hallucinations during the model training process. Also, employing a robust fact-checking mechanism on the output of the model before acting on its suggestions or insights can help catch and rectify any hallucinations.

Remember, while LLMs can provide valuable insights and suggestions, their output should always be used as one of many inputs in your decision-making process, particularly in high-stakes environments such as financial trading and analysis.

Conclusion

In the dynamic world of financial analysis and data visualization, the presence of LLM 'hallucinations' poses a challenge. Awareness, verification, and ongoing improvement strategies stand as pillars against these inaccuracies. While LLMs offer invaluable support, their outputs must be scrutinized, verified, and used as one among many tools in decision-making.
As we navigate this landscape, vigilance, continuous refinement, and a critical eye will fortify our ability to harness the power of LLMs while mitigating the risks they present in high-stakes financial contexts.

Author Bio

James Bryant, a finance and technology expert, excels at identifying untapped opportunities and leveraging cutting-edge tools to optimize financial processes. With expertise in finance automation, risk management, investments, trading, and banking, he's known for staying ahead of trends and driving innovation in the financial industry. James has built corporate treasuries like Salesforce and transformed companies like Stanford Health Care through digital innovation. He is passionate about sharing his knowledge and empowering others to excel in finance. Outside of work, James enjoys skiing with his family in Lake Tahoe, running half marathons, and exploring new destinations and culinary experiences with his wife and daughter.

Aloke Mukherjee is a seasoned technologist with over a decade of experience in business architecture, digital transformation, and solutions architecture. He excels at applying data-driven solutions to real-world problems and has proficiency in data analytics and planning. Aloke worked at EMC Corp and Genentech and currently spearheads the digital transformation of Finance Business Intelligence at Stanford Health Care. In addition to his work, Aloke is a Certified Personal Trainer and is passionate about helping his clients stay fit. Aloke also has a passion for wine and exploring new vineyards.
AI_Distilled #28: Gen AI - Reshaping Industries, Redefining Possibilities

Merlyn Shelley
15 Dec 2023
12 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

👋 Hello ,

"Once in a while, technology comes along that is so powerful and so broadly applicable that it accelerates the normal march of economic progress. And like a lot of economists, I believe that generative AI belongs in that category." - Andrew McAfee, Principal Research Scientist, MIT Sloan School of Management

This vividly showcases the kaleidoscope of possibilities Gen AI unlocks as it emerges from its cocoon, orchestrating a transformative symphony across realms from medical science to office productivity. Take Google's newly released AlphaCode 2, for example, which achieves human-level proficiency in programming, or Meta's Audiobox, which pioneers next-generation audio production.

Welcome to AI_Distilled #28, your ultimate guide to the latest advancements in AI, ML, NLP, and Gen AI. This week's highlights include:

📚 Unlocking the Secrets of Geospatial Data: Dive into Bonny P. McClain's new book, "Geospatial Analysis with SQL," and master the art of manipulating data across diverse geographical landscapes. Learn foundational concepts and explore advanced spatial algorithms for a transformative journey. 🌍

Let's shift our focus to the most recent updates and advancements in the AI industry:
Microsoft Forms Historic Alliance with Labor Unions to Address AI Impact on Workers
Meta's Audiobox Advances Unified Audio Generation with Enhanced Controllability
Europe Secures Deal on World's First Comprehensive AI Rules
Google DeepMind Launches AlphaCode 2: Advancing AI in Competitive Programming Collaboration
Stable LM Releases Zephyr 3B: Compact and Powerful Language Model for Edge Devices
Meta Announces Purple Llama: Advancing Open Trust and Safety in Generative AI
Google Cloud Unveils Cloud TPU v5p and AI Hypercomputer for Next-Gen AI Workloads
Elon Musk's xAI Chatbot Launches on X

We've also got your fresh dose of GPT and LLM secret knowledge and tutorials:
A Primer on Enhancing Output Accuracy Using Multiple LLMs
Unlocking the Potential of Prompting: Steering Frontier Models to Record-Breaking Performance
Navigating Responsible AI: A Comprehensive Guide to Impact Assessment
Enhancing RAG-Based Chatbots: A Guide to RAG Fusion Implementation
Evaluating Retrieval-Augmented Generation (RAG) Applications with RAGAs Framework

Last but not least, don't miss out on the hands-on strategies and tips straight from the AI community for you to use on your own projects:
Creating a Vision Chatbot: A Guide to LLaVA-1.5, Transformers, and Runhouse
Fine-Tuning LLMs: A Comprehensive Guide
Building a Web Interface for LLM Interaction with Amazon SageMaker JumpStart
Mitigating Hallucinations with Retrieval Augmented Generation

What's more, we've also shortlisted the best GitHub repositories you should consider for inspiration:
bricks-cloud/BricksLLM
kwaikeg/kwaiagents
facebookresearch/Pearl
andvg3/LSDM

Stay curious and gear up for an intellectually enriching experience!

📥 Feedback on the Weekly Edition
Q: How can we foster effective collaboration between humans and AI systems, ensuring that AI complements human skills and enhances productivity without causing job displacement or widening societal gaps?
Share your valued opinions discreetly! Your insights could shine in our next issue for the 39K-strong AI community. Join the conversation!
🗨️✨ As a big thanks, get our bestselling "Interactive Data Visualization with Python - Second Edition" in PDF. Let's make AI_Distilled even more awesome! 🚀 Jump on in!
Share your thoughts and opinions here!

Writer's Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week's newsletter content!

A quick heads-up: Our team is taking a well-deserved holiday break to recharge and return with fresh ideas. So, there'll be a pause in our weekly updates for the next two weeks. We're excited to reconnect with you in the new year, brimming with new insights and creativity. Wishing you a fantastic holiday season! See you in 2024!

Cheers,
Merlyn Shelley
Editor-in-Chief, Packt

SignUp | Advertise | Archives

⚡ TechWave: AI/GPT News & Analysis

⭐ Microsoft Forms Historic Alliance with Labor Unions to Address AI Impact on Workers: Microsoft is partnering with the American Federation of Labor and Congress of Industrial Organizations, a coalition of 60 labor unions representing 12.5 million workers. They plan to discuss AI's impact on jobs, offer AI training to workers, and encourage unionization with "neutrality" terms. The goal is to improve worker collaboration, influence AI development, and shape policies for frontline workers' tech skills.

⭐ Meta's Audiobox Advances Unified Audio Generation with Enhanced Controllability: Meta researchers have unveiled Audiobox, an advanced audio generation model addressing limitations in existing models. It prioritizes controllability, enabling unique styles via text descriptions and precise management of audio elements. Audiobox excels in speech and sound generation, achieving impressive benchmarks like 0.745 similarity on Librispeech for text-to-speech and 0.77 FAD on AudioCaps for text-to-sound using description and example-based prompts.

⭐ Europe Secures Deal on World's First Comprehensive AI Rules: EU negotiators have achieved a historic agreement on the first-ever comprehensive AI rules, known as the Artificial Intelligence Act. It addresses key issues, such as generative AI and facial recognition by law enforcement, aiming to establish clear regulations for AI while facing criticism for potential exemptions and loopholes.

⭐ Google DeepMind Launches AlphaCode 2: Advancing AI in Competitive Programming Collaboration: Google DeepMind has unveiled AlphaCode 2, a successor to its groundbreaking AI that writes code at a human level. It outperforms 85% of participants in 12 recent Codeforces contests, aiming to collaborate effectively with human coders and promote AI-human collaboration in programming, aiding problem-solving and suggesting code designs.

⭐ Stable LM Releases Zephyr 3B: Compact and Powerful Language Model for Edge Devices: Stable LM Zephyr 3B is a 3 billion parameter lightweight language model optimized for edge devices. It excels in text generation, especially instruction following and Q&A, surpassing larger models in linguistic accuracy. It's ideal for copywriting, summarization, and content personalization on resource-constrained devices, with a non-commercial license.

⭐ Meta Announces Purple Llama: Advancing Open Trust and Safety in Generative AI: Purple Llama is an initiative promoting trust and safety in generative AI. It provides tools like CyberSec Eval for cybersecurity benchmarking and Llama Guard for input/output filtering. Components are permissively licensed to encourage collaboration and standardization in AI safety tools.
⭐ Google Cloud Unveils Cloud TPU v5p and AI Hypercomputer for Next-Gen AI Workloads: Google Cloud has launched the powerful Cloud TPU v5p AI accelerator, addressing the needs of large generative AI models with 2X more FLOPS and 3X HBM. It trains models 2.8X faster than TPU v4 and is 4X more scalable. Google also introduced the AI Hypercomputer, an efficient supercomputer architecture for AI workloads, aiming to boost innovation in AI for enterprises and developers.

⭐ Elon Musk's xAI Chatbot Launches on X: Grok, created by xAI, debuts on X (formerly Twitter) for $16/month to Premium Plus subscribers. It offers conversational answers, similar to ChatGPT and Google's Bard. Grok-1 incorporates real-time X data, providing up-to-the-minute information. Elon Musk praises Grok's rebellious personality, though its intelligence remains comparable to other chatbots. Currently text-only, xAI intends to expand Grok's capabilities to include video, audio, and more.

🔮 Expert Insights from Packt Community

Geospatial Analysis with SQL - By Bonny P McClain

Embark on a captivating journey into geospatial analysis, a field beyond geography enthusiasts! This book reveals how combining geospatial magic with SQL can tackle real-world challenges. Learn to create spatial databases, use SQL queries, and incorporate PostGIS and QGIS into your toolkit.

Key Concepts:

🌍 Foundations:
   - Understand the importance of geospatial analysis.
   - See how location info enhances data exploration.

🗺️ Tobler's Wisdom:
   - Embrace Walter Tobler's second law of geography.
   - Explore how external factors impact the area of interest.

🔍 SQL Spatial Data Science:
   - Master geospatial analysis with SQL.
   - Build databases, write queries, and use handy functions.

🛠️ Toolbox Upgrade:
   - Boost skills with PostGIS and QGIS.
   - Handle data questions and excel in spatial analysis.

Decode geospatial secrets—perfect for analysts and devs seeking location-based insights!

Read through the Chapter 1 unlocked here...
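To give a flavor of the SQL-first spatial analysis the book builds toward, here is a minimal sketch of running a PostGIS proximity query from Python. The connection string, the 'clinics' table, and its columns are invented for illustration; the PostGIS functions used (ST_MakePoint, ST_SetSRID, ST_DWithin, ST_Distance) are standard:

# Minimal sketch: querying a PostGIS-enabled PostgreSQL database from Python.
import psycopg2

conn = psycopg2.connect("dbname=spatial user=analyst")  # illustrative connection
query = """
    SELECT name,
           ST_Distance(geom::geography,
                       ST_SetSRID(ST_MakePoint(%s, %s), 4326)::geography) AS meters_away
    FROM clinics
    WHERE ST_DWithin(geom::geography,
                     ST_SetSRID(ST_MakePoint(%s, %s), 4326)::geography,
                     5000)  -- within 5 km; geography type measures in meters
    ORDER BY meters_away;
"""
lon, lat = -71.06, 42.36
with conn, conn.cursor() as cur:
    cur.execute(query, (lon, lat, lon, lat))
    for name, meters in cur.fetchall():
        print(f"{name}: {meters:.0f} m")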
It involves converting user queries into multiple questions, searching for content in a knowledge base, and re-ranking results. The tutorial aims to enhance semantic search in RAG applications (a minimal sketch of the re-ranking step appears at the end of this issue).

⭐ Evaluating Retrieval-Augmented Generation (RAG) Applications with RAGAs Framework: The article discusses challenges in making a production-ready RAG application, highlighting the need to assess retriever and generator components separately and together. It introduces the RAGAs framework for reference-free evaluation using LLMs, offering metrics for component-level assessment. The article provides a guide to using RAGAs for evaluation, including prerequisites, setup, data preparation, and conducting assessments.

🔛 Masterclass: AI/LLM Tutorials

⭐ Creating a Vision Chatbot: A Guide to LLaVA-1.5, Transformers, and Runhouse: Discover how to build a multimodal conversational model using LLaVA-1.5, Hugging Face Transformers, and Runhouse. The post introduces the significance of multimodal conversational models, blending language and visual elements. It emphasizes the limitations of closed-source models, showcasing open-source alternatives. The tutorial includes Python code available on GitHub for deploying a vision chat assistant, providing a step-by-step guide. LLaVA-1.5, with its innovative visual embeddings, is explained, highlighting its lightweight training and impressive performance. The implementation builds the vision chatbot around standardized chat templates, and the Runhouse platform simplifies deployment on various infrastructures.

⭐ Fine-Tuning LLMs: A Comprehensive Guide: Explore the potential of fine-tuning OpenAI's LLMs to revolutionize tasks such as customer support chatbots and financial data analysis. Learn how fine-tuning enhances LLM performance on specific datasets and discover use cases in customer support and finance. The guide walks you through the step-by-step process of fine-tuning, from preparing a training dataset to creating and using a fine-tuned model. Experience how fine-tuned LLMs, exemplified by GPT-3.5 Turbo, can transform natural language processing, opening new possibilities for diverse industries and applications.

⭐ Building a Web Interface for LLM Interaction with Amazon SageMaker JumpStart: Embark on a comprehensive guide to creating a web user interface, named Chat Studio, enabling seamless interaction with LLMs like Llama 2 and Stable Diffusion through Amazon SageMaker JumpStart. Learn how to deploy SageMaker foundation models, set up AWS Lambda, IAM permissions, and run the user interface locally. Explore optional extensions to incorporate additional foundation models and deploy the application using AWS Amplify. This step-by-step tutorial covers prerequisites, deployment, solution architecture, and offers insights into the potential of LLMs, providing a hands-on approach for users to enhance conversational experiences and experiment with diverse pre-trained LLMs on AWS.

⭐ Mitigating Hallucinations with Retrieval Augmented Generation: Delve into a step-by-step guide exploring the deployment of LLMs, specifically Llama-2 from Amazon SageMaker JumpStart. Learn the crucial technique of RAG using the Pinecone vector database to counteract AI hallucinations. The primer introduces source knowledge incorporation through RAG, detailing how to set up Amazon SageMaker Studio for LLM pipelines. Discover two approaches to deploy LLMs using HuggingFaceModel and JumpStartModel.
The guide further illustrates querying pre-trained LLMs and enhancing accuracy by providing additional context.

🚀 HackHub: Trending AI Tools

⭐ bricks-cloud/BricksLLM: Cloud-native AI gateway written in Go enabling the creation of API keys with fine-grained access controls, rate limits, cost limits, and TTLs for both development and production use.

⭐ kwaikeg/kwaiagents: Comprises KAgentSys-Lite with limited tools, KAgentLMs featuring LLMs with agent capabilities, KAgentInstruct providing finetuning data, and KAgentBench offering over 3,000 human-edited evaluations for testing agent capabilities.

⭐ facebookresearch/Pearl: Production-ready Reinforcement Learning AI agent library from Meta prioritizing long-term feedback, adaptability to diverse environments, and resilience to limited observability.

⭐ andvg3/LSDM: Official implementation of a NeurIPS 2023 paper on Language-driven Scene Synthesis using a Multi-conditional Diffusion Model.

AI_Distilled Talkback: Unmasking the Community Buzz!

💬 Q: "How can we foster effective collaboration between humans and AI systems, ensuring that AI complements human skills and enhances productivity without causing job displacement or widening societal gaps?"

💭 "With providing more information on LLM."

Share your thoughts here! Your opinions matter—let's make this space a reflection of diverse perspectives.
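Editor's addendum: to make the RAG Fusion write-up above concrete, here is a minimal, self-contained sketch of its re-ranking step, reciprocal rank fusion (RRF). This is our own illustration rather than code from the tutorial; the k=60 constant is a commonly used default, and the document ids are invented for the example.

    from collections import defaultdict

    def reciprocal_rank_fusion(result_lists, k=60):
        # Reward documents that rank highly in several result lists;
        # the k constant damps the influence of any single list.
        scores = defaultdict(float)
        for results in result_lists:
            for rank, doc_id in enumerate(results):
                scores[doc_id] += 1.0 / (k + rank + 1)
        return sorted(scores, key=scores.get, reverse=True)

    # Rankings returned for two sub-questions generated from one user query.
    lists = [
        ["doc_a", "doc_b", "doc_c"],
        ["doc_b", "doc_d", "doc_a"],
    ]
    print(reciprocal_rank_fusion(lists))  # doc_a and doc_b rise to the top

In a full RAG Fusion pipeline, an LLM would first expand the user query into the sub-questions, and the fused ranking would feed the generation step.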
AI_Distilled #28: Unveiling Innovations Reshaping Our World

Merlyn Shelley
11 Dec 2023
13 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

👋 Hello,

"Generative AI has the potential to change the world in ways that we can't even imagine. It has the power to create new ideas, products, and services that will make our lives easier, more productive, and more creative. It also has the potential to solve some of the world's biggest problems, such as climate change, poverty, and disease."

-Bill Gates, Microsoft Co-Founder

Microsoft Bing's new Deep Search functionality is a case in point — Bing will now create AI prompts itself to provide detailed insights to user queries in ways traditional search engines can't match. Who could have thought LLMs would progress so much that they would eventually prompt themselves? Even Runway ML is onto something big with its groundbreaking technology that creates realistic AI-generated videos that will find their way to Hollywood.

Welcome back to a new issue of AI_Distilled - your one-stop destination for all things AI, ML, NLP, and Gen AI. Let's get started with the latest news and developments across the AI sector:

Elon Musk's xAI Initiates $1 Billion Funding Drive in AI Race
Bing's New Deep Search Expands Queries
AI Takes Center Stage in 2023 Word of the Year Lists
OpenAI Announces Delay in GPT Store Launch to Next Year
ChatGPT Celebrates First Anniversary with 110M Installs and $30M Revenue Milestone
Runway ML and Getty Images Collaborate on AI Video Models for Hollywood and Advertising

We've also curated the latest GPT and LLM resources, tutorials, and secret knowledge:

Unlocking AI Magic: A Primer on 7 Essential Libraries for Developers
Efficient LLM Fine-Tuning with QLoRA on a Laptop
Rapid Deployment of Large Open Source LLMs with Runpod and vLLM's OpenAI Endpoint
Understanding Strategies to Enhance Retrieval-Augmented Generation (RAG) Pipeline Performance
Understanding and Mitigating Biases and Toxicity in LLMs

Finally, don't forget to check out our hands-on tips and strategies from the AI community for you to use on your own projects:

A Step-by-Step Guide to Streamlining LLM Data Processing for Efficient Pipelines
Fine-Tuning Mistral Instruct 7B on the MedMCQA Dataset Using QLoRA
Accelerating Large-Scale Training: A Comprehensive Guide to Amazon SageMaker Data Parallel Library
Enhancing LoRA-Based Inference Speed: A Guide to Efficient LoRA Decomposition

Looking for some inspiration? Here are some GitHub repositories to get your projects going!

tacju/maxtron
Tanuki/tanuki.py
roboflow/multimodal-maestro
03axdov/muskie

Also, don't forget to check our expert insights column, which covers the interesting concepts of NLP from the book 'The Handbook of NLP with Gensim'. It's a must-read!

Stay curious and gear up for an intellectually enriching experience!

📥 Feedback on the Weekly Edition

Quick question: How can we foster effective collaboration between humans and AI systems, ensuring that AI complements human skills and enhances productivity without causing job displacement or widening societal gaps?

Share your valued opinions discreetly! Your insights could shine in our next issue for the 39K-strong AI community. Join the conversation!

🗨️✨ As a big thanks, get our bestselling "Interactive Data Visualization with Python - Second Edition" in PDF.

Let's make AI_Distilled even more awesome! 🚀 Jump on in! Share your thoughts and opinions here!
Writer's Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week's newsletter content!

Cheers,
Merlyn Shelley
Editor-in-Chief, Packt

SignUp | Advertise | Archives

⚡ TechWave: AI/GPT News & Analysis

🏐 Elon Musk's xAI Initiates $1 Billion Funding Drive in AI Race: xAI is on a quest to secure $1 billion in equity, aiming to stay competitive with tech giants like OpenAI, Microsoft, and Google in the dynamic AI landscape. Having already amassed $135 million from investors, xAI disclosed its total funding goal in a filing with the US Securities and Exchange Commission.

🏐 AI Alliance Launched by Tech Giants IBM and Meta: IBM and Meta have formed a new "AI Alliance" with over 50 partners to promote open and responsible AI development. Members include Dell, Intel, CERN, NASA, and Sony. The alliance envisions fostering an open AI community for researchers and developers, and members can benefit whether or not they openly share their models.

🏐 Bing's New Deep Search Expands Queries: Microsoft is testing a new Bing feature called Deep Search that uses GPT-4 to expand search queries before providing results. Deep Search displays the expanded topics in a panel for users to select the one that best fits what they want to know. It then tailors the search results to that description. Microsoft says the feature can take up to 30 seconds due to the AI generation.

🏐 AI Takes Center Stage in 2023 Word of the Year Lists: In 2023, AI dominates tech, influencing "word of the year" choices. Cambridge picks "hallucinate" for AI's tendency to invent information; Merriam-Webster chooses "authentic" to address AI's impact on reality. Oxford recognizes "prompt" for its evolved role in instructing generative AI, reflecting society's increased integration of AI into everyday language and culture.

🏐 OpenAI Announces Delay in GPT Store Launch to Next Year: OpenAI delays the GPT store release until next year, citing unexpected challenges and postponing the initial December launch plan. Despite recent challenges, including CEO changes and employee unrest, development continues, and updates for ChatGPT are expected. The GPT store aims to be a marketplace for users to sell and share custom GPTs, with creators compensated based on usage.

🏐 ChatGPT Celebrates First Anniversary with 110M Installs and $30M Revenue Milestone: ChatGPT's mobile apps, launched in May 2023 on iOS and later on Android, have exceeded 110 million installs, yielding nearly $30 million in revenue. The success is fueled by the ChatGPT Plus subscription, offering perks. Despite competition, downloads surge, with Android hitting 18 million in a week. The company expects continued growth by year-end 2023.

🏐 Runway ML and Getty Images Collaborate on AI Video Models for Hollywood and Advertising: NYC video AI startup Runway ML, backed by Google and NVIDIA, announces a partnership with Getty Images for the Runway <> Getty Images Model (RGM), a generative AI video model. Targeting Hollywood, advertising, media, and broadcasting, it enables customized content workflows for Runway enterprise customers.

🔮 Expert Insights from Packt Community

The Handbook of NLP with Gensim - By Chris Kuo

NLU + NLG = NLP

NLP is an umbrella term that covers natural language understanding (NLU) and NLG. We'll go through both in the next sections.

NLU

Many languages, such as English, German, and Chinese, have been developing for hundreds of years and continue to evolve. Humans can use languages artfully in various social contexts.
Now, we are asking a computer to understand human language. What's very rudimentary to us may not be so apparent to a computer. Linguists have contributed much to the development of computers' understanding in terms of syntax, semantics, phonology, morphology, and pragmatics.

NLU focuses on understanding the meaning of human language. It extracts text or speech input and then analyzes the syntax, semantics, phonology, morphology, and pragmatics in the language. Let's briefly go over each one:

Syntax: This is about the study of how words are arranged to form phrases and clauses, as well as the use of punctuation, order of words, and sentences.

Semantics: This is about the possible meanings of a sentence based on the interactions between words in the sentence. It is concerned with the interpretation of language, rather than its form or structure. For example, the word "table" as a noun can refer to "a piece of furniture having a smooth flat top that is usually supported by one or more vertical legs" or a data frame in a computer language. NLU can distinguish between such meanings of a word through a technique called word embedding.

Phonology: This is about the study of the sound system of a language, including the sounds of speech (phonemes), how they are combined to form words (morphology), and how they are organized into larger units such as syllables and stress patterns. For example, the sounds represented by the letters "p" and "b" in English are distinct phonemes. A phoneme is the smallest unit of sound in a language that can change the meaning of a word. Consider the words "pat" and "bat." The only difference between these two words is the initial sound, but their meanings are different.

Morphology: This is the study of the structure of words, including the way in which they are formed from smaller units of meaning called morphemes. It originally comes from "morph," the shape or form, and "ology," the study of something. Morphology is important because it helps us understand how words are formed and how they relate to each other. It also helps us understand how words change over time and how they are related to other words in a language. For example, the word "unkindness" consists of three separate morphemes: the prefix "un-," the root "kind," and the suffix "-ness."

Pragmatics: This is the study of how language is used in a social context. Pragmatics is important because it helps us understand how language works in real-world situations, and how language can be used to convey meaning and achieve specific purposes. For example, if you offer to buy your friend a McDonald's burger, a large fries, and a large drink, your friend may reply "no" because he is worried about gaining weight. He may simply mean that the burger meal is high in calories, but in a social context the exchange can also carry that implication.

Now, let's understand NLG.

NLG

While NLU is concerned with reading — a computer comprehending text — NLG is about writing — a computer producing text. The term generation in NLG refers to an NLP model generating meaningful words or even articles. Today, when you compose an email or type a sentence in an app, it presents possible words to complete your sentence or performs automatic correction. These are applications of NLG.

This content is from the book The Handbook of NLP with Gensim - By Chris Kuo (Oct 2023). Start reading a free chapter or access the entire Packt digital library free for 7 days by signing up now. To learn more, click on the button below.
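Editor's aside (not part of the book excerpt): the word-embedding remark above is easy to try with Gensim itself. Below is a minimal sketch with a tiny invented corpus. Note that classic Word2Vec learns a single static vector per word, so the two senses of "table" end up blended into one vector; contextual models are what separate senses cleanly.

    from gensim.models import Word2Vec

    # Toy corpus: "table" appears in a furniture sense and a database sense.
    sentences = [
        ["the", "wooden", "table", "has", "four", "legs"],
        ["put", "the", "plates", "on", "the", "table"],
        ["load", "the", "table", "into", "the", "database"],
        ["query", "the", "table", "for", "matching", "rows"],
    ]

    # Tiny vectors and window because the corpus is tiny.
    model = Word2Vec(sentences, vector_size=16, window=3, min_count=1, seed=42)

    print(model.wv["table"][:4])                   # the word's dense vector (first 4 dims)
    print(model.wv.most_similar("table", topn=3))  # words whose vectors sit nearby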
Read through the Chapter 1 unlocked here...

🌟 Secret Knowledge: AI/LLM Resources

🏀 Unlocking AI Magic: A Primer on 7 Essential Libraries for Developers: Discover seven cutting-edge libraries to enhance development projects with advanced AI features. From CopilotTextarea for AI-driven writing in React apps to PrivateGPT for secure, locally processed document interactions, explore tools that elevate your projects and impress users.

🏀 Efficient LLM Fine-Tuning with QLoRA on a Laptop: Explore QLoRA, an efficient memory-saving method for fine-tuning large language models on ordinary CPUs (a short QLoRA sketch appears at the end of this issue). The QLoRA API supports NF4, FP4, INT4, and INT8 data types for quantization, utilizing methods like LoRA and gradient checkpointing to significantly reduce memory requirements. Learn to implement QLoRA on CPUs, leveraging Intel Extension for Transformers, with experiments showcasing its efficiency on consumer-level CPUs.

🏀 Rapid Deployment of Large Open Source LLMs with Runpod and vLLM's OpenAI Endpoint: Learn to swiftly deploy open-source LLMs into applications with a tutorial, featuring the Llama-2 70B model and AutoGen framework. Utilize tools like Runpod and vLLM for computational resources and API endpoint creation, with a step-by-step guide and the option for non-gated models like Falcon-40B.

🏀 Understanding Strategies to Enhance Retrieval-Augmented Generation (RAG) Pipeline Performance: Learn optimization techniques for RAG applications by focusing on hyperparameters, tuning strategies, data ingestion, and pipeline preparation. Explore improvements in inferencing through query transformations, retrieval parameters, advanced strategies, re-ranking models, LLMs, and prompt engineering for enhanced retrieval and generation.

🏀 Understanding and Mitigating Biases and Toxicity in LLMs: Explore the impact of ethical guidelines on Large Language Model (LLM) development, examining measures adopted by companies like OpenAI and Google to address biases and toxicity. Research covers content generation, jailbreaking, and biases in diverse domains, revealing complexities and challenges in ensuring ethical LLMs.

🔛 Masterclass: AI/LLM Tutorials

🎯 A Step-by-Step Guide to Streamlining LLM Data Processing for Efficient Pipelines: Learn to optimize the development loop for your LLM-powered recommendation system by addressing slow processing times in data pipelines. The solution involves implementing a Pipeline class to save inputs/outputs, enabling efficient error debugging. Enhance developer experience with individual pipeline stages as functions and consider future optimizations like error classes and concurrency.

🎯 Fine-Tuning Mistral Instruct 7B on the MedMCQA Dataset Using QLoRA: Explore fine-tuning Mistral Instruct 7B, an open-source LLM, for medical entrance exam questions using the MedMCQA dataset. Utilize Google Colab, GPTQ version, and LoRA technique for memory efficiency. The tutorial covers data loading, prompt creation, configuration, training setup, code snippets, and performance evaluation, offering a foundation for experimentation and enhancement.

🎯 Accelerating Large-Scale Training: A Comprehensive Guide to Amazon SageMaker Data Parallel Library: This guide details ways to boost Large Language Model (LLM) training speed with Amazon SageMaker's SMDDP.
It addresses challenges in distributed training, emphasizing SMDDP's optimized AllGather for the GPU communication bottleneck, exploring techniques like EFA network usage, GDRCopy coordination, and reduced GPU streaming multiprocessors for improved efficiency and cost-effectiveness on Amazon SageMaker.

🎯 Enhancing LoRA-Based Inference Speed: A Guide to Efficient LoRA Decomposition: The article highlights achieving three times faster inference for public LoRAs using the Diffusers library. It introduces LoRA, a parameter-efficient fine-tuning technique, detailing its decomposition process and benefits, including quick transitions and reduced warm-up and response times in the Inference API.

🚀 HackHub: Trending AI Tools

⚽ tacju/maxtron: Unified meta-architecture for video segmentation, enhancing clip-level segmenters with within-clip and cross-clip tracking modules.

⚽ Tanuki/tanuki.py: Simplifies the creation of apps powered by LLMs in Python by seamlessly integrating well-typed, reliable, and stateless LLM-powered functions into applications.

⚽ roboflow/multimodal-maestro: Empowers developers with enhanced control over large multimodal models, enabling the achievement of diverse outputs through effective prompting tactics.

⚽ 03axdov/muskie: Python-based ML library that simplifies the process of dataset creation and model utilization, aiming to reduce code complexity.
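Since QLoRA appears twice in this issue, here is a minimal sketch of the core idea: load a frozen base model in 4-bit NF4 precision and train only small LoRA adapters on top. This is our own illustration of the common GPU path with Hugging Face transformers, peft, and bitsandbytes (the laptop article above instead uses Intel Extension for Transformers on CPUs); the model id and hyperparameters are illustrative defaults, not values prescribed by the tutorials.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model

    model_id = "mistralai/Mistral-7B-Instruct-v0.1"  # illustrative; any causal LM works

    # 4-bit NF4 quantization for the frozen base model (the "Q" in QLoRA).
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)

    # Attach small trainable LoRA adapters to the attention projections;
    # the quantized base weights stay frozen.
    lora_config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                             target_modules=["q_proj", "v_proj"],
                             task_type="CAUSAL_LM")
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # typically well under 1% of all weights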
Deploying LLMs with Amazon SageMaker - Part 2

Joshua Arvin Lat
30 Nov 2023
19 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

Introduction

In the first part of this post, we showed how easy it is to deploy large language models (LLMs) in the cloud using a managed machine learning service called Amazon SageMaker. In just a few steps, we were able to deploy a MistralLite model in a SageMaker Inference Endpoint. If you've worked on real ML-powered projects in the past, you probably know that deploying a model is just the first step! There are definitely a few more steps before we can consider that our application is ready for use.

If you're looking for the link to the first part, here it is: Deploying LLMs with Amazon SageMaker - Part 1

In this post, we'll build on top of what we already have in Part 1 and prepare a demo user interface for our chatbot application. That said, we will tackle the following sections in this post:

● Section I: Preparing the SageMaker Notebook Instance (discussed in Part 1)
● Section II: Deploying an LLM using the SageMaker Python SDK to a SageMaker Inference Endpoint (discussed in Part 1)
● Section III: Enabling Data Capture with SageMaker Model Monitor
● Section IV: Invoking the SageMaker inference endpoint using the boto3 client
● Section V: Preparing a Demo UI for our chatbot application
● Section VI: Cleaning Up

Without further ado, let's begin!

Section III: Enabling Data Capture with SageMaker Model Monitor

In order to analyze our deployed LLM, it's essential that we're able to collect the requests and responses to a central storage location. Instead of building our own solution that collects the information we need, we can just utilize the built-in Model Monitor capability of SageMaker. Here, all we need to do is prepare the configuration details and run the update_data_capture_config() method of the inference endpoint object and we'll have the data capture setup enabled right away! That being said, let's proceed with the steps required to enable and test data capture for our SageMaker Inference endpoint:

STEP # 01: Continuing where we left off in Part 1 of this post, let's get the bucket name of the default bucket used by our session:

    s3_bucket_name = sagemaker_session.default_bucket()
    s3_bucket_name

STEP # 02: In addition to this, let's prepare and define a few prerequisites as well:

    prefix = "llm-deployment"
    base = f"s3://{s3_bucket_name}/{prefix}"
    s3_capture_upload_path = f"{base}/model-monitor"

STEP # 03: Next, let's define the data capture config:

    from sagemaker.model_monitor import DataCaptureConfig

    data_capture_config = DataCaptureConfig(
        enable_capture=True,
        sampling_percentage=100,
        destination_s3_uri=s3_capture_upload_path,
        kms_key_id=None,
        capture_options=["REQUEST", "RESPONSE"],
        csv_content_types=["text/csv"],
        json_content_types=["application/json"]
    )

Here, we specify that we'll be collecting 100% of the requests and responses that pass through the deployed model.

STEP # 04: Let's enable data capture so that we're able to save the request and response data in Amazon S3:

    predictor.update_data_capture_config(
        data_capture_config=data_capture_config
    )

Note that this step may take about 8-10 minutes to complete.
Feel free to grab a cup of coffee or tea while waiting!

STEP # 05: Let's check if we are able to capture the input request and output response by performing another sample request:

    result = predictor.predict(input_data)[0]["generated_text"]
    print(result)

This should yield the following output:

"The meaning of life is a philosophical question that has been debated by thinkers and philosophers for centuries. There is no single answer that can be definitively proven, as the meaning of life is subjective and can vary greatly from person to person.\n\nSome people believe that the meaning of life is to find happiness and fulfillment through personal growth, relationships, and experiences. Others believe that the meaning of life is to serve a greater purpose, such as through a religious or spiritual calling, or by making a positive impact on the world through their work or actions.\n\nUltimately, the meaning of life is a personal journey that each individual must discover for themselves. It may involve exploring different beliefs and perspectives, seeking out new experiences, and reflecting on what brings joy and purpose to one's life."

Note that it may take a minute or two before the .jsonl file(s) containing the request and response data appear in our S3 bucket.

STEP # 06: Let's prepare a few more examples:

    prompt_examples = [
        "What is the meaning of life?",
        "What is the color of love?",
        "How to deploy LLMs using SageMaker",
        "When do we use Bedrock and when do we use SageMaker?"
    ]

STEP # 07: Let's also define the perform_request() function which wraps the relevant lines of code for performing a request to our deployed LLM model:

    def perform_request(prompt, predictor):
        input_data = {
            "inputs": f"<|prompter|>{prompt}</s><|assistant|>",
            "parameters": {
                "do_sample": False,
                "max_new_tokens": 2000,
                "return_full_text": False,
            }
        }
        response = predictor.predict(input_data)
        return response[0]["generated_text"]

STEP # 08: Let's quickly test the perform_request() function:

    perform_request(prompt_examples[0], predictor=predictor)

STEP # 09: With everything ready, let's use the perform_request() function to perform requests using the examples we've prepared in an earlier step:

    from time import sleep

    for example in prompt_examples:
        print("Input:", example)
        generated = perform_request(
            prompt=example,
            predictor=predictor
        )
        print("Output:", generated)
        print("-"*20)
        sleep(1)

This should return the following:

Input: What is the meaning of life?
...
--------------------
Input: What is the color of love?
Output: The color of love is often associated with red, which is a vibrant and passionate color that is often used to represent love and romance. Red is a warm and intense color that can evoke strong emotions, making it a popular choice for representing love. However, the color of love is not limited to red. Other colors that are often associated with love include pink, which is a softer and more feminine shade of red, and white, which is often used to represent purity and innocence. Ultimately, the color of love is subjective and can vary depending on personal preferences and cultural associations. Some people may associate love with other colors, such as green, which is often used to represent growth and renewal, or blue, which is often used to represent trust and loyalty.
...

Note that this is just a portion of the overall output and you should get a relatively long response for each input prompt.

Section IV: Invoking the SageMaker inference endpoint using the boto3 client

While it's convenient to use the SageMaker Python SDK to invoke our inference endpoint, it's best that we also know how to use boto3 to invoke our deployed model. This will allow us to invoke the inference endpoint from an AWS Lambda function using boto3.

Image 10 — Utilizing API Gateway and AWS Lambda to invoke the deployed LLM

This Lambda function would then be triggered by an event from an API Gateway resource similar to what we have in Image 10. Note that we're not planning to complete the entire setup in this post, but having a working example of how to use boto3 to invoke the SageMaker inference endpoint should easily allow you to build an entire working serverless application utilizing API Gateway and AWS Lambda.

STEP # 01: Let's quickly check the endpoint name of the SageMaker inference endpoint:

    predictor.endpoint_name

This should return the endpoint name with a format similar to what we have below:

    'MistralLite-HKGKFRXURT'

STEP # 02: Let's prepare our boto3 client using the following lines of code:

    import boto3
    import json

    boto3_client = boto3.client('runtime.sagemaker')

STEP # 03: Now, let's invoke the endpoint:

    body = json.dumps(input_data).encode()
    response = boto3_client.invoke_endpoint(
        EndpointName=predictor.endpoint_name,
        ContentType='application/json',
        Body=body
    )
    result = json.loads(response['Body'].read().decode())

STEP # 04: Let's quickly inspect the result:

    result

This should give us the following:

    [{'generated_text': "The meaning of life is a philosophical question that has been debated by thinkers and philosophers for centuries. There is no single answer that can be definitively proven, as the meaning of life is subjective and can vary greatly from person to person..."}]

STEP # 05: Let's try that again and print the output text:

    result[0]['generated_text']

This should yield the following output:

    "The meaning of life is a philosophical question that has been debated by thinkers and philosophers for centuries..."

STEP # 06: Now, let's define perform_request_2 which uses the boto3 client to invoke our deployed LLM:

    def perform_request_2(prompt, boto3_client, predictor):
        input_data = {
            "inputs": f"<|prompter|>{prompt}</s><|assistant|>",
            "parameters": {
                "do_sample": False,
                "max_new_tokens": 2000,
                "return_full_text": False,
            }
        }
        body = json.dumps(input_data).encode()
        response = boto3_client.invoke_endpoint(
            EndpointName=predictor.endpoint_name,
            ContentType='application/json',
            Body=body
        )
        result = json.loads(response['Body'].read().decode())
        return result[0]["generated_text"]

STEP # 07: Next, let's run the following block of code to have our deployed LLM answer the same set of questions using the perform_request_2() function:

    for example in prompt_examples:
        print("Input:", example)
        generated = perform_request_2(
            prompt=example,
            boto3_client=boto3_client,
            predictor=predictor
        )
        print("Output:", generated)
        print("-"*20)
        sleep(1)

This will give us the following output:

Input: What is the meaning of life?
...
--------------------
Input: What is the color of love?
Output: The color of love is often associated with red, which is a vibrant and passionate color that is often used to represent love and romance.
Red is a warm and intense color that can evoke strong emotions, making it a popular choice for representing love. However, the color of love is not limited to red. Other colors that are often associated with love include pink, which is a softer and more feminine shade of red, and white, which is often used to represent purity and innocence. Ultimately, the color of love is subjective and can vary depending on personal preferences and cultural associations. Some people may associate love with other colors, such as green, which is often used to represent growth and renewal, or blue, which is often used to represent trust and loyalty.
...

Given that it may take a few minutes before the .jsonl files appear in our S3 bucket, let's wait for about 3-5 minutes before proceeding to the next section. Feel free to grab a cup of coffee or tea while waiting!

STEP # 08: Let's run the following block of code to list the captured data files stored in our S3 bucket:

    results = !aws s3 ls {s3_capture_upload_path} --recursive
    results

STEP # 09: In addition to this, let's store the list inside the processed variable:

    processed = []
    for result in results:
        partial = result.split()[-1]
        path = f"s3://{s3_bucket_name}/{partial}"
        processed.append(path)

    processed

STEP # 10: Let's create a new directory named captured_data using the mkdir command:

    !mkdir -p captured_data

STEP # 11: Now, let's download the .jsonl files from the S3 bucket to the captured_data directory in our SageMaker Notebook Instance:

    for index, path in enumerate(processed):
        print(index, path)
        !aws s3 cp {path} captured_data/{index}.jsonl

STEP # 12: Let's define the load_json_file() function which will help us load files with JSON content:

    import json

    def load_json_file(path):
        output = []
        with open(path) as f:
            output = [json.loads(line) for line in f]
        return output

STEP # 13: Using the load_json_file() function we defined in an earlier step, let's load the .jsonl files and store them inside the all variable for easier viewing:

    all = []
    for i, _ in enumerate(processed):
        print(f">: {i}")
        new_records = load_json_file(f"captured_data/{i}.jsonl")
        all = all + new_records

    all

Running this will yield the following response:

Image 11 — All captured data points inside the all variable

Feel free to analyze the nested structure stored in the all variable. In case you're interested in how this captured data can be analyzed and processed further, you may check Chapter 8, Model Monitoring and Management Solutions of my 2nd book "Machine Learning Engineering on AWS".

Section V: Preparing a Demo UI for our chatbot application

Years ago, we had to spend a few hours to a few days before we were able to prepare a user interface for a working demo. If you have not used Gradio before, you would be surprised that it only takes a few lines of code to set everything up.
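To see just how little code that is, here is a minimal, self-contained sketch (ours, not from this tutorial) that wires a plain Python function to a web UI; the echo function is a placeholder standing in for any model call:

    import gradio as gr

    def echo(message):
        # Placeholder for a real call such as predictor.predict(...)
        return f"You said: {message}"

    # Interface maps a function's inputs/outputs to UI components.
    demo = gr.Interface(fn=echo, inputs="text", outputs="text")
    demo.launch()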
In the next set of steps, we'll do just that and utilize the model we've deployed in the previous parts of our demo application:

STEP # 01: Continuing where we left off in the previous part, let's install a specific version of gradio using the following command:

    !pip install gradio==3.49.0

STEP # 02: We'll also be using a specific version of fastapi as well:

    !pip uninstall -y fastapi
    !pip install fastapi==0.103.1

STEP # 03: Let's prepare a few examples and store them in a list:

    prompt_examples = [
        "What is the meaning of life?",
        "What is the color of love?",
        "How to deploy LLMs using SageMaker",
        "When do we use Bedrock and when do we use SageMaker?",
        "Try again",
        "Provide 10 alternatives",
        "Summarize the previous answer into at most 2 sentences"
    ]

STEP # 04: In addition to this, let's define the parameters using the following block of code:

    parameters = {
        "do_sample": False,
        "max_new_tokens": 2000,
    }

STEP # 05: Next, define the process_and_respond() function which we'll use to invoke the inference endpoint:

    def process_and_respond(message, chat_history):
        processed_chat_history = ""
        if len(chat_history) > 0:
            for chat in chat_history:
                processed_chat_history += f"<|prompter|>{chat[0]}</s><|assistant|>{chat[1]}</s>"

        prompt = f"{processed_chat_history}<|prompter|>{message}</s><|assistant|>"
        response = predictor.predict({"inputs": prompt, "parameters": parameters})
        parsed_response = response[0]["generated_text"][len(prompt):]
        chat_history.append((message, parsed_response))
        return "", chat_history

STEP # 06: Now, let's set up and prepare the user interface we'll use to interact with our chatbot:

    import gradio as gr

    with gr.Blocks(theme=gr.themes.Monochrome(spacing_size="sm")) as demo:
        with gr.Row():
            with gr.Column():
                message = gr.Textbox(label="Chat Message Box",
                                     placeholder="Input message here",
                                     show_label=True,
                                     lines=12)
                submit = gr.Button("Submit")
                examples = gr.Examples(examples=prompt_examples,
                                       inputs=message)
            with gr.Column():
                chatbot = gr.Chatbot(height=900)
        submit.click(process_and_respond,
                     [message, chatbot],
                     [message, chatbot],
                     queue=False)

Here, we can see the power of Gradio as we only needed a few lines of code to prepare a demo app.

STEP # 07: Now, let's launch our demo application using the launch() method:

    demo.launch(share=True, auth=("admin", "replacethis1234!"))

This will yield the following logs:

    Running on local URL:  http://127.0.0.1:7860
    Running on public URL: https://123456789012345.gradio.live

STEP # 08: Open the public URL in a new browser tab. This will load a login page which will require us to input the username and password before we are able to access the chatbot.

Image 12 — Login page

Specify admin and replacethis1234! in the login form to proceed.

STEP # 09: After signing in using the credentials, we'll be able to access a chat interface similar to what we have in Image 13. Here, we can try out various types of prompts.

Image 13 — The chatbot interface

Here, we have a Chat Message Box where we can input and run our different prompts on the left side of the screen. We would then see the current conversation on the right side.

STEP # 10: Click the first example "What is the meaning of life?".
This will auto-populate the text area similar to what we have in Image 14:

Image 14 — Using one of the examples to populate the Chat Message Box

STEP # 11: Click the Submit button afterwards. After a few seconds, we should get the following response in the chat box:

Image 15 — Response of the deployed model

Amazing, right? Here, we just asked the AI what the meaning of life is.

STEP # 12: Click the last example "Summarize the previous answer into at most 2 sentences". This will auto-populate the text area with the said example. Click the Submit button afterward.

Image 16 — Summarizing the previous answer into at most 2 sentences

Feel free to try other prompts. Note that we are not limited to the prompts available in the list of examples in the interface.

Important Note: Like other similar AI/ML solutions, there's the risk of hallucinations or the generation of misleading information. That said, it's critical that we exercise caution and validate the outputs produced by any Generative AI-powered system to ensure the accuracy of the results.

Section VI: Cleaning Up

We're not done yet! Cleaning up the resources we've created and launched is a very important step, as this will help us ensure that we don't pay for resources we're not planning to use.

STEP # 01: Once you're done trying out various types of prompts, feel free to turn off and clean up the resources launched and created using the following lines of code:

    demo.close()
    predictor.delete_endpoint()

STEP # 02: Make sure to turn off (or delete) the SageMaker Notebook instance as well. I'll leave this to you as an exercise!

Wasn't that easy?! As you can see, deploying LLMs with Amazon SageMaker is straightforward and easy. Given that Amazon SageMaker handles most of the heavy lifting to manage the infrastructure, we're able to focus more on the deployment of our machine learning model. We are just scratching the surface, as there is a long list of capabilities and features available in SageMaker. If you want to take things to the next level, feel free to read two of my books focusing heavily on SageMaker: "Machine Learning with Amazon SageMaker Cookbook" and "Machine Learning Engineering on AWS".

Author Bio

Joshua Arvin Lat is the Chief Technology Officer (CTO) of NuWorks Interactive Labs, Inc. He previously served as the CTO of three Australian-owned companies and also served as the Director for Software Development and Engineering for multiple e-commerce startups in the past. Years ago, he and his team won 1st place in a global cybersecurity competition with their published research paper. He is also an AWS Machine Learning Hero and he has been sharing his knowledge at several international conferences, discussing practical strategies on machine learning, engineering, security, and management. He is also the author of the books "Machine Learning with Amazon SageMaker Cookbook", "Machine Learning Engineering on AWS", and "Building and Automating Penetration Testing Labs in the Cloud". Due to his proven track record in leading digital transformation within organizations, he has been recognized as one of the prestigious Orange Boomerang: Digital Leader of the Year 2023 award winners.