How-To Tutorials

article-image-elevate-your-bi-dashboards-with-figma

28 Mar 2024

12 min read

Elevate Your BI Dashboards with Figma

28 Mar 2024

Subscribe to our BI Pro newsletter for the latest insights. Don't miss out – sign up today!Partnering with Figma Want to take your BI dashboards to the next level? Figma is the way to go! It's all about ramping up the design, making things work better, and giving your Power BI projects a real boost. With Figma, you'll speed up your projects, get more creative, and see better performance. So, why not give your reports a makeover with Figma? It's where design and data come together to make a big impact! Here's what Figma offers: ✅ Figma Professional: An all-in-one tool for seamless team collaboration. ✅ FigJam: Enables real-time teamwork and brainstorming. ✅ FigJam AI: Integrates ChatGPT for smarter collaboration. Guess what? You also have the Power BI UI Kit from the Figma Community! Sign Up Now! 👋 Hello,Welcome to BI-Pro #48, your ultimate guide to data and BI insights! 🚀In this issue: 🔮 Python Data Viz Matplotlib Data Visualization Seaborn: Visualizing Data in Python Use pandas for CSV Data Visualization Guides on SQL, Python, Data Cleaning, and Analysis Build An AI App with Python in 10 Steps ⚡ Industry Highlights Power BI Hybrid Workforce Experience Report Lakeview Dashboards Overview Grouping and Binning in Power BI Desktop Dashboards in Operations Manager Microsoft Fabric Analyze Dataverse Tables Bridging Fabric Lakehouses AWS Big Data Multicloud Analytics with Amazon Athena Analyze Fastly CDN Logs with QuickSight Google Cloud Data Spark Procedures in BigQuery Gemini Pro 1.0 in BigQuery via Vertex AI ✨ Expert Insights from Packt Community Unlocking the Secrets of Prompt Engineering 💡 BI Community Scoop Creating Interactive Power BI Dashboards Using Report Templates in Power BI Desktop 10 Analytics Dashboard Examples for SaaS Future of Data Storytelling: Actionable Intelligence Power BI: Transforming Banking Data Power BI vs Tableau vs Qlik Sense | 2024 Winner Get ready to supercharge your skills with BI-Pro! 🌟 📥 Feedback on the Weekly EditionTake our weekly survey and get a free PDF copy of our best-selling book, "Interactive Data Visualization with Python - Second Edition."📣 And here's the twist – we're tuning into YOUR frequency! Inspired by a reader's request, we're launching a column just for you. Got a burning question or a topic you're itching to dive into? Drop your suggestions in our content box – because your journey of discovery is our blueprint.We appreciate your input and hope you enjoy the book!Share your thoughts and opinions here! Cheers,Merlyn ShelleyEditor-in-Chief, PacktSign Up | Advertise | Archives🚀 GitHub's Most Sought-After Repos🌀 sdv-dev/SDV: The Synthetic Data Vault (SDV) is a Python library that creates tabular synthetic data by learning patterns from real data using machine learning algorithms. 🌀 hyperspy/hyperspy: HyperSpy is a Python library for analyzing multidimensional datasets, making it easy to apply analytical procedures and access tools. 🌀 hi-primus/optimus: Optimus is a Python library for loading, processing, plotting, and creating ML models that works with pandas, Dask, cuDF, dask-cuDF, Vaex, or Spark. It simplifies data processing and offers various functions for data quality, plotting, and cross-platform compatibility. 🌀 mingrammer/diagrams: Diagrams simplifies cloud system architecture design in Python, supporting major providers and tracking changes in version control. 🌀 kayak/pypika: PyPika simplifies building SQL queries in Python with a flexible, easy-to-use interface, leveraging the builder design pattern for clean, efficient queries. Email Forwarded? Join BI-Pro Here!Partnering with Webflow Transform your BI reporting with Webflow Enterprise. Create visually stunning, scalable websites without coding, using a visual canvas.Seamlessly integrate with popular BI platforms and let Webflow handle the code.Start building smarter, faster, and more reliable websites for your data-driven decisions today! Get Started for Free! 🔮 Data Viz with Python Libraries 🌀 Matplotlib Data Visualization in Python: This blog introduces Matplotlib, a Python library for 2D visualizations, covering its capabilities and plot types like line, scatter, bar, histograms, and pie charts. It highlights Matplotlib's versatility, customization, and integration with other libraries, making it essential for data science and research. 🌀 Visualizing Data in Python With Seaborn: This article introduces the seaborn library for statistical visualizations in Python. It covers creating various plots, such as bar, distribution, and relational plots, using seaborn's functional and objects interfaces. It emphasizes seaborn's clear and concise code for effective data visualization. 🌀 Use pandas to Visualize CSV Data in Python: This blog discusses using the CData Python Connector for CSV with pandas, Matplotlib, and SQLAlchemy to analyze and visualize live CSV data in Python. It highlights the ease of integration and superior performance of the connector, along with step-by-step instructions for connecting to CSV data, executing SQL queries, and visualizing the results in Python. 🌀 Collection of Guides on Mastering SQL, Python, Data Cleaning, Data Wrangling, and Exploratory Data Analysis: This guide is tailored for business intelligence professionals new to data science, offering step-by-step instructions on mastering SQL, Python, data cleaning, wrangling, and exploratory analysis. It emphasizes practical skills for extracting insights and showcases essential tools and techniques for effective data analysis. 🌀 Build An AI Application with Python in 10 Easy Steps: This blog outlines a 10-step guide to building and deploying AI applications with Python, covering objectives, data collection, model selection, training, evaluation, optimization, web app development, cloud deployment, and sharing the AI model, with practical advice for each step. ⚡Stay Informed with Industry HighlightsPower BI 🌀 Hybrid Workforce Experience Power BI report: This tutorial explains using the Power BI Hybrid Workforce Experience report to analyze the impact of hybrid work models on employees working onsite, remotely, or in a hybrid manner. It covers setup, key metrics analysis, and improving employee experience, with prerequisites outlined. 🌀 What are Lakeview dashboards? This article discusses Lakeview dashboards, designed for creating and sharing data visualizations within teams. It highlights their advanced features, comparison with Databricks SQL dashboards, and dataset optimizations for better performance, including handling various dataset sizes and query efficiency. 🌀 Use grouping and binning in Power BI Desktop: This article explains how to use grouping and binning in Power BI Desktop to refine data visualization. Grouping allows you to combine data points into larger categories for clearer analysis, while binning lets you define the size of data chunks for more meaningful visualization. The article provides step-by-step instructions for creating, editing, and applying groups and bins to numerical and time fields, enhancing the exploration of data and trends in visuals. 🌀 Dashboards in Operations Manager: This article covers dashboard templates and widgets in Operations Manager, outlining their layouts and functions. It highlights various dashboard types, such as Service Level, Summary, and Object State, each with specific widgets. Users can create, share, and view dashboards across different consoles. Microsoft Fabric🌀 Analyze Dataverse tables from Microsoft Fabric: The article announces new features for Dynamics 365 and Power Apps customers, allowing easy integration of insights into Fabric. Users can now create shortcuts to Dataverse environments in Fabric for quick data access and analysis across multiple environments, enhancing business insights. 🌀 Bridging Fabric Lakehouses: Delta Change Data Feed for Seamless ETL. This article explains using Delta Tables and the Delta Change Data Feed in Microsoft Fabric for efficient data synchronization across lakehouses. It highlights Delta Tables' features and demonstrates updating tables across Silver and Gold Lakehouses in a medallion architecture. AWS BI 🌀 Multicloud data lake analytics with Amazon Athena: This post discusses creating a unified query interface using Amazon Athena connectors to seamlessly query across multiple cloud data stores, simplifying analytics in organizations with data spread over different clouds. It also explores managing analytics costs using Athena workgroups and cost allocation tags. 🌀 How to Analyze Fastly Content Delivery Network Logs with Amazon QuickSight Powered by Generative BI? This post discusses using Fastly, a content delivery network (CDN), to enhance web performance and security. It highlights creating a dashboard with Amazon QuickSight for analyzing CDN logs, using AWS services like S3 and Glue for data storage and cataloging. Google Cloud Data 🌀 Apache Spark stored procedures in BigQuery are GA: BigQuery now supports Apache Spark stored procedures, enabling users to integrate Spark-based data processing with BigQuery's SQL capabilities. This simplifies using Spark within BigQuery, allowing seamless development, testing, and deployment of PySpark code, and installation of necessary packages in a unified environment. 🌀 Gemini Pro 1.0 available in BigQuery through Vertex AI: This post advocates for a unified platform to bridge data and AI teams, ensuring smooth workflows from data ingestion to ML training. It introduces BigQuery ML, enabling ML model creation, training, and execution in BigQuery using SQL. It supports various models, including Vertex AI-trained ones like PaLM 2 and Gemini Pro 1.0, and enables sharing trained models, promoting governed data usage and easy dataset discovery. Gemini Pro 1.0 integration into BigQuery via Vertex AI simplifies generative AI, enhancing collaboration, security, and governance in data workflows. ✨ Expert Insights from Packt CommunityUnlocking the Secrets of Prompt Engineering - By Gilbert Mizrahi Exploring LLM parameters LLMs such as OpenAI’s GPT-4 consist of several parameters that can be adjusted to control and fine-tune their behavior and performance. Understanding and manipulating these parameters can help users obtain more accurate, relevant, and contextually appropriate outputs. Some of the most important LLM parameters to consider are listed here: Model size: The size of an LLM typically refers to the number of neurons or parameters it has. Larger models can be more powerful and capable of generating more accurate and coherent responses. However, they might also require more computational resources and processing time. Users may need to balance the trade-off between model size and computational efficiency, depending on their specific requirements. Temperature: The temperature parameter controls the randomness of the output generated by the LLM. A higher temperature value (for example, 0.8) produces more diverse and creative responses, while a lower value (for example, 0.2) results in more focused and deterministic outputs. Adjusting the temperature can help users fine-tune the balance between creativity and consistency in the model’s responses. Top-k: The top-k parameter is another way to control the randomness and diversity of the LLM’s output. This parameter limits the model to consider only the top “k” most probable tokens for each step in generating the response. For example, if top-k is set to 5, the model will choose the next token from the five most likely options. By adjusting the top-k value, users can manage the trade-off between response diversity and coherence. A smaller top-k value generally results in more focused and deterministic outputs, while a larger top-k value allows for more diverse and creative responses. Max tokens: The max tokens parameter sets the maximum number of tokens (words or subwords) allowed in the generated output. By adjusting this parameter, users can control the length of the response provided by the LLM. Setting a lower max tokens value can help ensure concise answers, while a higher value allows for more detailed and elaborate responses. Prompt length: While not a direct parameter of the LLM, the length of the input prompt can influence the model’s performance. A longer, more detailed prompt can provide the LLM with more context and guidance, resulting in more accurate and relevant responses. However, users should be aware that very long prompts can consume a significant portion of the token limit, potentially truncating the model’s output. Discover more insights from 'Unlocking the Secrets of Prompt Engineering' by Gilbert Mizrahi. Unlock access to the full book and a wealth of other titles with a 7-day free trial in the Packt Library. Start exploring today! Read Here💡 What's the Latest Scoop from the BI Community? 🌀 Creating Interactive Power BI Dashboards That Engage Your Audience: This blog discusses the challenges faced by stakeholders and clients unfamiliar with using dashboards, preferring traditional tools like Excel. It emphasizes the importance of creating user-friendly and interactive dashboards to bridge this gap, offering techniques to enhance engagement and accessibility.🌀 Create and use report templates in Power BI Desktop: This tutorial explains how to create and use report templates in Power BI Desktop, enabling users to streamline report creation and standardize layouts, data models, and queries. Templates, saved with the .PBIT extension, help jump-start and share report creation processes across an organization. 🌀 10 Analytics Dashboard Examples to Gain Data Insights for SaaS: This article discusses the importance of analytics dashboards in simplifying the tracking of SaaS metrics and extracting insights. It provides 10 examples of analytics dashboards, including web, digital marketing, and user behavior, and highlights the top 5 analytics tools. The article emphasizes the need for clear, customizable, and intuitive dashboards for effective decision-making. 🌀 The Future of Data Storytelling: Actionable Intelligence [AI, Power BI, and Office]: This blog post discusses Zebra BI's solutions for reporting, planning, and presenting, emphasizing the importance of clarity, consistency, and actionability in data visualization. It introduces the concept of a reporting-planning-presenting cycle and highlights upcoming features and innovations, including the integration of AI. The post also mentions Zebra BI's adherence to the IBCS standard for clear and consistent business communication. 🌀 Power BI: Transforming Banking Data. This blog post discusses how Power BI can help banks analyze complex data for better decision-making. It covers challenges in banking, how Power BI integrates data sources, develops dashboards, and optimizes analytics. Benefits include improved operations, customer experience, risk management, and cost savings. 🌀 Power BI vs Tableau vs Qlik Sense | Which Wins In 2024? This blog compares Power BI, Tableau, and Qlik Sense for business intelligence (BI) and analytics. It highlights Power BI's advantages in data management, Tableau's strong visualization capabilities, and Qlik Sense's modern self-service platform. The article concludes with a comparison of features and recommendations for different needs. See you next time!Affiliate Disclosure: This newsletter contains affiliate links. If you buy through them, we may earn a small commission at no extra cost to you. This supports our work and helps us keep providing useful content. We only recommend products and services we think will benefit our readers. Thanks for your support!

0
0
1967

Merlyn Shelley

26 Mar 2024

14 min read

Transforming Web Data with Browse AI

Merlyn Shelley

26 Mar 2024

14 min read

Subscribe to our Data Pro newsletter for the latest insights. Don't miss out – sign up today!Partnering with Browse AI Turn Web Data into Your Business Superpower!👉 Train a robot in 2 minutes, no coding needed. 🤖 👉 Ideal for web scraping and data monitoring. 🌐 Here’s what you get: Monitor Websites for Changes ✅ Download Data from Any Website ✅ Turn Any Website into an API ✅ Product data extraction ✅ Also, extract data from news, stocks, jobs, social media, and more. Check out this 1-minute explainer video on how to extract data to Excel, Airtable, and connect to 5,000+ apps using Zapier! Start for free with up to 50 credits, and for a limited time, enjoy free setup and onboarding for Team and Company plans, saving up to 20% on Annual plans. Get Scraping Today!👋 Hello,Welcome to DataPro#85 – Your one-stop shop for the latest in Data Science and ML Algorithms! 🚀 In this issue:⚙️ Keeping Up with LLMs & GPTs Meet Devin: The pioneering AI software engineer. Google's Croissant: A fresh take on metadata for ML-ready datasets. INSTRUCTIR by Kaist AI: Setting new standards in instruction-following for information retrieval models. Spyx by Sussex AI: Turbocharging spiking neural networks with just-in-time compiled optimization. SynCode by VMware: Enhancing LLM code generation with a touch of grammar. Chatbot Arena: The ultimate battleground for evaluating LLMs by human preference. Apollo: Bringing medical AI to the masses with a multilingual medical LLM. ✨ On the RadarTop AI tools for code generation in 2024. Setting up a Pypi mirror in AWS with Terraform. Ensuring safer code changes with custom pre-commit hooks. Deciphering the AQLM Quantization Algorithm. AI's role in revolutionizing web browsing. Tackling tensors through three tricky errors. Running RStudio inside a container. Harnessing PyTorch and MLX for Apple Silicon. 🏭 Industry Highlights Google Research: Boosting LLMs with Cappy, evolving tables with Chain-of-table, and Scalable Instructable Multiworld Agent (SIMA). AWS: Streamlining code review with generative AI using Amazon Bedrock. OpenAI Updates: Leadership continuity and global news partnerships. 📚 New in Packt Library Practical Guide to Applied Conformal Prediction in Python by Valery Manokhin. DataPro Newsletter is not just a publication; it’s a comprehensive toolkit for anyone serious about mastering the ever-changing landscape of data and AI. Grab your copy and start transforming your data expertise today! 📥 Feedback on the Weekly EditionTake our weekly survey and get a free PDF copy of our best-selling book, "Interactive Data Visualization with Python - Second Edition."We appreciate your input and hope you enjoy the book!Share your Feedback!Cheers,Merlyn ShelleyEditor-in-Chief, PacktSign Up | Advertise | Archives🔰 GitHub Finds: Any of These Repos in Your Toolbox?🛠️ deepseek-ai/DeepSeek-VL: Open-source Vision-Language (VL) model for real-world tasks, handling logical diagrams, web pages, formulas, scientific literature, and more. 🛠️ OpenGVLab/VideoMamba: VideoMamba enhances 3D CNNs and video transformers, excelling in long-term video understanding with scalability and modality compatibility. 🛠️ showlab/DragAnything: DragAnything uses entity representation for motion control in video generation, offering user-friendly interaction and outperforming existing methods. 🛠️ pkunlp-icler/FastV: FastV accelerates large vision language models by pruning redundant visual tokens, achieving 45% FLOPs reduction without performance loss. 🛠️ cnulab/RealNet: RealNet introduces SDAS for anomaly strength control, AFS for feature selection, and RRS for anomaly region identification. Partnering with SurfsharkSurfshark is allowing our readers to enjoy a full 2 years of their award-winning VPN protection for 79% off, plus 2 months free. With Surfshark One, you get: Unlimited devices and connections ✅ One account for the entire household ✅ Your online activity, made safe, secure, and invisible ✅ Plus, identity protection, ad blocking, antivirus, and data breach monitoring.Claim your VPN protection today! 📚 Expert Insights from Packt CommunityPractical Guide to Applied Conformal Prediction in Python - By Valery Manokhin Basic components of a conformal predictor We will now look at the basic components of a conformal predictor: Nonconformity measure: The nonconformity measure is a function that evaluates how much a new data point differs from the existing data points. It compares the new observation to either the entire dataset (in the full transductive version of conformal prediction) or the calibration set (in the most popular variant – ICP. The selection of the nonconformity measure is based on a particular machine learning task, such as classification, regression, or time series forecasting, as well as the underlying model. This will examine several nonconformity measures suitable for classification and regression tasks. Calibration set: The calibration set is a portion of the dataset used to calculate nonconformity scores for the known data points. These scores are a reference for establishing prediction intervals or regions for new test data points. The calibration set should be a representative sample of the entire data distribution and is typically randomly selected. The calibration set should contain a sufficient number of data points (at least 500). If the dataset is small and insufficient to reserve enough data for the calibration set, the user should consider other variants of conformal prediction – including TCP (see, for example, Mastering Classical Transductive Conformal Prediction in Action – https://medium.com/@valeman/how-to-use-full-transductive-conformal-prediction-7ed54dc6b72b). Test set: The test set contains new data points for generating predictions. For every data point in the test set, the conformal prediction model calculates a nonconformity score using the nonconformity measure and compares it to the scores from the calibration set. Using this comparison, the conformal predictor generates a prediction region that includes the target value with a user-defined confidence level. All these components work in tandem to create a conformal prediction framework that facilitates valid and efficient uncertainty quantification in a wide range of machine learning tasks. Discover more insights from 'Practical Guide to Applied Conformal Prediction in Python' by Valery Manokhin. Unlock access to the full book and a wealth of other titles with a 7-day free trial in the Packt Library. Start exploring today! Read Here!⚡ Tech Tidbits: Stay Wired to the Latest Industry Buzz! AWS ML Made Easy 🌀 Enhance code review and approval efficiency with generative AI using Amazon Bedrock: This post discusses the challenges faced by managers in overseeing code review and approval processes in software development, such as lack of technical expertise, time constraints, volume of change requests, manual effort, and the need for documentation. It also introduces a solution that leverages generative artificial intelligence and integrates it with AWS deployment tools to streamline the review and approval process. The solution includes automated change analysis, summarization, and an approval workflow. Google Research 🌀 Cappy: Outperforming and boosting large multi-task language models with a small scorer. This blog discusses advancements in large language models (LLMs) and their use in natural language processing (NLP). It introduces the concept of multi-task LLMs, such as T0, FLAN, and OPT-IML, which excel at understanding and solving various tasks. It also presents a new approach called Cappy, a lightweight pre-trained scorer that enhances the performance and efficiency of multi-task LLMs. 🌀 Chain-of-table: Evolving tables in the reasoning chain for table understanding. This research focuses on improving how large language models (LLMs) reason over tabular data, which is challenging due to the structured nature of tables. The proposed framework, Chain-of-Table, trains LLMs to iteratively update tables, mimicking human reasoning, resulting in improved performance on table understanding tasks. 🌀 Talk like a graph: Encoding graphs for large language models. This research explores how to teach large language models (LLMs) to reason with graph information, crucial for understanding interconnected data. They introduce GraphQA, a benchmark to evaluate LLMs on graph problems, revealing insights into effective graph encoding methods and improving LLM performance on graph tasks by up to 60%. 🌀 Scalable Instructable Multiworld Agent (SIMA): A generalist AI agent for 3D virtual environments. Google DeepMind has developed SIMA, a versatile AI agent trained on multiple video games to follow natural-language instructions, akin to human behavior. Collaborating with game studios, SIMA navigates various environments, showcasing potential for AI to understand and execute diverse tasks. OpenAI Updates 🌀 Review completed & Altman, Brockman to continue to lead OpenAI: The OpenAI Board completed a review by WilmerHale, expressing full confidence in Sam Altman and Greg Brockman's leadership. They also elected new board members and adopted governance enhancements. WilmerHale's review found a breakdown in trust between the prior Board and Mr. Altman, leading to his removal, but concluded that his conduct did not mandate removal. Following the review, the Board endorsed the decision to rehire Mr. Altman and Mr. Brockman. 🌀 Global news partnerships: Le Monde and Prisa Media: OpenAI has partnered with Le Monde and Prisa Media to bring French and Spanish news content to ChatGPT. This partnership aims to enhance user interaction with news content and contribute to the training of OpenAI's models. Through these partnerships, users will access summaries and links to original articles, expanding their news consumption experience. This collaboration supports the news industry and its role in providing reliable information globally. Email Forwarded? Join DataPro Here!🔍 From Bits to BERT: Keeping Up with LLMs & GPTs 🌀 Introducing Devin, the first AI software engineer: Meet Devin, the autonomous AI software engineer, skilled in long-term reasoning and planning. Devin can learn new technologies, build and deploy apps, find and fix bugs, train AI models, and contribute to open source. Devin excels in resolving real-world GitHub issues, outperforming previous models. Cognition, the AI lab behind Devin, aims to unlock new possibilities beyond coding. 🌀 Google’s Croissant: a metadata format for ML-ready datasets. Croissant is a new metadata format for ML datasets, aiming to simplify the use of existing datasets for training ML models. It standardizes dataset descriptions and organization, supporting responsible AI practices. Croissant builds upon schema.org and is supported by major tools and repositories like Kaggle, Hugging Face, and OpenML. It includes a specification, example datasets, a Python library, and a visual editor to facilitate dataset usage and publication. 🌀 Kaist AI’s INSTRUCTIR: A Benchmark for Instruction Following of Information Retrieval Models. This research focuses on enhancing search accuracy by improving retrievers to understand users' intentions, similar to language models. It introduces INSTRUCTIR, a benchmark for evaluating retrievers' ability to follow user-aligned instructions in retrieval tasks. The study addresses limitations in existing benchmarks and highlights potential overfitting issues in instruction-aware retrieval datasets. 🌀 Sussex AI’s Spyx: A Library for Just-In-Time Compiled Optimization of Spiking Neural Networks. Advancements in large neural architectures have led to powerful AI accelerators for training deep neural networks. However, these networks often incur high costs. Neuromorphic computing with Spiking Neural Networks (SNNs) offers energy-efficient alternatives, but training SNNs is challenging. Spyx, a new lightweight SNN simulation and optimization library designed in JAX, aims to facilitate SNN architecture investigation by bridging Python-based deep learning frameworks with custom compute kernels, achieving optimal hardware utilization. 🌀 VMware’s SynCode: Improving LLM Code Generation with Grammar Augmentation. SynCode is a novel framework for efficient syntactical decoding of code with large language models (LLMs). It leverages grammar of a programming language using an offline-constructed efficient lookup table called Deterministic Finite Automaton (DFA) mask store. SynCode seamlessly integrates with any context-free grammar (CFG) defined language, reducing syntax errors by 96.07% when combined with LLMs. 🌀 Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference. Chatbot Arena is an open platform designed to evaluate Large Language Models (LLMs) by considering human preferences. Utilizing a pairwise comparison method and crowdsourced input, it assesses LLMs' alignment with user preferences. The platform, operational for months with over 240K votes, provides a credible and valuable resource for ranking LLMs. Check out the tool here. 🌀 Apollo: A Lightweight Multilingual Medical LLM towards Democratizing Medical AI to 6B People. The project aims to develop medical Large Language Models (LLMs) in the six most spoken languages, benefiting 6.1 billion people. This includes creating the ApolloCorpora multilingual medical dataset and the XMedBench benchmark, with Apollo models achieving top performance among models of similar sizes. The project will open-source training data, code, model weights, and evaluation benchmarks. You can check for the demo here. ✨ On the Radar: Catch Up on What's Fresh🌀 Top Artificial Intelligence (AI) Tools That Can Generate Code To Help Programmers (2024): The article discusses how AI is changing programming, with tools like OpenAI Codex and GitHub Copilot generating code. It explores AI's impact on code quality and development speed, showcasing various AI-powered tools like Tabnine, CodeT5, and Polycoder. Additionally, it mentions AI tools for code review, static code analysis, and AI-assisted coding in IDEs like PyCharm and Visual Studio. 🌀 Pypi mirror in a private AWS environment Terraform: This article explains how to install Python packages in an AWS Sagemaker Studio environment without internet access. It covers setting up Sagemaker in VPC Only mode, using VPC Endpoint interfaces for network communications, and accessing the Pypi package repository through AWS Codeartifact, which allows defining Pypi as an upstream repository. 🌀 Custom pre-commit hooks for safer code changes: This blog post explains the importance of using pre-commit hooks in software development, particularly with the git version control system. It discusses the challenges of maintaining coding standards in collaborative projects and provides a step-by-step tutorial on how to set up and use custom pre-commit hooks for a Python project, using the example of validating dataflow definitions for the Hamilton library. 🌀 AQLM Quantization Algorithm, explained: A new quantization algorithm, AQLM (Additive Quantization of Language Models), was recently released and integrated into HuggingFace Transformers and HuggingFace PEFT. AQLM sets a new state-of-the-art for 2-bit quantization while providing improvements for 3-bit and 4-bit ranges, pushing the boundaries of model accuracy and memory footprint. 🌀 Revolutionize Web Browsing with AI: This article explores creating an AI agent using the gpt-4-vision-preview model from OpenAI, enabling it to navigate the web like a human. It discusses the agent's browser control, content browsing, and decision-making processes, showcasing potential use cases such as aiding visually challenged users and automating web browsing tasks. 🌀 Understanding Tensors: Learning a Data Structure Through 3 Pesky Errors. This article discusses transitioning from managing tabular data to working with tensors in TensorFlow, offering debugging tips and code recipes. It covers visualizing TensorFlow datasets, understanding tensor specs, and augmenting model summaries, while addressing common errors related to tensor rank and shape. 🌀 Running RStudio Inside a Container: This tutorial focuses on setting up RStudio using Docker, particularly leveraging the Rocker RStudio image. It covers pulling the image, launching RStudio in a container, and ensuring persistence of data by using volume mapping. The tutorial provides step-by-step instructions and explanations for each stage. 🌀 PyTorch and MLX for Apple Silicon: The blog discusses Apple's MLX framework, which is optimized for Apple Silicon and serves as a bridge between PyTorch, NumPy, and Jax. It details a comparison between MLX and PyTorch through a custom convolutional neural network implementation for image classification tasks. The discussion includes insights into MLX's features, such as its array class, lazy computation, and compilation for performance optimization. The post also highlights the ease of converting PyTorch code to MLX, despite some differences in API compatibility and coding conventions. See you next time!Affiliate Disclosure: This newsletter contains affiliate links. If you buy through them, we may earn a small commission at no extra cost to you. This supports our work and helps us keep providing useful content. We only recommend products and services we think will benefit our readers. Thanks for your support!

0
0
752

article-image-chatgpt-for-data-governance

Jyoti Pathak

22 Mar 2024

11 min read

ChatGPT for Data Governance

Jyoti Pathak

22 Mar 2024

11 min read

0
0
7555

article-image-ai-distilled-39-unpacking-mistral-large-googles-gemini-challenges-and-copilot-enterprise

Kartikey Pandey

21 Mar 2024

9 min read

AI_Distilled #39: Unpacking Mistral Large, Google's Gemini Challenges, and Copilot Enterprise

Kartikey Pandey

21 Mar 2024

9 min read

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!Print to Pixel: Optimize your learning experience with PacktSeveral research studies have proven that printed books enhance comprehension, with the tactile experience of flipping pages and annotating the margins adding depth to the learning experience. However, developers can't overlook the practical benefits of eBooks, such as quickly finding relevant information or carrying an entire library on a single device.Acknowledging the unique benefits of both formats, Packt is offering a 40% discount on all print books, plus a free eBook version of each purchase, from February 26th to February 29th.Here’s what’s included:A Vast Library: Enjoy 40% off on over 5,000 titles spanning topics from Cybersecurity to Generative AI.Complimentary eBook: Each print book purchase includes a free eBook.AI Assistant: Top 500 books come with a personalized AI that can simply complex topics to your learning style, offering an interactive learning experience.Start Building Your Tech Library Today!👋 Hello,“No Al is perfect, especially at this emerging stage of the industry’s development, but we know the bar is high for us and we will keep at it for however long it takes.”-Sundar Pichai, Google CEOPichai acknowledges problems with Gemini AI, stressing the importance of unbiased information for users, and outlining steps to address issues and improve products. A rapidly progressing industry, AI development is a tricky game to master, with numerous pitfalls along the way.Greetings readers! Our mission is to help you stay on top of the ever-changing AI landscape so you can advance your skills. Let’s get started with the latest news and developments across the AI field:Microsoft provides new LLM Mistral Large on Azure with Mistral AIGoogle accepts some responses from their Gemini were unacceptable and biasedGitHub has launched Copilot Enterprise coding assistant integrating throughout the software development processResearchers developed new optimized language models called MobileLLM for mobile devices with under a billion parametersResearchers at Microsoft have developed new techniques to improve visual language modelsWe’ve also got you your fresh dose of GPT and LLM secret knowledge and tutorials:Mastering the Art of Prompt CraftingBreaking Down How Large Language Models LearnUsing AI to Level Up Live GamesMonitoring Large Language Models on AWSLast but not least, don’t miss out on the hands-on strategies and tips straight from the AI community for you to use on your own projects:Fine-Tuning Models for Speech Recognition Made SimpleMake Conversation Come Alive - Deploying Your Own AI Chat PartnerCombining Geospatial and Semantic Data to Build Powerful Search ToolsLeveraging Notion, Supabase and AI for Knowledge RetrievalWriter’s Credit: Special shout-out to Vidhu Jain for her valuable contribution to this week’s issue.Cheers, Kartikey Pandey Editor-in-Chief, Packt Unleash Your Data Potential with Packt's Latest Titles and Platform Enhancements! In a world that's always changing, learning is key to success. At Packt, we've updated our learning platform to help you stay ahead in the fast-moving tech world. Our platform makes learning easier and more effective, helping you overcome challenges and achieve your goals. Boost Your Data Skills with Packt's DataPro Library: On-Demand Learning: Access a wide range of books, video courses, research papers, and articles to help you grow. AI Assistance: Get help from AI to understand complex concepts easily, all within the same learning environment.Personalized Dashboard: Enjoy a tailored learning experience with recommendations and insights just for you. Advanced Self-Assessment: Use the latest tools to identify what you need to learn and track your progress accurately. Vibrant Community: Join a community of data and AI enthusiasts on Discord for collaboration and knowledge sharing. Exclusive Access: Be part of the DataPro beta program for a chance to win Amazon gift cards and early access to new features. Value for Money: Get all these benefits for just $7.99 per month, a small investment for big gains in your careerEnhance Your Data Skills Today⚡ TechWave: AI/GPT News & AnalysisMicrosoft has partnered with Mistral AI to provide their new LLM Mistral Large on Azure cloud services. This state-of-the-art AI model offers advanced NLP capabilities. Several companies have praised Mistral Large's performance in increasing productivity and aiding innovation.Google's CEO recently said some responses from their AI model Gemini were unacceptable and biased. The company has been working to address these issues and sees improvements but will review what happened. They plan to relaunch Gemini in the coming weeks after fixing it.GitHub has launched Copilot Enterprise, an AI coding assistant that integrates throughout the software development process. It provides customized code suggestions based on an organization's codebase, answers questions about internal systems, and generates summaries of code changes. Early testing found massive productivity gains from such AI tools.Researchers have developed new optimized language models for mobile devices with under a billion parameters. Called MobileLLM, the models achieve higher accuracy than previous smaller models through innovative architecture and weight-sharing techniques. MobileLLM shows significant gains on conversation tasks and competes with much larger models for common on-device uses.Researchers at Microsoft have developed new techniques to improve visual language models using structured knowledge graphs. By incorporating relationship maps between image elements like objects and attributes, models can generate richer images from text descriptions. Hierarchical prompting and dual-path encoding methods were also introduced to help models better understand complex language.🌟 Secret Knowledge: AI/LLM Resources🌀 Mastering the Art of Prompt Crafting: Got a new NLP project that needs prompting? This guide covers the basics of effective prompt engineering for AI models like ChatGPT. Learn how clarity, conciseness, and context can improve responses. Also explore techniques like zero-shot learning and dynamic few shots, plus how temperature, top-p, and other settings can refine your model's "personality". From system messages to tailoring examples, these tips will help you leverage your LLMs' full potential.🌀 Breaking Down How Large Language Models Learn: This article provides a helpful breakdown of how LLMs are trained through causal language modeling and calculates loss. It visually explains how models generate text sequences, are pre-trained to predict the next token, and how cross-entropy loss compares predictions to true labels to update weights. The process is demonstrated through code showing how loss is manually calculated for an LLM matching the framework's automatic calculation. This gives developers valuable insights into how state-of-the-art models learn.🌀 Using AI to Level Up Live Games: This article discusses how generative AI can enhance live service games. Techniques like adaptive gameplay, personalized ads, and faster asset creation are described. The authors provide a framework for developing games using tools like Unity, GKE, and Vertex AI. They demonstrate how ML models can dynamically generate images, code and dialogue to customize the player experience. Whether deploying models on GKE or Vertex, cloud-based AI brings the benefits of lower costs and easier maintenance than self-hosted options. 🌀 Monitoring Large Language Models on AWS: As AI language models grow more advanced, ensuring they behave properly becomes more important. This article discusses techniques for monitoring LLMs deployed on AWS. Key metrics covered include semantic similarity of responses, sentiment analysis, refusal rates, and more. The proposed architecture takes in model outputs, runs metrics modules, and reports results to CloudWatch for aggregation and alerts. With the right monitoring in place, you can help keep your conversational AI acting as intended.🔛 Masterclass: AI/LLM Tutorials🌀 Fine-Tuning Models for Speech Recognition Made Simple: This article discusses how to fine-tune LLMs for automatic speech recognition tasks using Amazon SageMaker. It explains language models and ASR as well as the basic steps for fine-tuning a pre-trained model which includes preparing data, choosing a model, training, evaluating, and deploying. SageMaker is highlighted as a powerful yet easy-to-use platform for this process due to its scalability, integration with AWS services, and pay-as-you-go pricing.🌀 Make Conversation Come Alive - Deploying Your Own AI Chat Partner: Tired of boring chatbots? This guide shows you how to bring the amazing Qwen AI model to your own server so you can have engaging discussions on any topic. The steps cover setting up your environment, installing dependencies, initializing the tokenizer and model, and using history to keep conversations flowing naturally. Once complete, you'll have a powerful AI assistant right at your fingertips. Best of all, it's completely open source.🌀 Combining Geospatial and Semantic Data to Build Powerful Search Tools: This guide shows developers how to create an interactive campground search map using vector databases, NLP models, and geospatial data. Technologies like Qdrant, Llama2, and Streamlit allow embedding text and locations to enable semantic queries. The page explains setting up Qdrant cloud, loading campground CSV data, and parsing text into nodes. Developers can then embed nodes with HuggingFace and query the vector store to retrieve similar results. By leveraging tools that understand both spatial and semantic context, you can build customized applications to help users explore outdoor destinations.🌀 Leveraging Notion, Supabase, and AI for Knowledge Retrieval: This tutorial shows how you can build a knowledge base by extracting data from Notion databases and storing it in a vector format in Supabase. It then demonstrates retrieving relevant information from the knowledge base using an AI model from OpenAI. By combining these tools, developers can query custom datasets and generate responses based on retrieved documents. The process involves loading Notion documents, storing embeddings in Supabase, and setting up a retrieval pipeline. With some enhancements, this could be a powerful way to access organizational information.🚀 HackHub: Trending AI Tools🌀 lucky-lance/expert_sparsity: Implements efficient expert pruning and dynamic skipping techniques for mixture-of-experts large language models to improve their efficiency and speed while maintaining strong performance.🌀 facebookresearch/pearl: This open-source library provides a modular reinforcement learning framework for building and training production-ready AI agents, empowering developers with state-of-the-art techniques.🌀 zhen-tan-dmml/llm4annotation: Curates papers on using LLMs for data annotation, which developers could reference to apply these techniques or learn about the current state of the art.🌀 google/gemma.cpp: Provides a lightweight C++ library for running Google's Gemma models that developers can easily integrate into their own projects for experimenting with and deploying LLMs.

0
0
2571

article-image-get-started-with-fabric-create-your-workspace-reports

Arshad Ali, Bradley Schacht

14 Mar 2024

9 min read

Get Started with Fabric: Create Your Workspace & Reports

Arshad Ali, Bradley Schacht

14 Mar 2024

9 min read

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!This article is an excerpt from the book, Learn Microsoft Fabric, by Arshad Ali, Bradley Schacht. Harness the power of Microsoft Fabric to develop data analytics solutions for various use cases guided by step-by-step instructionsIntroductionEmbark on a journey to harness the full potential of Microsoft Fabric within Power BI. This article serves as your comprehensive guide, walking you through the essential steps to create your first Fabric workspace seamlessly. From understanding the fundamentals to practical implementation, we'll equip you with the knowledge and tools needed to optimize your data management and reporting processes. Get ready to elevate your Power BI experience and unlock new possibilities with Fabric-enabled workspaces.Creating your first Fabric-enabled workspaceOnce you have confirmed that Fabric is enabled in your tenant and you have access to it, the next step is to create your Fabric workspace. You can think of a Fabric workspace as a logical container that will contain items such as lakehouses, warehouses, notebooks, and pipelines. Follow these steps to create your first Fabric workspace:1. Sign into Power BI (https://app.powerbi.com/).2. Select Workspaces | + New workspace:Figure 2.5 – Creating a new workspace3. Fill out the Create a workspace form, as follows:Name: Enter Learn Microsoft Fabric and some characters for uniqueness.Description: Optionally, enter a description for the workspace:Figure 2.6 – Create a workspace – detailsAdvanced: Select Fabric capacity under License mode and then choose a capacity you have access to. If not, you can start a trial license, as described earlier, and use it here.4. Select Apply. Th e workspace will be created and opened.5. You can click on Workspaces again and then search for your workspace by typing its name in the search box. You can also pin the selected workspace so that it always appears at the top:Figure 2.7 – Searching for a workspace6. Clicking on the name of the workspace will open that workspace. A link to it will become available in the left-hand side navigation bar, allowing you to switch from one item to another quickly. Since we haven’t created anything yet, there is nothing here. You can click on +New to start creating Fabric items:Figure 2.8 – Switching to a workspaceWith a Microsoft Fabric workspace set up, let’s review the different workloads that are available.Copilot in Power BIPower BI has several key components, including data transformation and data modeling, culminating in a visual report that end users will consume. The Copilot experience is centered around the visual storytelling and reporting aspects of Power BI. This materializes in three ways: report page creation, narrative generation, and improving Q&A.Let’s look at each of these Copilot capabilities.Creating reports with the Power BI CopilotThe most common use for Copilot with Power BI is likely to be for creating reports. There are two features that come together to build reports. The first analyzes the dataset to suggest content for your report by using table relationships and column names, while the second one helps you create intuitive reports quickly. Figure 11.30 shows an example where Copilot has suggested several report pages, each with a short description of what would be displayed:Figure 11.30 – The Power BI Copilot page suggestionsIf you like the page suggestions, simply click on the Create button and the report page will appear.While a suggested set of report content is a good starting point, analysts often have a specific need to meet. You can have Copilot create a report from the criteria you provide using prompts as well. These can be as simple as “create a page that shows customer analysis” or more specific, such as “create a page to show the impact of each sales territory on profit and quantity sold.”Figure 11.31 – Sales impact report created by CopilotOnce the report page is generated, Copilot cannot update the report, but you can interact with and modify the report as necessary. This is a great way to reduce the time to get started building reports.A couple of other important things to note are that in addition to not being able to modify reports, Copilot will not allow you to specify specific visual types, apply filters, or change the report layout. All of these can be changed manually after the initial report generation. It is worth noting that users should not expect Copilot to filter results to a specific time period based on their prompt as an example.Next, let’s look at the smart narrative.Creating a narrative using CopilotVisuals are a wonderful way to tell a story and give users the ability to explore data on their own. However, sometimes a narrative that summarizes what is being displayed in a report can be useful. It can not only tell a story but also provide some additional context and information for users.To get started, open a report and add a narrative visualization to the report as shown in Figure 11.32. You will see two options; click on Copilot. Choose the type of summary you wish to produce and optionally select specific pages or visuals to include in the summary. Then click on Create.Figure 11.32 – The report narrative generated by CopilotAfter the narrative is generated, remember to always review the narrative for accuracy and adjust the prompt, if necessary, to produce more accurate results. In addition to summaries, you can ask it to highlight key information, customize the order in which the data is described to help convey importance, specify specific data points to include in the summary, and even generate impact analysis showing how different factors affect metrics on the report.Report, page, and visual narratives are a great way to guide users through a report, especially if there isn’t a subject matter expert there to explain all the data.Finally, let’s look at using Copilot to improve the Q&A visual.Generating synonyms with CopilotThe Q&A visual has been dazzling users for years at this point. It is impressive to build a model, walk into the room, and tell users that they can use natural language to query their data without needing to build any visuals. This may not be as impressive as the Copilot functionality that we have today, but it is still a very useful tool in your Power BI visualization toolbelt.One piece of important information for the success of Q&A is something called a synonym. These are end-user-specific ways to reference data. For example, a table in the data model may be called Dim Person, but you know that some report consumers always refer to these as “users.” Therefore, you would create a synonym that tells Q&A that when someone asks about users, they are really talking about persons. This can also be done on a column level. A synonym for “postal code” could be “zip code,” while a synonym for an “item” could be “product” or “finished good.”Q&A itself may not use Copilot, but Power BI Desktop can leverage Copilot to generate synonyms. This can be done when creating a new Q&A visual by clicking on Add synonyms from the ribbon with the label Improve Q&A with synonyms from Copilot. They can also be generated from the Q&A settings menu by adding Copilot as a source from the Suggestion settings list.The more synonyms that can be used to describe your data, the more likely you are to produce quality Q&A results. It is important to double-check the synonyms generated by Copilot to ensure they line up with your specific business terminology.With these Copilot experiences for Power BI, you will be able to generate report ideas, report pages and visuals, summaries, and narratives, and improve Q&A.ConclusionIn conclusion, by mastering the creation of Fabric workspaces in Power BI, you've laid a solid foundation for efficient data management and reporting. With Fabric's capabilities at your fingertips, you're equipped to streamline workflows, generate insightful reports, and enhance collaboration within your organization. Keep exploring the diverse functionalities of Fabric to continuously refine your Power BI experience and stay ahead in the realm of data analytics.Author bioArshad Ali is a principal product manager at Microsoft, working on the Microsoft Fabric product team in Redmond, WA. He focuses on Spark Runtime, which empowers both data engineering and data science experiences. In his previous role, he helped strategic customers and partners adopt Azure Synapse and Microsoft Fabric.Arshad has more than 20 years of industry experience and has been with Microsoft for over 16 years. He is the co-author of the book Big Data Analytics with Azure HDInsight and the author of over 200 technical articles and blogs on data and analytics. Arshad holds an MBA from the Foster School of Business at the University of Washington and an MCA from India.Bradley Schacht is a principal program manager on the Microsoft Fabric product team based in Saint Augustine, Florida. Bradley is a former consultant and trainer and has co-authored five books on SQL Server and Power BI. As a member of the Microsoft Fabric product team, Bradley works directly with customers to solve some of their most complex data problems and helps shape the future of Microsoft Fabric. Bradley gives back to the community by speaking at events, such as the PASS Summit, SQL Saturday, Code Camp, and user groups across the country, including locally at the Jacksonville SQL Server User Group (JSSUG). He is a contributor on SQLServerCentral and blogs on his personal site, BradleySchacht.

0
0
672

article-image-enhancing-image-search-with-vector-similarity

Bahaaldine Azarmi, Jeff Vestal

12 Mar 2024

12 min read

Enhancing Image Search with Vector Similarity

Bahaaldine Azarmi, Jeff Vestal

12 Mar 2024

12 min read

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!This article is an excerpt from the book, Vector Search for Practitioners with Elastic, by Bahaaldine Azarmi and Jeff Vestal. Optimize your search capabilities in Elastic by operationalizing and fine-tuning vector search and enhance your search relevance while improving overall search performanceIntroductionVector similarity search plays a crucial role in image search. After images are transformed into vectors, a search query (also represented as a vector) is compared against the database of image vectors to find the most similar matches. This process is known as k-Nearest Neighbor (kNN) search, where “k” represents the number of similar items to retrieve.Several algorithms can be used for kNN search, including brute-force search and more efficient methods such as the Hierarchical Navigable Small World (HNSW) algorithm (see Chapter 7, Next Generation of Observability Powered, by Vectors for a more in-depth discussion on HNSW). Bruteforce search involves comparing the query vector with every vector in the database, which can be computationally expensive for large databases. On the other hand, HNSW is an optimized algorithm that can quickly find the nearest neighbors in a large-scale database, making it particularly useful for vector similarity search in image search systems.The tangible benefits of image search are observed across industries. Its flexibility and adaptability make it a tool of choice for enhancing user experiences, ensuring digital security, or even revolutionizing digital content interactions.Image search in practiceApplications of image search are varied and far-reaching. In e-commerce, for example, reverse image search allows customers to upload a photo of a product and find similar items for sale. In the field of digital forensics, image search can be used to find visually similar images across a database to detect illicit content. It is also used in the realm of social media for face recognition, image tagging, and content recommendation.As we continue to generate and share more visual content, the need for effective and efficient image search technology will only grow. The combination of artificial intelligence, machine learning, and vector similarity search provides a powerful toolkit to meet this demand, powering a new generation of image search capabilities that can analyze and understand visual content.Traditionally, image search engines use text-based metadata associated with images, such as the image’s filename, alt text, and surrounding text context, to understand the content of an image. This approach, however, is limited by the accuracy and completeness of the metadata, and it fails to analyze the actual visual content of the image itself.Over time, with advancements in artificial intelligence and machine learning, more sophisticated methods of image search have been developed that can analyze the visual content of images directly. This technique, known as content-based image retrieval (CBIR), involves extracting feature vectors from images and using these vectors to find visually similar images.Feature vectors are a numerical representation of an image’s visual content. They are generated by applying a feature extraction algorithm to the image. The specifics of the feature extraction process can vary, but in general, it involves analyzing the image’s colors, textures, and shapes. In recent years, CNNs have become a popular tool for feature extraction due to their ability to capture complex patterns in image data.Once feature vectors have been extracted from a set of images, these vectors can be indexed in a database. When a new query image is submitted, its feature vector is compared to the indexed vectors, and the images with the most similar vectors are returned as the search results. The similarity between vectors is typically measured using distance metrics such as Euclidean distance or cosine similarity.Despite the impressive capabilities of CBIR systems, there are several challenges in implementing them. For instance, interpreting and understanding the semantic meaning of images is a complex task due to the subjective nature of visual perception. Furthermore, the high dimensionality of image data can make the search process computationally expensive, particularly for large databases.To address these challenges, approximate nearest neighbor (ANN) search algorithms, such as the HNSW graph, are often used to optimize the search process. These algorithms sacrifice a small amount of accuracy for a significant increase in search speed, making them a practical choice for large-scale image search applications.With the advent of Elasticsearch’s dense vector field type, it is now possible to index and search highdimensional vectors directly within an Elasticsearch cluster. This functionality, combined with an appropriate feature extraction model, provides a powerful toolset for building efficient and scalable image search systems.In the following sections, we will delve into the details of image feature extraction, vector indexing, and search techniques. We will also demonstrate how to implement an image search system using Elasticsearch and a pre-trained CNN model for feature extraction. The overarching goal is to provide a comprehensive guide for building and optimizing image search systems using state-of-the-art technology.Vector search with imagesVector search is a transformative feature of Elasticsearch and other vector stores that enables a method for performing searches within complex data types such as images. Through this approach, images are converted into vectors that can be indexed, searched, and compared against each other, revolutionizing the way we can retrieve and analyze image data. This inherent characteristic of producing embeddings applies to other media types as well. This section provides an in-depth overview of the vector search process with images, including image vectorization, vector indexing in Elasticsearch, kNN search, vector similarity metrics, and fine-tuning the kNN algorithm.Image vectorizationThe first phase of the vector search process involves transforming the image data into a vector, a process known as image vectorization. Deep learning models, specifically CNNs, are typically employed for this task. CNNs are designed to understand and capture the intricate features of an image, such as color distribution, shapes, textures, and patterns. By processing an image through layers of convolutional, pooling, and fully connected nodes, a CNN can represent an image as a high-dimensional vector. This vector encapsulates the key features of the image, serving as its numerical representation.The output layer of a pre-trained CNN (often referred to as an embedding or feature vector) is often used for this purpose. Each dimension in this vector represents some learned feature from the image. For instance, one dimension might correspond to the presence of a particular color or texture pattern.The values in the vector quantify the extent to which these features are present in the image.Figure 1 : Layers of a CNN modelAs seen in the preceding diagram, these are the layers of a CNN model:1. Accepts raw pixel values of the image as input.2. Each layer extracts specific features such as edges, corners, textures, and so on.3. Introduces non-linearity, learns from errors, and approximates more complex functions.4. Reduces the dimensions of feature maps through down-sampling to decrease the computational complexity.5. Consists of the weights and biases from the previous layers for the classification process to take place.6. Outputs a probability distribution over classes.Indexing image vectors in ElasticsearchOnce the image vectors have been obtained, the next step is to index these vectors in Elasticsearch for future searching. Elasticsearch provides a special field type, the dense_vector field, to handle the storage of these high-dimensional vectors.A dense_vector field is defined as an array of numeric values, typically floating-point numbers, with a specified number of dimensions (dims). The maximum number of dimensions allowed for indexed vectors is currently 2,048, though this may be further increased in the future. It’s essential to note that each dense_vector field is single-valued, meaning that it is not possible to store multiple values in one such field.In the context of image search, each image (now represented as a vector) is indexed into an Elasticsearch document. This vector can be one per document or multiple vectors per document. The vector representing the image is stored in a dense_vector field within the document. Additionally, other relevant information or metadata about the image can be stored in other fields within the same document.The full example code can be found in the Jupyter Notebook available in the chapter 5 folder of this book’s GitHub repository at https://github.com/PacktPublishing/VectorSearch-for-Practitioners-with-Elastic/tree/main/chapter5, but we’ll discuss the relevant parts here.First, we will initialize a pre-trained model using the SentenceTransformer library.The clip-ViT-B-32-multilingual-v1 model is discussed in detail later in this chapter:model = SentenceTransformer('clip-ViT-B-32-multilingual-v1')Next, we will prepare the image transformation function:transform = transforms.Compose([ transforms.Resize(224), transforms.CenterCrop(224), lambda image: image.convert("RGB"), transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)), ])Transforms.Compose() combines all the following transformations:transforms.Resize(224): Resizes the shorter side of the image to 224 pixels while maintaining the aspect ratio.transforms.CenterCrop(224): Crops the center of the image so that the resultant image has dimensions of 224x224 pixels.lambda image: image.convert("RGB"): This is a transformation that converts the image to the RGB format. This is useful for grayscale images or images with an alpha channel, as deep learning models typically expect RGB inputs.transforms.ToTensor(): Converts the image (in the PIL image format) into a PyTorch tensor. This will change the data from a range of [0, 255] in the PIL image format to a float in a range [0.0, 1.0].transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)): Normalizes the tensor image with a given mean and standard deviation for each channel. In this case, the mean and standard deviation for all three channels (R, G, B) are 0.5. This normalization will transform the data range from [0.0, 1.0] to [-1.0, 1.0].We can use the following code to apply the transform to an image file and then generate an image vector using the model. See the Python notebook for this chapter to run against actual image files:from PIL import Image img = Image.open("image_file.jpg") image = transform(img).unsqueeze(0) image_vector = model.encode(image)The vector and other associated data can then be indexed into Elasticsearch for use with kNN search:# Create document document = {'_index': index_name, '_source': {"filename": filename, "image_vector": vector See the complete code in the chapter 5 folder of this book’s GitHub repository.With vectors generated and indexed into Elasticsearch, we can move on to searching for similar images.k-Nearest Neighbor (kNN) searchWith the vectors now indexed in Elasticsearch, the next step is to make use of kNN search. You can refer back to Chapter 2, Getting Started with Vector Search in Elastic, for a full discussion on kNN and HNSW search.As with text-based vector search, when performing vector search with images, we first need to convert our query image to a vector. The process is the same as we used to convert images to vectors at index time.We convert the image to a vector and include that vector in the query_vector parameter of the knn search function:knn = { "field": "image_vector", "query_vector": search_image_vector[0], "k": 1, "num_candidates": 10 }Here, we specify the following:field: The field in the index that contains vector representations of images we are searching againstquery_vector: The vector representation of our query imagek: We want only one closest imagenum_candidates: The number of approximate nearest neighbor candidates on each shard to search againstWith an understanding of how to convert an image to a vector representation and perform an approximate nearest neighbor search, let’s discuss some of the challenges.Challenges and limitations with image searchWhile vector search with images offers powerful capabilities for image retrieval, it also comes with certain challenges and limitations. One of the main challenges is the high dimensionality of image vectors, which can lead to computational inefficiencies and difficulties in visualizing and interpreting the data.Additionally, while pre-trained models for feature extraction can capture a wide range of features, they may not always align with the specific features that are relevant to a particular use case. This can lead to suboptimal search results. One potential solution, not limited to image search, is to use transfer learning to fine-tune the feature extraction model on a specific task, although this requires additional data and computational resources.ConclusionIn conclusion, vector similarity search revolutionizes image retrieval by harnessing advanced algorithms and machine learning. From e-commerce to digital forensics, its impact is profound, enhancing user experiences and content discovery. Leveraging techniques like k-Nearest Neighbor search and Elasticsearch's dense vector field, image search becomes more efficient and scalable. Despite challenges, such as high dimensionality and feature alignment, ongoing advancements promise even greater insights into visual data. As technology evolves, so does our ability to navigate and understand the vast landscape of images, ensuring a future of enhanced digital interactions and insights.Author BioBahaaldine Azarmi, Global VP Customer Engineering at Elastic, guides companies as they leverage data architecture, distributed systems, machine learning, and generative AI. He leads the customer engineering team, focusing on cloud consumption, and is passionate about sharing knowledge to build and inspire a community skilled in AI.Jeff Vestal has a rich background spanning over a decade in financial trading firms and extensive experience with Elasticsearch. He offers a unique blend of operational acumen, engineering skills, and machine learning expertise. As a Principal Customer Enterprise Architect, he excels at crafting innovative solutions, leveraging Elasticsearch's advanced search capabilities, machine learning features, and generative AI integrations, adeptly guiding users to transform complex data challenges into actionable insights.

0
0
4546

article-image-using-chatgpt-for-customer-service

Amita Kapoor

07 Mar 2024

10 min read

Using ChatGPT for Customer Service

Amita Kapoor

07 Mar 2024

10 min read

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights and books. Don't miss out – sign up today!IntroductionCustomer service bots of old can often feel robotic, rigid, and painfully predictable. But enter ChatGPT: the fresher, more dynamic contender in the bot arena.ChatGPT isn't just another bot. It's been meticulously trained on a vast sea of text and code, equipping it to grapple with questions that would stump its predecessors. And it's not limited to just customer queries; this versatile bot can craft a range of text formats, from poems to programming snippets.But the standout feature? ChatGPT's touch of humour. It's not just about answering questions; it's about engaging in a way that's both informative and entertaining. So if you're in search of a customer service experience that's more captivating than the norm, it might be time to chat with ChatGPT. Onboarding ChatGPT: A Quick and Easy GuideReady to set sail with ChatGPT? Here's your easy guide to make sure you're all set and ready to roll:1. Obtain the API Key: First, you'll need to get an API key from OpenAI. This is like your secret password to the world of ChatGPT. To get an API key, head to the OpenAI platform and sign up. Once you're signed in, go to the API section and click on "Create New Key."2. Integrate ChatGPT with Your System: Once you have your API key, you can integrate ChatGPT with your system. This is like introducing ChatGPT to your system and making sure they're friends, ready to work together smoothly. To integrate ChatGPT, you'll need to add your API key into your system's code. The specific steps involved will vary depending on your system, but there are many resources available online to help you. Here is an example of how you can do it in Python:import openai import os # Initialize OpenAI API Client api_key = os.environ.get("OPENAI_API_KEY") # Retrieve the API key from environment variables openai.api_key = api_key # Set the API key # API parameters model = "gpt-3.5-turbo" # Choose the appropriate engine max_tokens = 150 # Limit the response length3. Fine-Tune ChatGPT (Optional): ChatGPT is super smart, but sometimes you might need it to learn some specific stuff about your company. That's where fine-tuning comes in. To fine-tune ChatGPT, you can provide it with training data that is specific to your company. This could include product information, customer service FAQs, or even just examples of the types of conversations that you want ChatGPT to be able to handle. Fine-tuning is not required, but it can help to improve the performance of ChatGPT on your specific tasks. [https://www.packtpub.com/article-hub/fine-tuning-gpt-35-and-4].And that's it! With these three steps, ChatGPT will be all set to jump in and take your customer service to the next level. Ready, set, ChatGPT!Utilise ChatGPT for Seamless Question AnsweringIn the ever-evolving world of customer service, stand out by integrating ChatGPT into your service channels, making real-time, accurate response a seamless experience for your customers. Let’s delve into an example to understand the process better.Example: EdTech Site with Online K-12 CoursesImagine operating a customer service bot for an EdTech site with online courses for K-12. You want to ensure that the bot provides answers only on relevant questions, enhancing the user experience and ensuring the accuracy and efficiency of responses. Here's how you can achieve this:1. Pre-defined Context:Initiate the conversation with a system message that sets the context for the bot’s role.role_gpt = "You are a customer service assistant for an EdTech site that offers online K-12 courses. Provide information and assistance regarding the courses, enrollment, and related queries." This directive helps guide the model's responses, ensuring they align with the expected topics.2. Keyword Filtering:Implement keyword filtering to review user’s queries for relevance to topics the bot handles. If the query includes keywords related to courses, enrollment, etc., the bot answers; otherwise, it informs the user about the limitation. Here's a basic example of a keyword filtering function in Python. This function is_relevant_query checks if the query contains certain keywords related to the services offered by the EdTech site.def is_relevant_query(query, keywords): """ Check if the query contains any of the specified keywords. :param query: str, the user's query :param keywords: list of str, keywords to check for :return: bool, True if query contains any keyword, False otherwise """ query = query.lower() return any(keyword in query for keyword in keywords) # Usage example: keywords = ['enrollment', 'courses', 'k-12', 'online learning'] query = "Tell me about the enrollment process." is_relevant = is_relevant_query(query, keywords)Next, we combine the bot role and user query to build the complete messagemessages = [ { "role": "system", "content": f"{role_gpt}" }, {"role": "user", "content": f"{query}"} ]We now make the openAI API can only when the question is relevant:is_relevant = is_relevant_query(query, keywords) if is_relevant: # Process the query with ChatGPT # Make API call response = openai.ChatCompletion.create( model=model, messages=messages ) # Extract and print chatbot's reply chatbot_reply = response['choices'][0]['message']['content' print("ChatGPT: ", chatbot_reply) else: print("I'm sorry, I can only answer questions related to enrollment, courses, and online learning for K-12.")To elevate the user experience, prompt your customers to use specific questions. This subtle guidance helps funnel their queries, ensuring they stay on-topic and receive the most relevant information quickly. Continuous observation of user interactions and consistent collection of their feedback is paramount. This valuable insight allows you to refine your bot, making it more intuitive and adept at handling various questions. Further enhancing the bot's efficiency, enable a feature where it can politely ask for clarification on vague or ambiguous inquiries. This ensures your bot continues to provide precise and relevant answers, solidifying its role as an invaluable resource for your customers.Utilise ChatGPT to tackle Frequently Asked QuestionsAmidst the myriad of queries in customer service, frequently asked questions (FAQs) create a pattern. With ChatGPT, transform the typical, monotonous FAQ experience into an engaging and efficient one.Example: A Hospital ChatbotConsider the scenario of a hospital chatbot. Patients might have numerous questions before and after appointments. They might be inquiring about the hospital’s visitor policies, appointment scheduling, post-consultation care, or the availability of specialists. A well-implemented ChatGPT can swiftly and accurately tackle these questions, giving relief to both the hospital staff and the patients. Here is a tentative role setting for such a bot:role_gpt = "You are a friendly assistant for a hospital, guiding users with appointment scheduling, hospital policies, and post-consultation care."This orientation anchors the bot within the healthcare context, offering relevant and timely patient information. For optimal results, a finely tuned ChatGPT model for this use case is ideal. This enhancement allows for precise, context-aware processing of healthcare-related queries, ensuring your chatbot stands as a trustworthy, efficient resource for patient inquiries.The approach outlined above can be seamlessly adapted to various other sectors. Imagine a travel agency, where customers frequently inquire about trip details, booking procedures, and cancellation policies. Or consider a retail setting, where questions about product availability, return policies, and shipping details abound. Universities can employ ChatGPT to assist students and parents with admission queries, course details, and campus information. Even local government offices can utilize ChatGPT to provide citizens with instant information about public services, documentation procedures, and local regulations. In each scenario, a tailored ChatGPT, possibly fine-tuned for the specific industry, can provide swift, clear, and accurate responses, elevating the customer experience and allowing human staff to focus on more complex tasks. The possibilities are boundless, underscoring the transformative potential of integrating ChatGPT in customer service across diverse sectors. Adventures in AI Land🐙 Octopus Energy: Hailing from the UK's bustling lanes, Octopus Energy unleashed ChatGPT into the wild world of customer inquiries. Lo and behold, handling nearly half of all questions, ChatGPT isn’t just holding the fort – it’s conquering, earning accolades and outshining its human allies in ratings!📘 Chegg: Fear not, night-owl students! The world of academia isn’t left behind in the AI revolution. Chegg, armed with the mighty ChatGPT (aka Cheggmate), stands as the valiant knight ready to battle those brain-teasing queries when the world sleeps at 2 AM. Say goodbye to the midnight oil blues!🥤 PepsiCo: Oh, the fizz and dazzle! The giants aren’t just watching from the sidelines. PepsiCo, joining forces with Bain & Company, bestowed upon ChatGPT the quill to script their advertisements. Now every pop and fizz of their beverages echo with the whispers of AI, making each gulp a symphony of allure and refreshment.Ethical Considerations for Customer Service ChatGPTIn the journey of enhancing customer service with ChatGPT, companies should hold the compass of ethical considerations steadfast. Navigate through the AI world with a responsible map that ensures not just efficiency and innovation but also the upholding of ethical standards. Below are the vital checkpoints to ensure the ethical use of ChatGPT in customer service:Transparency: Uphold honesty by ensuring customers know they are interacting with a machine learning model. This clarity builds a foundation of trust and sets the right expectations.Data Privacy: Safeguard customer data with robust security measures, ensuring protection against unauthorized access and adherence to relevant data protection regulations. For further analysis or training, use anonymized data, safeguarding customer identity and sensitive information.Accountability: Keep a watchful eye on AI interactions, ensuring the responses are accurate, relevant, and appropriate. Establish a system for accountability and continuous improvement.Legal Compliance: Keep the use of AI in customer service within the bounds of relevant laws and regulations, ensuring compliance with AI, data protection, and customer rights laws.User Autonomy: Ensure customers have the choice to switch to a human representative, maintaining their comfort and ensuring their queries are comprehensively addressed.TConclusionTo Wrap it Up (with a Bow), if you're all about leveling up your customer service game, ChatGPT's your partner-in-crime. But like any good tool, it's all about how you wield it. So, gear up, fine-tune, and dive into this AI adventure!Author BioAmita Kapoor is an accomplished AI consultant and educator with over 25 years of experience. She has received international recognition for her work, including the DAAD fellowship and the Intel Developer Mesh AI Innovator Award. She is a highly respected scholar with over 100 research papers and several best-selling books on deep learning and AI. After teaching for 25 years at the University of Delhi, Amita retired early and turned her focus to democratizing AI education. She currently serves as a member of the Board of Directors for the non-profit Neuromatch Academy, fostering greater accessibility to knowledge and resources in the field. After her retirement, Amita founded NePeur, a company providing data analytics and AI consultancy services. In addition, she shares her expertise with a global audience by teaching online classes on data science and AI at the University of Oxford.

0
0
1639

article-image-streamlining-insights-with-microsoft-copilot

Gus Frazer

04 Mar 2024

9 min read

Streamlining Insights with Microsoft Copilot

Gus Frazer

04 Mar 2024

9 min read

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!This article is an excerpt from the book, Data Cleaning with Power BI, by Gus Frazer. Unlock the full potential of your data by mastering the art of cleaning, preparing, and transforming data with Power BI for smarter insights and data visualizationsIntroductionFor those who have never heard of Microsoft Copilot, it is a new technology that Microsoft has released across a number of its platforms that combines generative AI with your data to enhance productivity. Copilot for Power BI harnesses cutting-edge generative AI alongside your dataset, revolutionizing the process of uncovering and disseminating insights with unprecedented speed. Seamlessly integrated into your workflow, Copilot offers an array of functionalities aimed at streamlining your reporting experience.When it comes to report creation, Copilot streamlines the process by allowing users to effortlessly generate reports by articulating the insights they seek or posing questions regarding their dataset using NLP. Copilot then analyzes the data, pulling together relevant information to craft visually striking reports, thereby transforming raw data into actionable insights instantaneously. Moreover, Copilot has the ability to read your data and suggest the best position to begin your analysis, which can then be tailored to suit the direction you want to take the analysis in.This is great, but how can it help you clean and prepare data for analysis? Well, Copilot can be leveraged on multiple data tools from within the Microsoft Fabric platform. For those who are not aware, Power BI has now become part of the Fabric platform. Depending on what type of license you have for Power BI, you might already have access to this. Any customers with Premium capacity licensing for the Power BI service would have automatically been given access to Microsoft Fabric, and more importantly, Copilot.That being said, currently, Copilot has only been made available to customers with a P1 (or above) Premium capacity or a Fabric license of F64 (or above), which is the equivalent licensing available directly from the Azure portal.If you would like to follow along with the next example, you will need to set up a Fabric capacity within your Azure portal. Don’t worry, you can pause this service when it’s not being used to ensure you are only charged for the time you’re using it. Alternatively, follow the steps to see the outcome:1. Log in to the Azure portal that you set up in the previous section of this chapter.2. Select the search bar at the top of the page and type in Microsoft Fabric. Select the service in the menu that appears below the search bar, which should take you to the page where you can manage your capacities.3. Select Create a Fabric capacity. Note that you will need to use an organizational account in order to create a Fabric capacity as opposed to a personal account. You can sign up for a Microsoft Fabric trial for your organization within the window. Further details on how to do this are provided here: https://learn.microsoft.com/en-us/power-bi/ enterprise/service-admin-signing-up-for-power-bi-with-a-newoffice-365-trial.4. Select the subscription and resource group you would like to use for this Fabric capacity.5. Then, under capacity details, you can enter your capacity name. In this example, you can call it cleaningdata.6. The Region field should populate with the region of your tenant, but you can change this if you like. However, this may have implications on performance, which it should warn you about with a message.7. Set the capacity to F64.8. Then, click on select Review + create.9. Review the terms and then click on Create, which will begin the deployment of your capacity.10. Once deployed, select Go to resource to view your Fabric capacity. Take note that this will be active once deployed. Make sure to return here aft er testing to pause or delete your Fabric capacity to prevent yourself from getting charged for this service.Now you will need to ensure you have activated the Copilot settings from within your Fabric capacity. To do this, go to https://app.powerbi.com/admin-portal/ to log in and access the admin portal.Important tipIf you can’t see the Tenant settings tab, then you will need to ensure you have been set up as an admin within your Microsoft 365 admin center. If you have just created a new account, then you will need to set this up. Follow the next links to assign roles:• https://learn.microsoft.com/en-us/microsoft-365/admin/addusers/assign-admin-roles• https://learn.microsoft.com/en-us/fabric/admin/microsoftfabric-admin11. Scroll to the bottom of Tenant settings until you see the Copilot and Azure OpenAI service (preview) section as shown:Figure – The tenant settings from within Power BI12. Ensure both settings are set to Enabled and then click on Apply.Now that you have created your Fabric capacity, let’s jump into an example of how we can use Copilot to help with the cleaning of data. As we have created a new capacity, you will have to create a new workspace that uses this new capacity:1. Navigate back to Workspaces using the left navigation bar. Then, select New Workspace.2. Name your workspace CleaningData(Copilot), then select the dropdown for advanced configuration settings.3. Ensure you have selected Fabric capacity in the license mode, which in turn will have selected your capacity below, and then select Apply. You have now created your capacity!4. Now let’s use Fabric to create a new dataflow using the latest update of Datafl ow Gen2. Select New from within the workspace and then select More options.5. This will navigate you to a page with all the possible actions to create items within your Fabric workspace. Under Data Factory, select Datafl ow Gen2.6. This will load a Datafl ow Gen2 instance called Datafl ow 1. On the top row, you should now see the Copilot logo within the Home ribbon as highlighted:Figure – The ribbon within a Dataflow Gen2 instance7. Select Copilot to open the Copilot window on the right-hand side of the page. As you have not connected to any data, it will prompt you to select get data.8. Select Text/CSV and then enter the following into the File path or URL box:https://raw.githubusercontent.com/PacktPublishing/Data-Cleaningwith-Power-BI/main/Retail%20Store%20Sales%20Data.csv9. Leave the rest of the settings as their defaults and click on Next.10. This will then open a preview of the file data. Click on Create to load this data into your Datafl ow Gen2 instance. You will see that the Copilot window will have now changed to prompt you as to what you would like to do (if it hasn’t, then simply close the Copilot window and reopen):Figure – Data loaded into Dataflow Gen211. In this example, we can see that the data includes a column called Order Date but we don’t have a fi eld for the fi scal year. Enter the following prompt to ask Copilot to help with the transformation:There's a column in the data named Order Date, which shows when an order was placed. However, I need to create a new column from this that shows the Fiscal Year. Can you extract the year from the date and call this Fiscal Year? Set this new column to type number also.12. Proceed using the arrow key or press Enter. Copilot will then begin working on your request. As you will see in the resulting output, the model has added a function (or step) called Custom to the query that we had selected.13. Scroll to the far side and you will see that this has added a new column called Fiscal Year.14. Now add the following prompt to narrow down our data and press Enter:Can you now remove all columns leaving me with just Order ID, Order Date, Fiscal year, category, and Sales?15. This will then add another function or step called Choose columns. Finally, add the following prompt to aggregate this data and press Enter:Can you now group this data by Category, Fiscal year, and aggregated by Sum of Sales?As you can see, Copilot has now added another function called Custom 1 to the applied steps in this query, resulting in this table:Figure – The results from asking Copilot to transform the dataTo view the M query that Copilot has added, select Advanced editor, which will show the functions that Copilot has added for you:Figure – The resulting M query created by Copilot to carry out the request transformations to clean the dataIn this example, you explored the new technologies available with Copilot and how they help to transform the data using tools such as Datafl ow Gen2.While it’s great to understand the amazing possibilities AI brings to data, it’s also crucially important that you understand the challenges it presents.ConclusionIn conclusion, Microsoft Copilot offers a groundbreaking approach to enhancing productivity and efficiency in data analysis and report generation within Power BI. By seamlessly integrating generative AI technology, Copilot revolutionizes the way insights are discovered and data is prepared, providing users with unprecedented speed and accuracy. Whether streamlining report creation or optimizing data management tasks, Copilot empowers users to unlock the full potential of their data, paving the way for more informed decision-making and actionable insights.Author BioGus Frazer is a seasoned Analytics Consultant focused on Business Intelligence solutions. With over 7 years of experience working for the two market-leading platforms, Power BI & Tableau, has amassed a wealth of knowledge and expertise. Gus has helped hundreds of customers to drive their digital and data transformations, scope data requirements, drive actionable insights, and most important of all, cleanse data ready for analysis. Most recently helping to set up, organize and run the Power BI UK community at Microsoft. He holds 6 Azure and Power BI certifications, including the PL-300 and DP-500 certifications. In this book, Gus offers readers invaluable guidance on ingesting, preparing, and cleansing data for analysis in Power BI. --This text refers to an out of print or unavailable edition of this title.

0
0
1567

article-image-ai-distilled-38-latest-in-ai-sora-gemini-15-and-more

Merlyn Shelley

01 Mar 2024

9 min read

AI_Distilled 38: Latest in AI: Sora, Gemini 1.5, and More

Merlyn Shelley

01 Mar 2024

9 min read

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!👋 Hello,“People say AI is overhyped, but I think it's not hyped enough. The next generation who will use this in the next few years will have a much higher bar on what technology can do for them. So how you build it for that generation, how you build it for that future will be really interesting to see.”-Puneet Chandok, Microsoft India and South Asia presidentSpeaking at a panel discussion on AI at the Mumbai Tech Week, Chandok believes AI is not hyped enough considering its potential for disruptive transformation. He encourages more training on AI to realize its full potential.Welcome back to a new issue of AI Distilled - your one-stop destination for all things AI, ML, NLP, and Gen AI. Let’s get started with the latest news and developments across the AI sector:OpenAI unveils Sora, an AI model generating videos from textGoogle's latest conversational AI model Gemini 1.5 has a million-token context windowNew AI news reader app tackles clickbait headlines, provides summariesSlack is rolling out new AI features for enterprise users including thread summariesLangChain announced raising $25 million to launch new platform for building LLM appsAI helps improve medical imaging to benefit patients globallyResearchers develop AI model that determines a person's sex from brain scansWe’ve also curated the latest GPT and LLM resources, tutorials, and secret knowledge:Giving AI Models a Better Memory: How Google DeepMind Expanded Context WindowsAdvanced Techniques For More Relevant AI ResponsesReinforcement Learning ExplainedBridging the Gap Between AI and App DevelopmentFinally, don’t forget to check-out our hands-on tips and strategies from the AI community for you to use on your own projects:Creating Custom Models Without the Hassle of Data CollectionCode Your Own AI Coding BuddyEvaluating Code Quality with AI AssistantsEasily Deploy Language Models LocallyLooking for some inspiration? Here are some GitHub repositories to get your projects going!gptscript-ai/gptscriptkarpathy/minbpeAAAI-DISIM-UnivAQ/DALIQwenLM/QwenWriter’s Credit: Special shout-out to Vidhu Jain for her valuable contribution to this week’s issue.Cheers, Kartikey Pandey Editor-in-Chief, Packt ⚡ TechWave: AI/GPT News & AnalysisOpenAI unveiled Sora, an AI model generating videos from text at up to a minute in length. Sora demonstrates an understanding of language and the physical world and photorealism across styles, though human subjects appear game-like.Google's latest conversational AI model Gemini 1.5 analyzes more information than before, thanks to a million-token context window. This allows for summarizing the Apollo 11 mission transcript or analyzing a 44-minute silent film in full. Early results show the system maintains performance as context grows into the millions.Bulletin, a new AI-powered news reader app, tackles clickbait headlines and provides summaries of news articles with customizable news sources.Slack is rolling out new AI features for enterprise users including thread summaries, channel recaps, and answering workplace questions. The tools provide highlights from missed messages and help catch up.LangChain announced raising $25 million to launch their new platform LangSmith for building and monitoring LLM apps. LangSmith allows developers to accelerate workflows across development, testing, deployment, and monitoring. It has already seen significant adoption with over 70,000 signups and 5000 monthly active companies.Courtesy: Bulletin/Shihab MehboobAI is helping improve medical imaging to benefit patients globally. ML can quickly analyze large datasets to find issues doctors may miss and flag urgent cases. Cloud solutions also enable sharing scans and remote expert assistance anywhere. Companies are applying these methods to speed diagnoses, reduce wait times, and bring ultrasounds directly to homes. Researchers have also developed an AI model that can determine a person's sex from brain scans with over 90% accuracy. The model analyzed dynamic MRI scans and identified the default mode, striatum, and limbic networks as key in distinguishing male and female brains. This breakthrough furthers our understanding of brain organization and could help address sex-specific health issues. 🔮 Expert Insights from Packt Community Generative AI with LangChain - By Dr. Ben AuffarthChatGPT and the GPT models by OpenAI have brought about a revolution not only in how we write and research but also in how we can process information.This book discusses the functioning, capabilities, and limitations of LLMs underlying chat systems, including ChatGPT and Bard. It also demonstrates, in a series of practical examples, how to use the LangChain framework to build production-ready and responsive LLM applications for tasks ranging from customer support to software development assistance and data analysis Key TakeawaysExplore the expansive utility of LLMs in real-world applications.Guidance on fine-tuning, prompt engineering, and best practices.Learn how to use the LangChain framework to build production-ready LLM applications.By the end of this book, you'll be equipped with the practical knowledge and skills to leverage the transformative power of generative AI with confidence and creativity.Read More🌟 Secret Knowledge: AI/LLM Resources🌀 Giving AI Models a Better Memory: How Google DeepMind Expanded Context Windows: Google DeepMind's latest AI model Gemini 1.5 has significantly improved how much information it can process at once, thanks to advances in "long context windows." The team discovered their model could understand over 1 million pieces of information in a single sitting, far surpassing earlier limits. This opens up new possibilities for tasks like summarizing lengthy documents, analyzing large codebases, and even comprehending full movies. Developers are excited to explore creative uses of this expanded recall.🌀 Advanced Techniques For More Relevant AI Responses: This article discusses how to improve AI conversation models like RAG by enhancing how information is stored, found and used. Methods covered include indexing sentences individually while keeping their surrounding context, combining keyword search with semantic search, and re-scoring results based on the question. The author demonstrates implementing these "advanced RAG" techniques in Python using tools like LlamaIndex and Weaviate. With these optimizations, AI systems can provide more helpful responses by accessing knowledge in a targeted manner.🌀 Reinforcement Learning Explained: This article breaks down the key concepts of reinforcement learning in an easy-to-understand way. It covers states, actions, rewards, and how agents interact with environments to learn policies. RL agents try different strategies to maximize long-term rewards through trial and error. Episodes provide a framework to evaluate policies. Deterministic policies pick set actions while stochastic policies use probabilities. Whether you're new to RL or a veteran, this primer is worth a read to get acquainted with the basics.🌀 Bridging the Gap Between AI and App Development: As AI becomes more advanced, developers need easier ways to integrate cutting-edge features into their work. However, directly using AI code frameworks can be challenging and limit scalability. The solution? AI gateways. By handling tasks like routing, caching, and monitoring behind the scenes, gateways act as a bridge between complex AI systems and traditional development workflows. They streamline the integration process while ensuring high performance. Are gateways the future of intelligent applications?Partnering with Notion Ever tried Notion? It's a workspace that helps you do things better and faster.You get AI for notes and teamwork, easy drag-and-drop for content, and cool new features to help manage projects and share knowledge.Give it a try!🔛 Masterclass: AI/LLM Tutorials🌀 Creating Custom Models Without the Hassle of Data Collection: Tired of spending big bucks to use proprietary AI APIs or going through the tedious process of collecting your training data? This page shows how you can train customized models more efficiently. By using an open-source LLM to generate synthetic annotations for a small sample of your data, you can then fine-tune a smaller model tailored exactly to your needs. The process takes just a few steps and allows you to analyze large datasets for a fraction of the cost. Best of all, you avoid sending sensitive data to third parties.🌀 Code Your Own AI Coding Buddy: This guide shows you how to build an AI assistant that lives right on your computer. Using tools like HuggingFace and Streamlit, you can create a chatbot trained on Code Llama. Simply ask it questions and it will respond with examples in languages like Python, Java, and C++. Better yet, the models are free and open-source. This is a neural net sidekick to help automate repetitive tasks and speed up your workflow.🌀 Evaluating Code Quality with AI Assistants: This article explores using AI to improve code quality by testing Python scripts with SonarQube and getting feedback from LLMs. The author ran tests on ChatGPT and open-source models like Code Llama to see if they could identify issues flagged by SonarQube. While the models struggled to pinpoint errors solely from descriptions, some provided insightful summaries. Continued development of coding-focused LLMs may help automate part of the review process.🌀 Easily Deploy Language Models Locally: With a simple four-step process, you can get powerful language models like ChatGPT running on your hardware. First, choose a model from HuggingFace and quantize it for faster performance. Then build an Ollama image to serve the model. For a slick interface, deploy a ChatGPT-style React app talking to Ollama via Docker. The whole setup only takes around 15 minutes. Now you've got a custom language assistant without internet dependence.🚀 HackHub: Trending AI Tools🌀 gptscript-ai/gptscript: Open source NLP tool that allows developers to automate tasks by writing scripts in plain English.🌀 karpathy/minbpe: Minimal and clean Python code for the byte pair encoding algorithm commonly used in NLP and language model tokenization.🌀 AAAI-DISIM-UnivAQ/DALI: Framework allowing developers to build multi-agent systems in Prolog for applications like robotics, event processing, and more.🌀 QwenLM/Qwen: Open source code, models, and documentation for the Qwen series of LLMs, including Qwen, Qwen-Chat, and their various sizes.

0
0
2701

article-image-getting-started-with-microsoft-guidance

Prakhar Mishra

28 Feb 2024

8 min read

Getting Started with Microsoft Guidance

Prakhar Mishra

28 Feb 2024

8 min read

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights and books. Don't miss out – sign up today!IntroductionThe emergence of a massive language model is a watershed moment in the field of artificial intelligence (AI) and natural language processing (NLP). Because of their extraordinary capacity to write human-like text and perform a range of language-related tasks, these models, which are based on deep learning techniques, have earned considerable interest and acceptance. This field has undergone significant scientific developments in recent years. Researchers all over the world have been developing better and more domain-specific LLMs to meet the needs of various use cases.Large Language Models (LLMs) such as GPT-3 and its descendants, like any technology or strategy, have downsides and limits. And, in order to use LLMs properly, ethically, and to their maximum capacity, it is critical to grasp their downsides and limitations. Unlike large language models such as GPT-4, which can follow the majority of commands. Language models that are not equivalently large enough (such as GPT-2, LLaMa, and its derivatives) frequently suffer from the difficulty of not following instructions adequately, particularly the part of instruction that asks for generating output in a specific structure. This causes a bottleneck when constructing a pipeline in which the output of LLMs is fed to other downstream functions.Introducing Guidance - an effective and efficient means of controlling modern language models compared to conventional prompting methods. It supports both open (LLaMa, GPT-2, Alpaca, and so on) and closed LLMs (ChatGPT, GPT-4, and so on). It can be considered as a part of a larger ecosystem of tools for expanding the capabilities of language models.Guidance uses Handlebars - a templating language. Handlebars allow us to build semantic templates effectively by compiling templates into JavaScript functions. Making it’s execution faster than other templating engines. Guidance also integrates well with Jsonformer - a bulletproof way to generate structured JSON from language models. Here’s a detailed notebook on the same. Also, in case you were to use OpenAI from Azure AI then Guidance has you covered - notebook.Moving on to some of the outstanding features that Guidance offers. Feel free to check out the entire list of features.Features1. Guidance Acceleration - This addition significantly improves inference performance by efficiently utilizing the Key/Value caches as we proceed through the prompt by keeping a session state with the LLM inference. Benchmarking revealed a 50% reduction in runtime when compared to standard prompting approaches. Here’s the link to one of the benchmarking exercises. The below image shows an example of generating a character profile of an RPG game in JSON format. The green highlights are the generations done by the model, whereas the blue and no highlights are the ones that are copied as it is from the input prompt, unlike the traditional method that tries to generate every bit of it.SourceNote: As of now, the Guidance Acceleration feature is implemented for open LLMs. We can soon expect to see if working with closed LLMs as well.2. Token Healing - This feature attempts to correct tokenization artifacts that commonly occur at the border between the end of a prompt and the start of a group of generated tokens.For example - If we ask LLM to auto-complete a URL with the below-mentioned Input, it’s likely to produce the shown output. Apart from the obvious limitation that the URL might not be valid. I'd like to draw your attention to the extra space it creates (highlighted in red). Such considerations make it difficult to construct a dependable parsing function and robustly absorb its result into subsequent phases.Input: “The link is <a href=http:”Actual Output: “The link is <a href=http: //www.google.com/search?q”Expected Output: “The link is <a href=http://www.google.com/search?q” This is the exact bucket of problems that Token Healing tries to solve using the backtracking method. Feel free to check out this jupyter notebook for more examples.3. Guaranteed Output Structure - Large language models are fantastic at producing useful outputs, but not so much at producing outputs in a specified format (especially open-source ones like LLaMa, GPT-2, and so on). When we want to use the output of a language model as input to another system, this is frequently an issue. With Handlebars, guidance guarantees the output format to be the same as what was being asked for.Let’s now see Guidance in action -InstallationInstalling guidance is a breeze, just do a pip :$ pip install guidanceAssume we are now creating a product description for an e-commerce website. Here's how the traditional generation compares to the guidance generation. Feel free to play with this colab notebook with both the below examples.Traditional GenerationInput:Consider you are an e-commerce expert. You need to write a product description for a product to be listed on the e-commerce website. The product description consists of fixed set of fields to be filled in the JSON. The below shown JSON is the structure of the description with constraints for each of the attributes enclosed within < > brackets. Please follow the mentioned constraint and format diligently. { prod_id: <numeric value of 5 digits>, prod_name: <name starts with the prefix 'p_'>, prod_price: <should be an integer between 1 and 16. Should end with suffix '$'> } The product description isOutput:Consider you are an e-commerce expert. You need to write a product description for a product to be listed on the e-commerce website. The product description consists of fixed set of fields to be filled in the JSON. The below shown JSON is the structure of the description with constraints for each of the attributes enclosed within < > brackets. Please follow the mentioned constraint and format diligently. { prod_id: <numeric value of 5 digits>, prod_name: <name starts with the prefix 'p_'>, prod_price: <should be an integer between 1 and 16. Should end with suffix '$'> } The product description is { resentprod_id: <numeric value of 5 digits>, resentprod_name: <name begins with the prefix 'p_'>, resentprod_price: <should be an integer between 1 and 16. Should end with suffix '$'> } In the above example, the product description has 5 constraint fields and 5 attribute fields. The constraints are as follows: resentprod_id: - value of 5 digits, resentprod_name: - name of the product, resentprod_price: - price of the product, resentprod_price_suffix: - suffix of the product price, resentprod_id: - the product id, resentpro diabetic_id: value of 4 digits, resentprod_ astronomer_id: - value of 4 digits, resentprod_ star_id: - value of 4 digits, resentprod_is_generic: - if the product is generic and not the generic type, resentprod_type: - the type of the product, resentprod_is_generic_typeHere’s the code for the above example with GPT-2 language model -``` from transformers import AutoModelForCausalLM, AutoTokenizer tokenizer = AutoTokenizer.from_pretrained("gpt2-large") model = AutoModelForCausalLM.from_pretrained("gpt2-large") inputs = tokenizer(Input, return_tensors="pt") tokens = model.generate( **inputs, max_new_tokens=256, temperature=0.7, do_sample=True, )Output:tokenizer.decode(tokens[0], skip_special_tokens=True)) ```Guidance GenerationInput w/ code:guidance.llm = guidance.llms.Transformers("gpt-large") # define the prompt program = guidance("""Consider you are an e-commerce expert. You need to write a product description for a product to be listed on the e-commerce website. The product description consists of fixed set of fields to be filled in the JSON. The following is the format ```json { "prod_id": "{{gen 'id' pattern='[0-9]{5}' stop=','}}", "prod_name": "{{gen 'name' pattern='p_[A-Za-z]+' stop=','}}", "prod_price": "{{gen 'price' pattern='\b([1-9]|1[0-6])\b\$' stop=','}}" }```""") # execute the prompt Output = program()Output:Consider you are an e-commerce expert. You need to write a product description for a product to be listed on the e-commerce website. The product description consists of a fixed set of fields to be filled in the JSON. The following is the format```json { "prod_id": "11231", "prod_name": "p_pizzas", "prod_price": "11$" }```As seen in the preceding instances, with guidance, we can be certain that the output format will be followed within the given restrictions no matter how many times we execute the identical prompt. This capability makes it an excellent choice for constructing any dependable and strong multi-step LLM pipeline.I hope this overview of Guidance has helped you realize the value it may provide to your daily prompt development cycle. Also, here’s a consolidated notebook showcasing all the features of Guidance, feel free to check it out.Author BioPrakhar has a Master’s in Data Science with over 4 years of experience in industry across various sectors like Retail, Healthcare, Consumer Analytics, etc. His research interests include Natural Language Understanding and generation, and has published multiple research papers in reputed international publications in the relevant domain. Feel free to reach out to him on LinkedIn

0
0
1597

article-image-setting-up-polars-for-data-analysis

Luca Zanna

23 Feb 2024

7 min read

Setting Up Polars for Data Analysis

Luca Zanna

23 Feb 2024

7 min read

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!This article is an excerpt from the book, Data Analysis with Polars, by Luca Zanna. Leverage Polars, the lightning-fast dataframe library, to take your Python data analysis skills to the next levelIntroductionIn the ever-evolving landscape of data analysis, harnessing the right tools and methodologies can make all the difference. Welcome to a world where Polars, a powerful data manipulation library, takes center stage. This article is your gateway to unlocking the potential of Polars, and it begins by unraveling the essential components of the data analysis journey. From setting up virtual environments to simplifying data analysis in the cloud with Google Colab, we explore how Polars streamlines your path to insights. Whether you're a seasoned data analyst or just starting your journey, this guide will equip you with the knowledge and tools needed to make your data analysis endeavors efficient and rewarding. Join us as we delve into the fascinating realm of Polars and embrace a new era of data exploration.Installation and virtual environments We will not go through the installation of Python as that is outside the scope of the book. A visit to python.org will give all the information necessary to install Python. Now on to virtual environments. Understanding Virtual Environments and Their Benefits Imagine you have built a fantastic data analysis project using Polars. Your project uses: Python 3.8Polars version 0.15.1 Numpy 1.23.0 Now, you start a new project, and you want to use a newer Polars (0.16.14), along with Numpy and Arrow. So, the new project requires: Python 3.10 Polars 0.16.14 Numpy 1.24.0 Pyarrow 11.0.0 Upgrading Polars and Numpy libraries globally isn't a good idea. If Polars functions have changed between versions, your first project might stop working or give incorrect results with the new version. This is where virtual environments come in. Virtual environments create separate 'spaces' for each project: one for your first data analysis project and another for your new data pipeline project. You can set up a virtual environment manually or have your IDE set-up a virtual environment for you. If you decide to set it up manually, you can check out the guide at https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/#creating-a-virtual-environment. Installing and using Polars on a machine To install Polars, first make sure you are in a virtual environment. Then, type: pip install polars If you already have Polars installed and want to upgrade it, type: pip install polars --upgrade In the book we will use other libraries, including numpy, pandas, matplotlib. You can install them with the syntax above, and you can also install multiple libraries at the same tine: pip install numpy pandas matplotlib Let’s now get our development environment set-up. We will use Visual Studio code, but you are free to use any other IDE that you like. 1. Type code . in the command line to open Visual Studio Code. 2. Right-click on the left, choose New File, and create first_dataframe.ipynb. Figure – Creating a new file in Visual Studio Code Files with extension .ipynb are Jupyter Notebook files, which are great for data analysis. To work with these files you need to install the Jupyter extension on VS Code. You can do that by clicking on ‘Extensions’ on the left bar, searching for Jupyter, installing it, and activating it. Figure – Install Jupyter extension in Visual Studio Code 3. Now back to our file. The first thing to ensure is that we are using Python from our virtual environment. Click on Select Kernel at the top right, then click on the Python that starts with env/: that will be the Python for our virtual environment. Avoid the paths starting with /usr and /bin as those are the system Python instead of our virtual environment. Figure – Select the Python interpreter in Visual Studio code Now, we're ready for Polars. 4. Type import polars as pl in the first cell and press Shift + Enter to run it. 5. Create a dataframe in the next cell by typing: df = pl.DataFrame({ 'a': ['Hello', 'World!'] }) 6. Press Shift + Enter to run the cell. This creates a dataframe called df with one column named 'a' and two rows: 'Hello' and 'World!' To see the dataframe, type df in the next cell and run it. Figure – Visual Studio code with first Polars dataframe We created our first Polars dataframe. Using Polars on the cloud with Google Colab Instead of installing Polars on your computer, you can also use it in the cloud. One popular cloud service for running code is Google Colab. This way, you don't need to install anything on your machine. To access Google Colab, visit https://colab.research.google.com/ in your web browser. Click on "New Notebook," and you'll see a page that looks similar to VS Code. Now, let's create the same Polars dataframe example in Google Colab: 1. In the first cell, type the following command to ensure we have the latest version of Polars: %pip install polars --upgrade 2. Next, enter this code to import Polars and create a dataframe: import polars as pl df = pl.DataFrame({ 'a': ['Hello', 'World !'] }) Finally, display the dataframe by typing: df And that's it! You now have your first Polars dataframe in Google Colab. Figure – Google Colab with first Polars dataframe ConclusionIn closing, Polars offers a bridge to the future of data analysis. With the knowledge and hands-on experience gained from this article, you're well-prepared to conquer the intricacies of data manipulation and visualization. The ability to effortlessly create, manipulate, and analyze data using Polars is a powerful tool in your arsenal. Whether you're a data enthusiast or a seasoned analyst, embracing Polars sets you on a path toward efficiency, precision, and data-driven success. As the data landscape continues to evolve, you're now equipped to stay ahead, make informed decisions, and revolutionize your approach to data exploration.Author BioLuca Zanna is a Data Engineer and Data Analyst with over 15 years of experience. He started his career as a financial data analyst after a Master's in Management and passing the Certified Public Accountant (CPA) exam. Luca spent a decade working on financial analysis systems at L’Oréal: developing the systems and training financial analysts across Europe and Asia.Currently, Luca helps companies with building data infrastructure to better leverage their data. Luca is also a corporate teacher for topics such as data analysis, SQL, Python, and cloud data engineering.

0
0
1431

article-image-leveraging-google-cloud-for-custom-endpoint-with-openai

Henry Habib

20 Feb 2024

8 min read

Leveraging Google Cloud for Custom Endpoint with OpenAI

Henry Habib

20 Feb 2024

8 min read

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!This article is an excerpt from the book, OpenAI API Cookbook, by Henry Habib. Integrate the ChatGPT API into various domains ranging from simple wrappers to knowledge-based assistants, multi-model, and conversational applicationsIntroductionIn the realm of application development, the integration of OpenAI's capabilities through custom backend endpoints on Google Cloud is a pivotal step towards unlocking intelligent solutions. This article explores the process of creating such endpoints using Google Cloud's Cloud Functions, allowing for control, customization, and the seamless integration of OpenAI's features. Through this fusion of technologies, developers can craft innovative applications that leverage the power of artificial intelligence to enhance user experiences and drive creativity in their projects.Creating a public endpoint server that calls the OpenAI APIThere are many important benefits of creating your own public endpoint server that calls the OpenAI API, instead of connecting to the OpenAI API directly – the biggest being control and customization, which we will explore in this recipe and the next recipe.In this recipe, we will use GCP to host our public endpoint. When this endpoint is called, it will make a request to OpenAI for a slogan for an ice cream company and then will return the answer to the user. This sounds simple and almost unnecessary to make a public endpoint, but it is the final step we need to build a truly intelligent application that leverages OpenAI.To do this, we will create a GCP resource called Cloud Functions, which we will explore later in the How it works… section of the recipe.Getting readyEnsure you have an OpenAI platform account with available usage credits. If you don’t, please follow the Setting up your OpenAI API Playground environment recipe in Chapter 1. Furthermore, ensure you have created a GCP account. To do this, navigate to https://cloud. google.com/, then select Start Free from the top right, and follow the instructions that you see.You may need to provide a billing profile as well to create any GCP resources. Note that GCP does have a free tier, and in this recipe, we will not go above the free tier (so, essentially, you should not be billed for anything).You may need to create a project if this is your first time logging into Google Cloud Platform. After you log in, select Select a project from the top left and then select New Project. Provide a project name and then select Create.The next recipe in this chapter will also have this same requirement.How to do it…1. Navigate to https://console.cloud.google.com/. In the Search field at the top of the page, type in Cloud Functions and select the top choice from the drop-down menu, Cloud Functions.Figure – Cloud Functions in the dropdown2. Select Create Function from the top of the page. This will begin to create our custom backend endpoint and start the configuration steps.On the Configuration page, fill in the following steps:Environment: Select 2nd gen from the drop-down menu.Function name: Since we’re creating a backend endpoint that will produce company slogans, the function name will be slogan_creator.Region: Choose the environment location nearest you.In the Trigger menu, choose HTTPS. In the Authentication sub-menu, select Allow unauthenticated invocation. We need to check this as we are going to create a public endpoint that will be accessible from our frontend services. Figure – Sample configuration settings of a Google Cloud Function3. Select the Next button on the bottom of the page to then move on to the Code section.4. From the Runtime dropdown, select Python 3.12. This ensures that our backend endpoint will be coded using the Python programming language.5. For that Entry point option, type in create_slogan. This refers to the name of the function in Python that is called when the public endpoint is reached and triggered.6. On the left-hand side menu, you will see two files: main.py and requirements.txt. Select the requirements.txt file. This will list all the Python packages that need to be installed for our Cloud Function to operate.7. In the center of the screen where the contents of requirements.txt are displayed, enter a new line and type in openai. This will ensure that the latest openai library package is installed. Your screen should look like what’s displayed in Figure below.Figure – Snapshot of the requirements.txt file8. From the left-hand side menu, select main.py. Copy and paste the following code into the center of the screen (where the content for that file is displayed). These are the instructions that the public endpoint will run when it is triggered:import functions_framework from openai import OpenAI @functions_framework.http def create_slogan(request): client = OpenAI(api_key = '<API Key here>') response = client.chat.completions.create( model="gpt-3.5-turbo", messages=[ { "role": "system", "content": "You are an AI assistant that creates one slogan based on company descriptions" }, { "role": "user", "content": "A company that sells ice cream" } ], temperature=1, max_tokens=256, top_p=1, frequency_penalty=0, presence_penalty=0 ) return response.choices[0].message.contentAs you can see, it simply calls the OpenAI endpoint, requests a chat completion, and then returns the output to the user. You will also need your OpenAI API key.9. Next, deploy the function by selecting the Deploy button at the bottom of your page.10. Wait for your function to be fully deployed, which typically takes two minutes. You can verify whether the function has been deployed or not by observing the progress in the top left section of the page (shown in Figure below). Once it is green and checkmarked, the build is successful, and your function has been deployed. Figure – The Cloud Function deployment page11. Now, let’s verify that our function works. Select the endpoint URL, found on the top of the page near URL. It’s typically in the form https://[location]-[project-name]. cloudfunctions.net/[function-name]. It is also highlighted in the above Figure.12. This will open a new web page that will trigger our custom public endpoint, and return a chat completion, which, in this case, is the slogan for an ice cream business. Note that this is a public endpoint – this will work on your computer, phone, or any device connected to the internet.Figure – Output of a Google Cloud FunctionHow it works…In this recipe, we created a public endpoint. This endpoint can be accessed by anyone (including your application in future recipes). The logic of the endpoint is simple and something we have covered prior: return a slogan for a company that sells ice cream. What’s new, however, is that this is our very own public endpoint that is hosted in Google Cloud, using the Cloud Function resource.Note that we used the free tier of Google Cloud Functions, which does have limitations such as a cap on the number of function invocations per month, limited execution time, and constrained computational resources. However, for our current purposes, these limitations are not a hindrance, allowing us to deploy and test our functions effectively without incurring costs. This setup is ideal for small-scale applications or for learning and experimentation purposes, providing a practical way to understand cloud functionalities and serverless architecture in a cost-effective manner.ConclusionIn conclusion, the synergy between Google Cloud's infrastructure and OpenAI's capabilities offers developers a powerful platform for creating intelligent applications. By leveraging Cloud Functions to build custom backend endpoints, developers can unlock a world of possibilities for innovation and creativity. This article has provided a comprehensive guide to integrating OpenAI into Google Cloud, empowering developers to craft intelligent solutions that enhance user experiences and drive the evolution of application development. With this knowledge, developers are well-equipped to embark on their journey of building intelligent applications that push the boundaries of what is possible in the digital landscape.Author BioHenry Habib is a Manager at one of the world's top management consulting firms, advising F500 companies on analytics and operations, with a particular focus on building intelligent AI-driven solutions and tools to create impact. He is a passionate online instructor and educator, amassing a of more than 150K paid students and facilitating technical programs at large banks and governmental.A proponent in the no-code and LLM revolution, he believes that anyone can now create powerful and intelligent applications without any deep technical skills. Henry resides in Toronto, Canada with his wife, and enjoys reading AI research papers and playing tennis in his free time.

0
0
1638

article-image-ai-distilled-37-cutting-edge-updates-and-expert-guidance

Merlyn Shelley

16 Feb 2024

10 min read

AI_Distilled 37: Cutting-Edge Updates and Expert Guidance

Merlyn Shelley

16 Feb 2024

10 min read

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!👋 Hello ,“[Sovereign AI] codifies your culture, your society's intelligence, your common sense, your history – you own your own data…[Use AI to] activate your industry, build the infrastructure, as fast as you can.” - Jensen Huang, NVIDIA founder and CEO Huang strongly advocated for countries to rapidly develop their own national AI capabilities and systems. NVIDIA's dominance in GPUs positions it to be a major beneficiary of the AI revolution as its technologies are fundamental for running advanced AI applications. No wonder NVIDIA's market value recently surpassed Amazon’s. Embark on a new AI journey with AI_Distilled, a curated digest of the most recent developments in AI/ML, LLMs, NLP, GPT, and Gen AI. We’ll kick things off by tapping into the latest news and developments in the AI sector: OpenAI updates ChatGPT with memory retention Microsoft unveils new Copilot features Google updates Gemini and unveils mobile app Apple's new AI model called MGIE New open-source AI model Aya converses in 100+ languages Reka AI introduces two new state-of-the-art AI models DeepMind and USC develop new technique to improve LLMs’ reasoning abilities New open-source AI model Smaug-72B achieves top spot NVIDIA unveils new chatbot Chat with RTX AI helps identify birds and conserve an English wetland USPTO issues new guidance stating AI alone can't be named as an inventor We’ve also handpicked GPT and LLM resources and secret knowledge that’ll come in handy for your next project: Building a Scalable Foundation Model Platform for Your Enterprise Making Bridges Between AI and Business Evaluating Large Language Models: A Guide to Benchmark Tests Code Generation Gets Smarter with Context Looking for hands-on tips and strategies straight from the developer community? We’ve got you covered with some incredible tutorials to get you started: Building a Question Answering Bot from Scratch Creating SMS Apps with Next.js and AI Assistants Harness the Power of LLMs Without GPUs Making the Switch to Open-Source Models Finally, feel free to check out our curated list of smoking hot GitHub repositories. arplaboratory/learning-to-fly time-series-foundation-models/lag-llama noahfarr/rlx uclaml/SPIN phidatahq/phidata 📥 Feedback on the Weekly EditionTake our weekly survey and get a free PDF copy of our best-selling book, "Interactive Data Visualization with Python - Second Edition." 📣 And here's the twist – we're tuning into YOUR frequency! Inspired by a reader's request, we're launching a column just for you. Got a burning question or a topic you're itching to dive into? Drop your suggestions in our content box – because your journey of discovery is our blueprint.We appreciate your input and hope you enjoy the book! Share your thoughts and opinions here! Writer’s Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week’s newsletter content! Cheers, Merlyn Shelley Editor-in-Chief, Packt SignUp | Advertise | Archives⚡ TechWave: AI/GPT News & AnalysisOpenAI (whose annual revenue hit $2 billion in December) has updated its ChatGPT chatbot to retain information from past conversations, allowing it to apply context from previous discussions to new chats. This could help the bot respond with more relevant, personalized replies. Microsoft's AI assistant Copilot has also received upgrades like improved models and image editing tools. Google updated its conversational AI too with a new mobile app and advanced model (now called Gemini). A new paid tier for Gemini Ultra provides developers and users access to more advanced features and capabilities for $20 per month. Courtesy: OpenAI Apple's new AI model called MGIE allows editing images through natural language commands, performing tasks from color adjustments to complex manipulations. Interestingly, a new open-source AI model called Aya can converse in over 100 languages, potentially increasing access for many. Reka AI has introduced two new state-of-the-art AI models, Flash and Edge, which achieve top performance on language and vision tasks while Edge maintains a smaller size. Researchers from DeepMind and USC have developed a new technique called SELF-DISCOVER to improve LLMs’ reasoning abilities. A new open-source AI model (available for all) called Smaug-72B has achieved the top spot on the leaderboard for language models, demonstrating skills that surpass proprietary competitors.Courtesy: Apple NVIDIA's market value surpassed Amazon's for the first time since 2002 thanks to strong demand for its AI chips. The company also released a new chatbot called Chat with RTX allowing users to run personalized generative AI models locally on PCs. Chatbots aside, AI is making waves across fields. Conservationists are using AI to identify birds by their songs to help restore an English wetland. Scientists are speeding discoveries and tackling climate change better as new multimodal and smaller language models enhance technologies. That said, the USPTO has issued new guidance stating that while AI alone can't be named as an inventor on patents, humans can use AI in the invention process as long as they make a significant creative contribution. 🔮 Expert Insights from Packt Community LLMs Under the Hood – Building Models for Your Unique Use Cases [Video] - By Maxime Labonne, Denis Rothman, Abi Aryan This video course is an invaluable resource for AI developers looking to master the art of building enterprise-grade Large Language Models (LLMs). Here's why it's a must-watch: Key Takeaways: 1. Expert-Led Guidance: Learn from industry experts Maxime Labonne, Dennis Rothman, and Abi Aryan, who bring a wealth of experience in LLM development. 2. End-to-End Coverage: Gain comprehensive insights into the entire LLM lifecycle, from architecture to deployment. 3. Advanced Skills Development: Acquire advanced skills to architect high-performing LLMs tailored to your specific business needs. 4. Hands-On Learning: Engage in practical exercises that reinforce key concepts and techniques for building, refining, and deploying LLMs. 5. Real-World Impact: Learn how to create LLMs that deliver tangible business value and solve complex organizational challenges. Course Highlights: - Making informed architecture decisions for optimal performance. - Selecting the right model types and configuring hyperparameters effectively. - Curating high-quality training data for better model outcomes. - Mastering pre-training, fine-tuning, and rigorous model evaluation techniques. - Strategies for smooth productionization, proactive monitoring, and post-deployment maintenance. By the end of this masterclass, you'll be equipped with the practical knowledge and skills needed to develop and deploy LLMs that drive real-world impact for your organization. Watch Here 🌟 Secret Knowledge: AI/LLM Resources🌀 Building a Scalable Foundation Model Platform for Your Enterprise: This post outlines how enterprises can provide different teams governed access to powerful foundation models through a centralized API layer. The solution described captures model usage and costs for each team to enable chargebacks. It also allows controlling access and throttling usage on a per-team basis. Building on serverless AWS services ensures the solution scales to meet demand. Whether you need transparent access for innovation or just want to understand how teams are leveraging AI, implementing a solution like this can help unlock the potential of generative AI for your whole organization. 🌀 Making Bridges Between AI and Business: This article discusses how businesses can develop an AI platform to integrate generative technologies like RAG and CRAG safely. It covers collecting data, querying knowledge bases, and using prompt engineering to guide AI models. The goal is to leverage AI's potential while avoiding risks through a blended strategy of retrieval and generation. This overview provides a solid foundation for aligning cutting-edge models with your organization's needs. 🌀 Evaluating Large Language Models: A Guide to Benchmark Tests: As AI language models become more advanced, it's important we have proper ways to assess their abilities. This article outlines several benchmark tests that evaluate language models on tasks like reasoning, code generation and more. Tests like WinoGrande, Hellaswag and GLUE provide insights into models' strengths and weaknesses. The benchmarks also allow for comparisons between different models. They give us a more complete picture of a model's skills. 🌀 Code Generation Gets Smarter with Context: Google's Codey APIs now enhance code completion and generation using Retrieval Augmented Generation, which retrieves relevant code from repositories to provide more accurate responses. This "RAG" technique allows LLMs to leverage external context. The blog post explores how RAG works and demonstrates its ability to inject appropriate code snippets. While not perfect, RAG is a useful tool to explore coding variations and adapt to custom styles when used with Codey on Vertex AI. Partnering with Notion Ever tried Notion? It's a workspace that helps you do things better and faster.You get AI for notes and teamwork, easy drag-and-drop for content, and cool new features to help manage projects and share knowledge.Give it a Try! 🔛 Masterclass: AI/LLM Tutorials🌀 Building a Question Answering Bot from Scratch: This tutorial shows you how to create a basic question answering bot by processing text from Wikipedia, generating embeddings with OpenAI, and storing the data in Momento Vector Index. It covers initializing clients, loading and chunking data, generating embeddings, indexing the embeddings, and searching to return answers. The bot is enhanced by using GPT-3 to provide concise responses instead of raw text. Following these steps will give you hands-on experience constructing a QA system from the ground up. 🌀 Creating SMS Apps with Next.js and AI Assistants: This article shows you how to build a texting app that uses Next.js for the frontend and backend. OpenAI's GPT-3 is utilized to generate meeting invite messages. Twilio handles sending the texts. React components collect invite details. API routes fetch GPT-3 responses and send data to Twilio. It's a clever way to enhance workflows with AI. 🌀 Harness the Power of LLMs Without GPUs: Google's new localllm tool allows developers to run large language models locally using just a CPU, eliminating the need for expensive GPUs. With localllm and Cloud Workstations, you can build AI-powered apps right in your browser-based development environment. Quantized models optimize performance on CPUs while localllm handles downloading and running the models. The post provides instructions for setting up a Cloud Workstation with localllm pre-installed to get started with this new way to develop with LLMs. 🌀 Making the Switch to Open-Source Models: Hugging Face's new Messages API allows developers to easily transition chatbots and conversational models from OpenAI to open-source options like Mixtral. The API maintains compatibility with OpenAI libraries so your code doesn't need updating. You can also deploy these models to Hugging Face's Inference Endpoints and use them with frameworks like LangChain and LlamaIndex. This unlocks greater control, lower costs and more transparency compared to closed-source options. 🚀 HackHub: Trending AI Tools🌀 arplaboratory/learning-to-fly: Train end-to-end quadrotor control policies using deep reinforcement learning on a laptop in seconds 🌀 time-series-foundation-models/lag-llama: Open-source foundation model for probabilistic time series forecasting to perform zero-shot predictions on new time series data and eventually fine-tune for their specific forecasting needs 🌀 noahfarr/rlx: Implements reinforcement learning algorithms using Apple's MLX framework, making the code optimized to run on M-series chips 🌀 uclaml/SPIN: Implement Self-Play Fine-Tuning to enhance language models through self-supervised learning 🌀 phidatahq/phidata: Open-source toolkit for building AI assistants using function calling Affiliate Disclosure: This newsletter contains affiliate links. If you buy through them, we may earn a small commission at no extra cost to you. This supports our work and helps us keep providing useful content. We only recommend products and services we think will benefit our readers. Thanks for your support!

0
0
851

article-image-zapiers-ai-features-a-game-changer-for-automation

Kelly Goss

15 Feb 2024

8 min read

Zapier's AI Features: A Game-Changer for Automation

Kelly Goss

15 Feb 2024

8 min read

0
0
967

article-image-setting-up-openai-playground

Henry Habib

13 Feb 2024

9 min read

Setting Up OpenAI Playground

Henry Habib

13 Feb 2024

9 min read

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!This article is an excerpt from the book, OpenAI API Cookbook, by Henry Habib. Integrate the ChatGPT API into various domains ranging from simple wrappers to knowledge-based assistants, multi-model, and conversational applicationsIntroductionThe OpenAI Playground is an interactive web-based interface designed to allow users to experiment with OpenAI’s language models, including ChatGPT. It’s a place where you can learn about the capabilities of these models by entering prompts and seeing the responses generated in real time. This platform acts as a sandbox where developers, researchers, and curious individuals alike can experiment, learn, and even prototype their ideas.In the Playground, you have the freedom to engage in a wide range of activities. You can test out different versions of the AI models, experimenting with various prompts to see how the model responds, and you can play around with different parameters to influence the responses generated. It provides a real-time glimpse into how these powerful AI models think, react, and create based on your input.Setting up your OpenAI Playground environmentGetting readyBefore you start, you need to create an OpenAI Platform account.Navigate to https://platform.openai.com/and sign in to your OpenAI account. If you do not have an account, you can sign up for free with an email address. Alternatively, you can log in to OpenAI with a valid Google, Microsoft, or Apple account. Follow the instructions to complete the creation of your account. You may need to verify your identity with a valid phone number.How to do it…1. After you have successfully logged in, navigate to Profile in the top right-hand menu, select Personal, and then select Usage from the left-hand side menu. Alternatively, you can navigate to https://platform.openai.com/account/usage after logging in. This page shows the usage of your API, but more importantly, it shows you how many credits you have available.2. Normally, OpenAI provides you a $5 credit with a new account, which you should be able to see under the Free Trial Usage section of the page. If you do have credits, proceed to step 4. If, however, you do not have any credits, you will need to upgrade and set up a paid account.3. You need not set up a paid account if you have received free credits. If you run out of free credits, however, here is how you can set up a paid account: select Billing from the left-hand side menu and then select Overview. Then, select the Set up paid account button. You will be prompted to enter your payment details and set a dollar threshold, which can be set to any level of spend that you are comfortable with. Note that the amount of credits required to collectively execute every single recipe contained in this book is not likely to exceed $5.4. After you have created an OpenAI Platform account, you should be able to access the Playground by selecting Playground from the top menu bar, or by navigating to https://platform. openai.com/playground.How it works…The OpenAI Playground interface is, in my experience, clean, intuitive, and designed to provide users easy access to OpenAI’s powerful language models. The Playground is an excellent place to learn how the models perform under different settings, allowing you to experiment with parameters such as temperature and max tokens, which influence the randomness and length of the outputs respectively. The changes you make are instantly reflected in the model’s responses, offering immediate feedback.As shown in Figure 1.1, the Playground consists of three sections: the System Message, the Chat Log, and the Parameters. You will learn more about these three features in the Running a completion request in the OpenAI Playground recipe.Figure 1.1 – The OpenAI PlaygroundNow, your Playground is set up and ready to be used. You can use it to run completion requests and see how varying your prompts and parameters affect the response from OpenAI.Running a completion request in the OpenAI PlaygroundIn this recipe, we will actually put the Playground in action and execute a completion request from OpenAI. Here, you will see the power of the OpenAI API and how it can be used to provide completions for virtually any prompt.Getting readyEnsure you have an OpenAI Platform account with available usage credits. If you don’t, please follow the Setting up your OpenAI API Playground environment recipe. All the recipes in this chapter will have this same requirement.How to do it…Let’s go ahead and start testing the model with the Playground. Let’s create an assistant that writes marketing slogans:1. Navigate to the OpenAI Playground.2. In the System Message, type in the following: You are an assistant that creates marketing slogans based on descriptions of companies. Here, we are clearly instructing the model of its role and context.3. In the Chat Log, populate the USER message with the following: A company that writes engaging mystery novels.4. Select the Submit button on the bottom of the page.5. You should now see a completion response from OpenAI. In my case (Figure 1.2), the response isUnlock the Thrilling Pages of Suspense with Our Captivating Mystery Novels!Figure 1.2 – The OpenAI Playground with prompt and completionNoteSince OpenAI’s LLMs are probabilistic, you will likely not see the same outputs as me. In fact, if you run this recipe multiple times, you will likely see different answers, and that is expected because it is built into the randomness of the model.How it works…OpenAI’s text generation models utilize a specific neural network architecture termed a transformer. Before delving deeper into this, let’s unpack some of these terms:Neural network architecture: At a high level, this refers to a system inspired by the human brain’s interconnected neuron structure. It’s designed to recognize patterns and can be thought of as the foundational building block for many modern AI systems.Transformer: This is a type of neural network design that has proven particularly effective for understanding sequences, making it ideal for tasks involving human language. It focuses on the relationships between words and their context within a sentence or larger text segment.In machine learning, unsupervised learning typically refers to training a model without any labeled data, letting the model figure out patterns on its own. However, OpenAI’s methodology is more nuanced. The models are initially trained on a vast corpus of text data, supervised with various tasks. This helps them predict the next word in a sentence, for instance. Subsequent refinements are made using Reinforcement Learning through Human Feedback (RLHF), where the model is further improved based on feedback from human evaluators.Through this combination of techniques and an extensive amount of data, the model starts to capture the intricacies of human language, encompassing context, tone, humor, and even sarcasm.In this case, the completion response is provided based on both the System Message and the Chat Log. The System Message serves a critical role in shaping and guiding the responses you receive from Open AI, as it dictates the model’s persona, role, tone, and context, among other attributes. In our case, the System Message contains the persona we want the model to take: You are an assistant that creates marketing slogans based on descriptions of companies.The Chat Log contains the history of messages that the model has access to before providing its response, which contains our prompt, A company that writes engaging mystery novels.Finally, the parameters contain more granular settings that you can change for the model, such as temperature. These significantly change the completion response from OpenAI. We will discuss temperature and other parameters in greater detail in Chapter 3.There’s more…It is worth noting that ChatGPT does not read and understand the meaning behind text – instead, the responses are based on statistical probabilities based on patterns it observed during training.The model does not understand the text in the same way that humans do; instead, the completions are generated based on statistical associations and patterns that have been trained in the model’s neural network from a large body of similar text. Now, you know how to run completion requests with the OpenAI Playground. You can try this feature out for your own prompts and see what completions you get. Try creative prompts such as write me a song about lightbulbs or more professional prompts such as explain Newton's first law.ConclusionIn conclusion, the OpenAI Playground offers a dynamic environment for exploring the capabilities of language models like ChatGPT. By setting up your account and navigating through its features, you can unlock endless possibilities for creativity, learning, and innovation. Experiment with prompts, adjust parameters, and observe real-time responses to gain insights into AI's potential. Whether you're a developer, researcher, or curious individual, the Playground provides a sandbox for unleashing your imagination and understanding AI's intricacies. With each completion request, you delve deeper into the world of artificial intelligence, discovering its nuances and expanding your horizons. Start your journey today and witness the power of AI in action.Author BioHenry Habib is a Manager at one of the world's top management consulting firms, advising F500 companies on analytics and operations, with a particular focus on building intelligent AI-driven solutions and tools to create impact. He is a passionate online instructor and educator, amassing a of more than 150K paid students and facilitating technical programs at large banks and governmental.A proponent in the no-code and LLM revolution, he believes that anyone can now create powerful and intelligent applications without any deep technical skills. Henry resides in Toronto, Canada with his wife, and enjoys reading AI research papers and playing tennis in his free time.

0
0
6466

Elevate Your BI Dashboards with Figma

Transforming Web Data with Browse AI

ChatGPT for Data Governance

AI_Distilled #39: Unpacking Mistral Large, Google's Gemini Challenges, and Copilot Enterprise

Get Started with Fabric: Create Your Workspace & Reports

Enhancing Image Search with Vector Similarity

Using ChatGPT for Customer Service

Streamlining Insights with Microsoft Copilot

AI_Distilled 38: Latest in AI: Sora, Gemini 1.5, and More

Getting Started with Microsoft Guidance

Trending Topics

Setting Up Polars for Data Analysis

Leveraging Google Cloud for Custom Endpoint with OpenAI

AI_Distilled 37: Cutting-Edge Updates and Expert Guidance

Zapier's AI Features: A Game-Changer for Automation

Setting Up OpenAI Playground