
How-To Tutorials


Building an API for Language Model Inference using Rust and Hyper - Part 2

Alan Bernardo Palacio
31 Aug 2023
10 min read
Introduction

In our previous exploration, we delved into the world of Large Language Models (LLMs) in Rust. Through the lens of the llm crate and the transformative potential of LLMs, we painted a picture of the current state of AI integrations within the Rust ecosystem. But knowledge is only as valuable as its application, so we now move from understanding the 'how' of LLMs to applying that knowledge in real-world scenarios.

Welcome to the second part of our Rust LLM series. In this article, we roll up our sleeves to architect and deploy an inference server using Rust. Leveraging the fast and efficient Hyper HTTP library, our server will not just respond to incoming requests but will infer and communicate like a human. We'll guide you through the step-by-step process of setting up, routing, and serving inferences from the server, all while staying anchored to the foundational insights from our last discussion.

For developers eager to see Rust, Hyper, and LLMs working together, this guide promises to be a rewarding endeavor. By the end, you'll be equipped to set up a server that can converse intelligently, understand prompts, and provide insightful responses. So, as we progress from the intricacies of the llm crate to building a real-world application, join us in taking a step toward making AI-powered interactions an everyday reality.

Imports and Data Structures

Let's start by looking at the import statements and data structures used in the code:

```rust
use hyper::service::{make_service_fn, service_fn};
use hyper::{Body, Request, Response, Server};
use std::net::SocketAddr;
use serde::{Deserialize, Serialize};
use std::{convert::Infallible, io::Write, path::PathBuf};
```

●    hyper: Hyper is a fast and efficient HTTP library for Rust.
●    SocketAddr: Used to specify the socket address (IP and port) for the server.
●    serde: Serde is a powerful serialization/deserialization framework in Rust.
●    Deserialize, Serialize: Serde traits for automatic serialization and deserialization.

Next, we have the data structures used for deserializing JSON request data and serializing response data:

```rust
#[derive(Debug, Deserialize)]
struct ChatRequest {
    prompt: String,
}

#[derive(Debug, Serialize)]
struct ChatResponse {
    response: String,
}
```

1. ChatRequest: A struct representing the incoming JSON request, containing a prompt field.
2. ChatResponse: A struct representing the JSON response, containing a response field.

Inference Function

The infer function is responsible for performing language model inference:

```rust
fn infer(prompt: String) -> String {
    let tokenizer_source = llm::TokenizerSource::Embedded;
    let model_architecture = llm::ModelArchitecture::Llama;
    let model_path = PathBuf::from("/path/to/model");
    let prompt = prompt.to_string();
    let now = std::time::Instant::now();

    let model = llm::load_dynamic(
        Some(model_architecture),
        &model_path,
        tokenizer_source,
        Default::default(),
        llm::load_progress_callback_stdout,
    )
    .unwrap_or_else(|err| {
        panic!("Failed to load {} model from {:?}: {}", model_architecture, model_path, err);
    });

    println!(
        "Model fully loaded! Elapsed: {}ms",
        now.elapsed().as_millis()
    );

    let mut session = model.start_session(Default::default());
    let mut generated_tokens = String::new(); // Accumulate generated tokens here

    let res = session.infer::<Infallible>(
        model.as_ref(),
        &mut rand::thread_rng(),
        &llm::InferenceRequest {
            prompt: (&prompt).into(),
            parameters: &llm::InferenceParameters::default(),
            play_back_previous_tokens: false,
            maximum_token_count: Some(140),
        },
        // OutputRequest
        &mut Default::default(),
        |r| match r {
            llm::InferenceResponse::PromptToken(t) | llm::InferenceResponse::InferredToken(t) => {
                print!("{t}");
                std::io::stdout().flush().unwrap();
                // Accumulate generated tokens
                generated_tokens.push_str(&t);
                Ok(llm::InferenceFeedback::Continue)
            }
            _ => Ok(llm::InferenceFeedback::Continue),
        },
    );

    // Return the accumulated generated tokens
    match res {
        Ok(_) => generated_tokens,
        Err(err) => format!("Error: {}", err),
    }
}
```

●    The infer function takes a prompt as input and returns a string containing the generated tokens.
●    It loads a language model, sets up an inference session, and accumulates generated tokens.
●    The res variable holds the result of the inference, and a closure handles each inference response.
●    The function returns the accumulated generated tokens or an error message.

Request Handler

The chat_handler function handles incoming HTTP requests:

```rust
async fn chat_handler(req: Request<Body>) -> Result<Response<Body>, Infallible> {
    let body_bytes = hyper::body::to_bytes(req.into_body()).await.unwrap();
    let chat_request: ChatRequest = serde_json::from_slice(&body_bytes).unwrap();

    // Call the `infer` function with the received prompt
    let inference_result = infer(chat_request.prompt);

    // Prepare the response message
    let response_message = format!("Inference result: {}", inference_result);
    let chat_response = ChatResponse {
        response: response_message,
    };

    // Serialize the response and send it back
    let response = Response::new(Body::from(serde_json::to_string(&chat_response).unwrap()));
    Ok(response)
}
```

●    chat_handler asynchronously handles incoming requests by deserializing the JSON payload.
●    It calls the infer function with the received prompt and constructs a response message.
●    The response is serialized as JSON and sent back in the HTTP response.

Router and Not Found Handler

The router function maps incoming requests to the appropriate handlers:

```rust
async fn router(req: Request<Body>) -> Result<Response<Body>, Infallible> {
    match (req.uri().path(), req.method()) {
        ("/api/chat", &hyper::Method::POST) => chat_handler(req).await,
        _ => not_found(),
    }
}
```

●    router matches incoming requests based on the path and HTTP method.
●    If the path is "/api/chat" and the method is POST, it calls chat_handler.
●    If no match is found, it calls the not_found function (not shown here), which simply returns a "Not Found" response.

Main Function

The main function initializes the server and starts listening for incoming connections:

```rust
#[tokio::main]
async fn main() {
    println!("Server listening on port 8083...");
    let addr = SocketAddr::from(([0, 0, 0, 0], 8083));

    let make_svc = make_service_fn(|_conn| {
        async { Ok::<_, Infallible>(service_fn(router)) }
    });

    let server = Server::bind(&addr).serve(make_svc);

    if let Err(e) = server.await {
        eprintln!("server error: {}", e);
    }
}
```

Building and Running the Server

In this section, we'll walk through the steps to build and run the server that performs language model inference using Rust and the Hyper framework. We'll also demonstrate how to make a POST request to the server using Postman.

1. Install Rust: If you haven't already, install Rust on your machine. You can download it from the official website: https://www.rust-lang.org/tools/install

2. Create a New Rust Project: Create a new directory for your project, navigate to it in the terminal, and run the following command to create a new Rust project:

```
cargo new language_model_server
```

This command creates a new directory named language_model_server containing the basic structure of a Rust project.

3. Add Dependencies: Open the Cargo.toml file in the language_model_server directory and add the required dependencies for Hyper and the other libraries. Your Cargo.toml file should look something like this:

```toml
[package]
name = "llm_handler"
version = "0.1.0"
edition = "2018"

[dependencies]
hyper = { version = "0.13" }
tokio = { version = "0.2", features = ["macros", "rt-threaded"] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
llm = { git = "https://github.com/rustformers/llm.git" }
rand = "0.8.5"
```

Make sure to adjust the version numbers according to the latest versions available.

4. Replace Code: Replace the content of the src/main.rs file in your project directory with the code from the earlier sections.

5. Build the Server: In the terminal, navigate to your project directory and run the following command to build the server:

```
cargo build --release
```

This compiles your code and produces an executable binary in the target/release directory.

Running the Server

1. Running the Server: After building the server, you can run it using the following command:

```
cargo run --release
```

Your server will start listening on port 8083.

2. Accessing the Server: Open a web browser and navigate to http://localhost:8083. You should see the message "Not Found", indicating that the server is up and running.

Making a POST Request Using Postman

1. Install Postman: If you don't have Postman installed, you can download it from the official website: https://www.postman.com/downloads/

2. Create a POST Request:
   ●    Open Postman and create a new request.
   ●    Set the request type to "POST".
   ●    Enter the URL: http://localhost:8083/api/chat
   ●    In the "Body" tab, select "raw" and set the content type to "JSON (application/json)".
   ●    Enter the following JSON request body:

```json
{ "prompt": "Rust is an amazing programming language because" }
```

3. Send the Request: Click the "Send" button to make the POST request to your server.

4. View the Response: You should receive a response from the server containing the inference result generated by the language model. If you prefer scripting the request instead of using Postman, see the short snippet below.
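As a scripted alternative to Postman, the following minimal sketch sends the same request with Python's requests library. It assumes the server from this article is running locally on port 8083 and exposes the /api/chat route described above.

```python
# Minimal sketch: POST a prompt to the inference server described above.
# Assumes the server is running locally on port 8083 and exposes /api/chat.
import requests

payload = {"prompt": "Rust is an amazing programming language because"}
resp = requests.post("http://localhost:8083/api/chat", json=payload, timeout=600)
resp.raise_for_status()

# The handler returns a JSON object with a single "response" field.
print(resp.json()["response"])
```

The long timeout is deliberate: CPU inference on a multi-billion-parameter model can take a while per request.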
Conclusion

In the previous article, we introduced the foundational concepts, setting the stage for the hands-on application we delved into this time. In this article, our main goal was to bridge theory with practice. Using the llm crate alongside the Hyper library, we set out to create a server capable of understanding and executing language model inference. But our work was about more than just setting up a server; it illustrated the synergy between Rust, a language famed for its safety and concurrency features, and the vast world of AI.

What's especially encouraging is how this project can serve as a springboard for many more innovations. With the foundation laid out, there are numerous avenues to explore, from refining the server's performance to integrating more advanced features or scaling it for larger audiences.

If there's one key takeaway from our journey, it's the importance of continuous learning and experimentation. The tech landscape is ever-evolving, and the confluence of AI and programming offers fertile ground for innovation. As we conclude this series, our hope is that the knowledge shared acts as both a source of inspiration and a practical guide. Whether you're a seasoned developer or a curious enthusiast, the tools and techniques we've discussed can pave the way for your own unique creations. So, as you move forward, keep experimenting, iterating, and pushing the boundaries of what's possible. Here's to many more coding adventures ahead!

Author Bio

Alan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young and Globant, and now holds a data engineer position at Ebiquity Media, helping the company create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later earned a Master's degree from the Faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.

LinkedIn


Building an API for Language Model Inference using Rust and Hyper - Part 1

Alan Bernardo Palacio
31 Aug 2023
7 min read
Introduction

In the landscape of artificial intelligence, the capacity to bring sophisticated Large Language Models (LLMs) to commonplace applications has always been a sought-after goal. Enter llm, a Rust library crafted by Rustformers and designed to make this a tangible reality. Built on the foundational GGML project, this toolset enables AI enthusiasts to harness the power of LLMs on conventional CPUs. That shift owes much to GGML's approach to model quantization, which streamlines computational requirements without sacrificing too much performance.

In this guide, we'll start with understanding the essence of the llm crate and its interaction with a variety of LLMs. We'll look at how to integrate, interact, and infer using these models. The journey won't conclude here: in the subsequent installment, we'll build a web server in Rust that runs inference directly on a CPU, making the capabilities of AI not just accessible but an integral part of everyday digital experiences.

This is a two-part article. In the first part we discuss basic interaction with the library, and in the second we build a server in Rust that lets us create web applications using state-of-the-art LLMs. Let's begin.

Harnessing the Power of Large Language Models

At the core of llm's architecture resides the GGML project, a tensor library written in C. GGML serves as the bedrock of llm, enabling the orchestration of large language models. Its essence lies in a potent technique known as model quantization.

Model quantization, a pivotal process employed by GGML, involves reducing the numerical precision within a machine-learning model. This means transforming the conventional 32-bit floating-point numbers frequently used for calculations into more compact representations such as 16-bit or even 8-bit integers.

Quantization can be thought of as chiseling away unnecessary complexity while sculpting a model: it streamlines resource utilization without inordinate compromises on performance. By default, models rely on 32-bit floating-point numbers for their arithmetic operations. With quantization, this precision is distilled into more frugal formats, such as 16-bit or 8-bit integers. It's an equilibrium between computational efficiency and model quality.

GGML's versatility shows through a spectrum of quantization strategies, spanning 4-, 5-, and 8-bit quantization. Each strategy trades off efficiency and accuracy in a different way. For instance, 4-bit quantization excels in memory and computational frugality, although it can induce a performance decrease compared to the broader 8-bit quantization. The toy example below illustrates the basic idea.
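To make the idea of quantization concrete, here is a small illustration in Python using NumPy. It uses a simplified symmetric 8-bit scheme invented purely for demonstration; it is not GGML's actual quantization format, and the weight values are made up.

```python
import numpy as np

# A tiny tensor of 32-bit weights (illustrative values only).
weights = np.array([0.03, -1.25, 0.87, 2.41, -0.56], dtype=np.float32)

# Symmetric 8-bit quantization: map the range [-max, max] onto [-127, 127].
scale = np.abs(weights).max() / 127.0
q = np.round(weights / scale).astype(np.int8)       # stored as 1 byte per value
dequantized = q.astype(np.float32) * scale          # approximate reconstruction

print("int8 weights: ", q)
print("reconstructed:", dequantized)
print("max abs error:", np.abs(weights - dequantized).max())
```

The storage drops from 4 bytes to 1 byte per weight, at the cost of a small reconstruction error; production schemes such as GGML's are more elaborate (per-block scales, 4- and 5-bit packing), but the trade-off is the same in spirit.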
The Rustformers library supports integration of different language models, including Bloom, GPT-2, GPT-J, GPT-NeoX, Llama, and MPT. To use these models within the Rustformers library, they undergo a transformation to align with GGML's format, and the authors have provided pre-converted models on the Hugging Face platform, facilitating seamless integration. In the next sections, we will use the llm crate to run inference on LLM models like Llama.

Getting Started with LLM-CLI

The Rustformers group has the mission of amplifying access to large language models at the forefront of AI evolution. The group focuses on harmonizing with the rapidly advancing GGML ecosystem, a C library used for quantization that enables the execution of LLMs on CPUs. The roadmap extends to supporting diverse backends, embracing GPUs, Wasm environments, and more.

For Rust developers venturing into the realm of LLMs, the key to unlocking this potential is the llm crate. Through this crate, Rust developers interface with LLMs effortlessly. The llm project also offers a streamlined CLI for interacting with LLMs and examples showcasing its integration into Rust projects. More insights can be gained from the GitHub repository or its official documentation for released versions.

To embark on your LLM journey, start by installing the llm-cli package. This package brings the model to your console, allowing for direct inference.

Getting started is a streamlined process:

1. Clone the repository and install the llm-cli tool from it.
2. Download your chosen models from Hugging Face. In our illustration, we employ an open Llama model.
3. Run inference on the model using the CLI tool, referencing the model file and the architecture of the model downloaded previously.

So let's start. First, install llm-cli using this command:

```
cargo install llm-cli --git https://github.com/rustformers/llm
```

Next, fetch your desired model from Hugging Face:

```
curl -LO https://huggingface.co/rustformers/open-llama-ggml/resolve/main/open_llama_3b-f16.bin
```

Finally, initiate a dialogue with the model using a command akin to:

```
llm infer -a llama -m open_llama_3b-f16.bin -p "Rust is a cool programming language because"
```

We can see how the llm crate facilitates seamless interactions with LLMs. The project empowers developers with streamlined CLI tools and exemplifies LLM integration into Rust projects. With installation and model preparation explained, the journey toward LLM proficiency can begin.

Conclusion: The Dawn of Accessible AI with Rust and LLM

In this exploration, we've delved into the Rust library llm and its potential to bring Large Language Models to the masses. No longer is the prowess of advanced AI models locked behind high-end GPU architectures. Through the relationship between the llm library and the underlying GGML tensor library, we can run language models on standard CPUs. This is made possible largely by the technique of model quantization, which GGML incorporates. By balancing computational efficiency against accuracy, models can now run in environments that were previously deemed infeasible.

The Rustformers' dedication to the cause shines through their comprehensive toolset. Their offerings extend from pre-converted models on Hugging Face, ensuring ease of integration, to a CLI tool that simplifies interaction with these models. For Rust developers, the horizon of AI integration has never seemed clearer or more accessible.

As we wrap up this segment, it's evident that the paradigm of AI integration is rapidly shifting. With tools like the llm crate, developers are equipped with everything they need to harness the full might of LLMs in their Rust projects. But the journey doesn't stop here. In the next part of this series, we venture beyond the basics and into the realm of practical application, constructing a web server in Rust that leverages the llm crate.

Author Bio

Alan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young and Globant, and now holds a data engineer position at Ebiquity Media, helping the company create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later earned a Master's degree from the Faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.

LinkedIn


Spark and LangChain for Data Analysis

Alan Bernardo Palacio
31 Aug 2023
12 min read
Introduction

In today's data-driven world, the demand for extracting insights from large datasets has led to the development of powerful tools and libraries. Apache Spark, a fast and general-purpose cluster computing system, has revolutionized big data processing. Coupled with LangChain, a library built atop modern language models, you can combine the analytical capabilities of Spark with the natural language interaction facilitated by LangChain. This article introduces Spark, explores the features of LangChain, and provides practical examples of using Spark with LangChain for data analysis.

Understanding Apache Spark

The processing and analysis of large datasets have become crucial for organizations and individuals alike. Apache Spark has emerged as a powerful framework that changes the way we handle big data. Spark is designed for speed, ease of use, and sophisticated analytics. It provides a unified platform for various data processing tasks, such as batch processing, interactive querying, machine learning, and real-time stream processing.

At its core, Apache Spark is an open-source, distributed computing system that excels at processing and analyzing large datasets in parallel. Unlike traditional MapReduce systems, Spark introduces the concept of Resilient Distributed Datasets (RDDs), which are immutable distributed collections of data. RDDs can be transformed and operated upon using a wide range of high-level APIs provided by Spark, making it possible to perform complex data manipulations with ease.

Key Components of Spark

Spark consists of several components that contribute to its versatility and efficiency:

●    Spark Core: The foundation of Spark, responsible for tasks such as task scheduling, memory management, and fault recovery. It also provides APIs for creating and manipulating RDDs.
●    Spark SQL: A module that allows Spark to work seamlessly with structured data using SQL-like queries. It enables users to interact with structured data through the familiar SQL language.
●    Spark Streaming: Enables real-time stream processing, making it possible to process and analyze data in near real time as it arrives in the system.
●    MLlib (Machine Learning Library): A scalable machine learning library built on top of Spark, offering a wide range of machine learning algorithms and tools.
●    GraphX: A graph processing library that provides abstractions for efficiently manipulating graph-structured data.
●    Spark DataFrame: A higher-level abstraction on top of RDDs, providing a structured and more optimized way to work with data. DataFrames offer optimization opportunities, enabling Spark's Catalyst optimizer to perform query optimization and code generation.

Spark's distributed computing architecture enables it to achieve high performance and scalability. It employs a master/worker architecture in which a central driver program coordinates tasks across multiple worker nodes. Data is distributed across these nodes, and tasks are executed in parallel on the distributed data.

We will be looking at two different ways of interacting with Spark: Spark SQL and the Spark DataFrame API. Apache Spark is a distributed computing framework, and Spark SQL is one of its modules for structured data processing. A Spark DataFrame is a distributed collection of data organized into named columns, offering a programming abstraction similar to data frames in R or Python but optimized for distributed processing. It provides a functional API with operations like select(), filter(), and groupBy(). Spark SQL, on the other hand, lets users run unmodified SQL queries on Spark data, integrating seamlessly with DataFrames and offering a bridge to BI tools through JDBC/ODBC.

Both the DataFrame API and Spark SQL leverage the Catalyst optimizer for efficient query execution. DataFrames are preferred for programmatic, functional-style code, while Spark SQL is ideal for ad-hoc querying and users already familiar with SQL. The choice between them often hinges on the specific use case and the user's familiarity with either SQL or functional programming; the short example below shows the same query expressed both ways.
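As a quick, hypothetical illustration of that choice, the snippet below computes the same aggregation with the DataFrame API and with Spark SQL. The people.csv file and its columns (name, age, city) are made up for the example and are not part of the tutorial's dataset.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical dataset with columns: name, age, city.
df = spark.read.csv("people.csv", header=True, inferSchema=True)

# DataFrame API: functional-style filter/groupBy/agg.
adults_by_city = (
    df.filter(F.col("age") >= 18)
      .groupBy("city")
      .agg(F.avg("age").alias("avg_age"))
)
adults_by_city.show()

# Spark SQL: the same query expressed as plain SQL over a temporary view.
df.createOrReplaceTempView("people")
spark.sql("""
    SELECT city, AVG(age) AS avg_age
    FROM people
    WHERE age >= 18
    GROUP BY city
""").show()
```

Both paths compile to the same optimized plan through Catalyst, so the choice is largely one of ergonomics.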
In the next sections, we will explore how LangChain complements Spark's capabilities by introducing natural language interactions through agents.

Introducing the Spark Agent to LangChain

LangChain, a library built upon modern Large Language Model (LLM) technologies, is a pivotal addition to the world of data analysis. It bridges the gap between the power of Spark and the ease of human language interaction.

LangChain harnesses the capabilities of advanced LLMs like ChatGPT and Hugging Face-hosted models. These language models have proven their prowess in understanding and generating human-like text. LangChain capitalizes on this potential to enable users to interact with data and code through natural language queries.

Empowering Data Analysis

The introduction of the Spark agent to LangChain brings a transformative shift in data analysis workflows. Users can tap into the analytical capabilities of Spark using everyday language. This opens doors for professionals from various domains to explore datasets, uncover insights, and derive value without deep technical expertise.

LangChain acts as a bridge, connecting the technical realm of data processing with the non-technical world of language understanding. It empowers individuals who may not be well versed in coding or data manipulation to engage with data-driven tasks effectively. This accessibility democratizes data analysis and makes it inclusive for a broader audience.

The integration of LangChain with Spark involves an orchestration of components that work together to bring human-language interaction to data analysis. At the heart of this integration lies the collaboration between ChatGPT, a sophisticated language model, and PythonREPL, a Python read-evaluate-print loop. The workflow is as follows:

1. ChatGPT receives a user query in natural language and generates a Python command as a solution.
2. The generated Python command is sent to PythonREPL for execution.
3. PythonREPL executes the command and produces a result.
4. ChatGPT takes the result from PythonREPL and translates it into a final answer in natural language.

This collaborative process can repeat multiple times, allowing users to engage in iterative conversations and deep dives into data analysis.

Several key points ensure a seamless interaction between the language model and the code execution environment:

●    Initial prompt setup: The initial prompt given to ChatGPT defines its behavior and available tooling. This prompt guides ChatGPT on the desired actions and toolkits to employ.
●    Connection between ChatGPT and PythonREPL: Predefined prompts establish the format of the answer, and regular expressions (regex) are used to extract the specific command to execute from ChatGPT's response. This establishes a clear flow of communication between ChatGPT and PythonREPL.
●    Memory and conversation history: ChatGPT does not possess a memory of past interactions. As a result, maintaining the conversation history locally and passing it with each new question is essential for keeping context and coherence in the interaction.

In the upcoming sections, we'll explore practical use cases that illustrate how this integration manifests in the real world, including interactions with Spark SQL and Spark DataFrames.

The Spark SQL Agent

In this section, we will walk through how to interact with Spark SQL using natural language, unleashing the power of Spark for querying structured data. Let's walk through a few hands-on examples to illustrate the capabilities of the integration.

Exploring data with the Spark SQL agent:
●    Querying the dataset to understand its structure and metadata.
●    Calculating statistical metrics like average age and fare.
●    Extracting specific information, such as the name of the oldest survivor.

Analyzing DataFrames with the Spark DataFrame agent:
●    Counting rows to understand the dataset size.
●    Analyzing the distribution of passengers with siblings.
●    Computing descriptive statistics like the square root of the average age.

By interacting with the agents and experimenting with natural language queries, you'll witness firsthand the fusion of advanced data processing with user-friendly language interactions. These examples demonstrate how Spark and LangChain can amplify your data analysis efforts, making insights more accessible and actionable.

Before diving into Spark SQL interactions, let's set up the necessary environment. We'll use LangChain's SparkSQLToolkit to bridge between Spark and natural language interactions. First, make sure you have your OpenAI API key ready; you'll need it to integrate the language model.

```python
from langchain.agents import create_spark_sql_agent
from langchain.agents.agent_toolkits import SparkSQLToolkit
from langchain.chat_models import ChatOpenAI
from langchain.utilities.spark_sql import SparkSQL
import os

# Set up environment variables for API keys
os.environ['OPENAI_API_KEY'] = 'your-key'
```

Now, let's get hands-on with Spark SQL. We'll work with a Titanic dataset, but you can replace it with your own data. First, create a Spark session, define a schema for the database, and load your data into a Spark DataFrame. We'll then create a table in Spark SQL to enable querying.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

schema = "langchain_example"
spark.sql(f"CREATE DATABASE IF NOT EXISTS {schema}")
spark.sql(f"USE {schema}")

csv_file_path = "titanic.csv"
table = "titanic"
spark.read.csv(csv_file_path, header=True, inferSchema=True).write.saveAsTable(table)
spark.table(table).show()
```

Now, let's initialize the Spark SQL agent. This agent acts as your interactive companion, enabling you to query Spark SQL tables using natural language. We'll create a toolkit that connects LangChain, the SparkSQL instance, and the chosen language model (in this case, ChatOpenAI).

```python
from langchain.agents import AgentType

spark_sql = SparkSQL(schema=schema)
llm = ChatOpenAI(temperature=0, model="gpt-4-0613")
toolkit = SparkSQLToolkit(db=spark_sql, llm=llm, handle_parsing_errors="Check your output and make sure it conforms!")

agent_executor = create_spark_sql_agent(
    llm=llm,
    toolkit=toolkit,
    agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_parsing_errors=True)
```

Now comes the exciting part: querying Spark SQL tables using natural language. With your Spark SQL agent ready, you can ask questions about your data and receive insightful answers. Let's try a few examples:

```python
# Describe the Titanic table
agent_executor.run("Describe the titanic table")

# Calculate the square root of the average age
agent_executor.run("whats the square root of the average age?")

# Find the name of the oldest survived passenger
agent_executor.run("What's the name of the oldest survived passenger?")
```

With these simple commands, you've tapped into the power of Spark SQL using natural language. The Spark SQL agent makes data exploration and querying more intuitive and accessible than ever before.

The Spark DataFrame Agent

In this section, we'll dive into another facet of LangChain's integration with Spark: the Spark DataFrame agent. This agent leverages the power of Spark DataFrames and natural language interactions to provide an engaging and insightful way to analyze data.

Before we begin, make sure you have a Spark session set up and your data loaded into a DataFrame. For this example, we'll use the Titanic dataset. Replace csv_file_path with the path to your own data if needed.

```python
from langchain.llms import OpenAI
from pyspark.sql import SparkSession
from langchain.agents import create_spark_dataframe_agent

spark = SparkSession.builder.getOrCreate()
csv_file_path = "titanic.csv"
df = spark.read.csv(csv_file_path, header=True, inferSchema=True)
df.show()
```

Initializing the Spark DataFrame Agent

Now, let's unleash the power of the Spark DataFrame agent. This agent allows you to interact with Spark DataFrames using natural language queries. We initialize it by specifying the language model and the DataFrame we want to work with.

```python
# Initialize the Spark DataFrame agent
agent = create_spark_dataframe_agent(llm=OpenAI(temperature=0), df=df, verbose=True)
```

With the agent ready, you can explore your data using natural language queries. Let's dive into a few examples:

```python
# Count the number of rows in the DataFrame
agent.run("how many rows are there?")

# Find the number of people with more than 3 siblings
agent.run("how many people have more than 3 siblings")

# Calculate the square root of the average age
agent.run("whats the square root of the average age?")
```

Remember that the Spark DataFrame agent uses generated Python code under the hood to interact with Spark. While it's a powerful tool for interactive analysis, make sure the generated code is safe to execute, especially in a sensitive environment.

In this final section, we've tied everything together and shown how Spark and LangChain work in harmony to unlock insights from data. We've covered the Spark SQL agent and the Spark DataFrame agent, putting theory into practice.

The combination of Spark and LangChain transcends the traditional boundaries of technical expertise, enabling data enthusiasts of all backgrounds to engage with data-driven tasks effectively. Through the Spark SQL agent and Spark DataFrame agent, LangChain empowers users to interact with, explore, and analyze data using the simplicity and familiarity of natural language. So why wait? Dive in and unlock the full potential of your data analysis journey with the synergy of Spark and LangChain.

Conclusion

In this article, we've delved into the world of Apache Spark and LangChain, two technologies that combine to transform how we interact with and analyze data. By bridging the gap between technical data processing and human language understanding, Spark and LangChain enable users to derive meaningful insights from complex datasets through simple, natural language queries. The Spark SQL agent and Spark DataFrame agent presented here demonstrate the potential of this integration, making data analysis more accessible to a wider audience. As both technologies continue to evolve, we can expect even more powerful capabilities for unlocking the true potential of data-driven decision-making. So whether you're a data scientist, an analyst, or a curious learner, harnessing the power of Spark and LangChain opens up a world of possibilities for exploring and understanding data in an intuitive and efficient manner.

Author Bio

Alan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young and Globant, and now holds a data engineer position at Ebiquity Media, helping the company create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later earned a Master's degree from the Faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.

LinkedIn


Unleashing the Power of Wolfram Alpha API with Python and ChatGPT

Alan Bernardo Palacio
31 Aug 2023
6 min read
Introduction

In the ever-evolving landscape of artificial intelligence, a noteworthy collaboration has emerged between Wolfram Alpha and ChatGPT in the form of a plugin. This partnership bridges the gap between ChatGPT's proficiency in natural language processing and Wolfram Alpha's computational prowess, unlocking an array of new possibilities for how we interact with AI. In this hands-on tutorial, we explore the power of the Wolfram Alpha API, demonstrate its integration with Python and ChatGPT, and show how to tap into this combination for tasks ranging from complex calculations to real-time data retrieval.

Understanding the Wolfram Alpha API

Imagine having an intelligent assistant at your fingertips, capable not only of understanding your questions but also of providing detailed computational insights. That's where Wolfram Alpha shines. It's more than a search engine; it's a computational knowledge engine. Whether you need to solve a math problem, retrieve real-time data, or generate visual content, Wolfram Alpha has you covered. Its ability to compute answers from structured data sets it apart from traditional search engines.

So, how can you tap into this trove of computational knowledge? Enter the Wolfram Alpha API. This API exposes Wolfram Alpha's capabilities for developers to harness in their applications. Whether you're building a chatbot, a data analysis tool, or an educational resource, the Wolfram Alpha API provides instant access to accurate and in-depth information. The API supports a wide range of queries, from straightforward calculations to complex data retrievals, making it a versatile tool for various use cases; a small example of calling it directly from Python follows below.
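Before wiring Wolfram Alpha into ChatGPT, you can also query the API directly. The sketch below uses the wolframalpha Python package (which is also installed as part of the setup later in this tutorial) and assumes you have created an App ID in the Wolfram developer portal; this direct-client usage is an illustrative aside, not code from the original walkthrough.

```python
# Minimal sketch of querying the Wolfram Alpha API directly with the
# `wolframalpha` package (pip install wolframalpha). Assumes the
# WOLFRAM_ALPHA_APPID environment variable holds a valid App ID.
import os
import wolframalpha

client = wolframalpha.Client(os.environ["WOLFRAM_ALPHA_APPID"])
res = client.query("integrate x^2 from 0 to 5")

# Print the plain-text answer from the first result pod.
print(next(res.results).text)
```

This is essentially what the LangChain wolfram-alpha tool does on your behalf once the agent decides a question needs computation.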
Integrating the Wolfram Alpha API with ChatGPT

ChatGPT's strength lies in its ability to understand and generate human-like text based on input. However, when it comes to intricate calculations or pulling real-time data, it benefits from a partner like Wolfram Alpha. By integrating the two, you create a dynamic synergy in which ChatGPT can tap into Wolfram Alpha's computational engine to provide accurate and data-driven responses. This collaboration bridges the gap between language understanding and computation, resulting in a well-rounded AI interaction.

Before we dive into the technical implementation, let's get set up to take advantage of the Wolfram Alpha plugin for ChatGPT. First, ensure you have access to ChatGPT+. To enable the Wolfram plugin, follow these steps:

1. Open the ChatGPT interface.
2. Navigate to "Settings".
3. Look for the "Beta Features" section.
4. Enable "Plugins" under the GPT-4 options.
5. Once "Plugins" is enabled, locate and activate the Wolfram plugin.

With the plugin enabled, you're ready to harness the combined capabilities of ChatGPT and the Wolfram Alpha API, making your AI interactions more robust and informative. In the next sections, we'll dive into practical applications and walk through implementing the integration using Python and ChatGPT.

Practical Applications with Code Examples

Let's start by exploring how the Wolfram Alpha API can assist with mathematical tasks. The code examples below demonstrate the integration between ChatGPT and Wolfram Alpha to solve math problems. In these scenarios, ChatGPT serves as the bridge between you and Wolfram Alpha, delivering accurate solutions.

Before diving into the code, make sure your environment is ready by installing the required Python packages:

```
pip install langchain openai wolframalpha
```

The following snippets use the agent_chain object, which is initialized in the complete setup block at the end of this section. Wolfram Alpha can solve simple algebraic queries:

```python
# User input
question = "Solve for x: 2x + 5 = 15"

# Let ChatGPT interact with Wolfram Alpha
response = agent_chain.run(input=question)

# Display the result
print("Solution:", response)
```

Or more complex ones, like calculating integrals:

```python
# User input
question = "Calculate the integral of x^2 from 0 to 5"

# Let ChatGPT interact with Wolfram Alpha
response = agent_chain.run(input=question)

# Display the result
print("Integral:", response)
```

Real-Time Data Retrieval

Incorporating real-time data into conversations can greatly enhance the value of AI interactions. The examples below show how to retrieve up-to-date information using the Serper API and integrate it seamlessly into the conversation:

```python
# User input
question = "What's the current exchange rate between USD and EUR?"

# Let the agent answer using its tools
response = agent_chain.run(input=question)

# Display the result
print("Exchange Rate:", response)
```

We can also ask for the current weather forecast:

```python
# User input
question = "What's the weather forecast for London tomorrow?"

# Let the agent answer using its tools
response = agent_chain.run(input=question)

# Display the result
print("Weather Forecast:", response)
```

Now we can put everything together into a single block, including all the required library imports, and use both real-time data via Serper and the reasoning skills of Wolfram Alpha:

```python
# Import required libraries
from langchain.agents import load_tools, initialize_agent
from langchain.llms import OpenAI
from langchain.memory import ConversationBufferMemory
from langchain.chat_models import ChatOpenAI

# Set environment variables
import os
os.environ['OPENAI_API_KEY'] = 'your-key'
os.environ['WOLFRAM_ALPHA_APPID'] = 'your-key'
os.environ["SERPER_API_KEY"] = 'your-key'

# Initialize the ChatGPT model
llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo")

# Load tools and set up memory
tools = load_tools(["google-serper", "wolfram-alpha"], llm=llm)
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

# Initialize the agent
agent_chain = initialize_agent(tools, llm, handle_parsing_errors=True, verbose=True, memory=memory)

# Interact with the agent
response_weather = agent_chain.run(input="what is the weather in Amsterdam right now in celcius? Don't make assumptions.")
response_flight = agent_chain.run(input="What's a good price for a flight from JFK to AMS this weekend? Express the price in Euros. Don't make assumptions.")
```

Conclusion

In this tutorial, we've delved into integrating the Wolfram Alpha API with Python and ChatGPT. We've explored how this collaboration empowers you to tackle complex mathematical tasks and retrieve real-time data seamlessly. By harnessing the capabilities of both Wolfram Alpha and ChatGPT, you've unlocked a powerful synergy capable of transforming your AI interactions. As you continue to explore and experiment with this integration, you'll discover new ways to enhance your interactions and leverage the strengths of each tool. So, why wait? Start your journey toward more informative and engaging AI interactions today.

Author Bio

Alan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young and Globant, and now holds a data engineer position at Ebiquity Media, helping the company create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later earned a Master's degree from the Faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.

LinkedIn


Transformer Building Blocks

Saeed Dehqan
29 Aug 2023
22 min read
Introduction

Transformers employ potent techniques to preprocess tokens before sequentially feeding them into a neural network that selects the next token. At the transformer's apex sits a basic neural network, the transformer head. The text generator model processes input tokens and generates a probability distribution for the subsequent token. The number of input tokens, termed context length or block size, is a hyperparameter. The model's primary aim is to predict the next token based on the input tokens (referred to as context tokens or the context window): given n tokens, we want to predict the token that best follows them.

As humans, we try to grasp a conversation's context: where we are and a loose foresight of where it is heading. Having gathered the pertinent insights, relevant words come to mind while irrelevant ones fade, enabling us to choose the next word with precision. We occasionally err but can backtrack, a luxury transformers lack. If they predict incorrectly (an irrelevant token), they keep going, with a few exceptions such as beam search. Unlike us, transformers cannot look ahead and revise.

Revisiting the n prior tokens, our human assessment involves inspecting them individually and discerning relationships from diverse angles. By prioritizing pivotal tokens and disregarding superfluous ones, we evaluate tokens within various contexts. This embodies the essence of the multi-head attention mechanism in transformers. Consider a context window with 5 tokens; each wears a distinct mask and predicts its respective next token.

"To discern the void amidst, we must first grasp the fullness within." To understand which token is lacking, we must first know what we are and what we possess. We need communication between tokens: tokens don't know each other yet, and in order to predict their own next token, they first need to know each other well and pair up so that tokens with similar characteristics stay near each other (technically, have similar vectors). Each token has three vectors that represent:

●    What tokens they are looking for (known as the query)
●    What they really have (known as the key)
●    What they are (known as the value)

Each token uses its query to look for similar keys; tokens find one another and get to know each other by adding up their values. Similar tokens find each other, and if a token is somehow dissimilar, the other tokens don't consider it much. Note, though, that every token has some effect, large or small, on the others. Also, in self-attention all tokens ask all other tokens, via their queries and keys, to find familiar tokens, but never the future tokens; this is called masked self-attention. We prohibit tokens from communicating with future tokens.

After exchanging information between tokens and mixing up their values, similar tokens become more similar: their vectors become more alike. Since the tokens in the group wear masks, we cannot access the true tokens' values; we only know and distinguish them by their masks (values). This is because every token has different characteristics in different contexts, and they don't show their true essence.

So far so good; we have finished the self-attention process, and now the group is ready to predict its next tokens. Because the tokens are now well aware of each other, they can guess the next token better. Each token separately goes to a nonlinear network and then to the transformer head to predict its own next token. We ask each token separately for its opinion about the probability of what comes next, and we collect the probability distributions of all tokens in the context window. A probability distribution sums to 1 and assigns a probability to every token in the model's vocabulary. The simplest method to extract the next token from a probability distribution is to select the one with the highest probability. Each token goes to the neural network and the network returns a probability distribution; for our 5-token example, picking the most likely continuation at each position might produce the sentence "It looks like a bug". Voila! We walked through a simple transformer model, at least intuitively. A tiny PyTorch sketch of this selection step follows below.
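To make that last step concrete, here is a small, hypothetical PyTorch sketch of turning a model's output scores (logits) into a next-token choice, both greedily and by sampling; the logits tensor is made up for illustration and is not produced by the model built later in this article.

```python
import torch
import torch.nn.functional as F

# Hypothetical logits for one position over a 65-character vocabulary,
# like the character-level model below would produce.
logits = torch.randn(65)

# Softmax turns raw scores into a probability distribution that sums to 1.
probs = F.softmax(logits, dim=-1)

# Greedy decoding: pick the single most likely token.
next_token_greedy = torch.argmax(probs).item()

# Sampling: draw a token at random according to the distribution,
# which gives more varied generations.
next_token_sampled = torch.multinomial(probs, num_samples=1).item()

print(next_token_greedy, next_token_sampled)
```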
Let's recap. A transformer receives n tokens as input, applies some operations (such as self-attention and layer normalization), and feeds them into a neural network to get probability distributions over the next token. Each token goes through the neural network separately; if the number of tokens is 10, there are 10 probability distributions.

At this point, you know intuitively how the main building blocks of a transformer work. Now let's understand them better by implementing a transformer model.

Clone the tiny-transformer repository:

```
git clone https://github.com/saeeddhqan/tiny-transformer
```

Execute simple_model.py in the repository if you simply want to run the model for training.

Create a new file and import the necessary modules:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F
```

Load the dataset and write the tokenizer:

```python
with open('shakespeare.txt') as fp:
    text = fp.read()

chars = sorted(list(set(text)))
vocab_size = len(chars)
stoi = {c: i for i, c in enumerate(chars)}
itos = {i: c for c, i in stoi.items()}
encode = lambda s: [stoi[x] for x in s]
decode = lambda e: ''.join([itos[x] for x in e])
```

●    Open the dataset and define a variable that holds all unique characters in the text.
●    The set function splits the text character by character and removes duplicates, just like sets in set theory; list(set(myvar)) is a way of removing duplicates in a list or string.
●    vocab_size is the number of unique characters (here, 65).
●    stoi is a dictionary whose keys are characters and whose values are their indices.
●    itos is used to convert indices back to characters.
●    The encode function receives a string and returns the indices of its characters.
●    decode receives a list of indices and returns a string, as the quick round-trip check below demonstrates.
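As a quick sanity check of the tokenizer, the round trip below shows what encode and decode do. The exact index values depend on the character set of the Shakespeare file, so the numbers in the comment are only illustrative.

```python
# Character-level round trip with the encode/decode helpers defined above.
sample = "First Citizen"
ids = encode(sample)        # e.g. [18, 47, 56, ...]; actual values depend on the dataset
restored = decode(ids)

print(ids)
assert restored == sample   # decoding the encoded indices recovers the original string
```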
Split the dataset into train and test sets and write a function that returns batches for training:

```python
device = 'cuda' if torch.cuda.is_available() else 'cpu'
torch.manual_seed(1234)

data = torch.tensor(encode(text), dtype=torch.long).to(device)
train_split = int(0.9 * len(data))
train_data = data[:train_split]
test_data = data[train_split:]

def get_batch(split='train', block_size=16, batch_size=1) -> 'Create a random batch and returns batch along with targets':
    data = train_data if split == 'train' else test_data
    ix = torch.randint(len(data) - block_size, (batch_size,))
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i+1:i + block_size + 1] for i in ix])
    return x, y
```

●    Choose a suitable device.
●    Set a seed to make the training reproducible.
●    Convert the text into a large list of indices with the encode function.
●    Since the character indices are integers, we use the torch.long data type to make the data suitable for the model.
●    90% of the data is used for training and 10% for testing.
●    If batch_size is 10, we select 10 chunks (sequences) from the dataset and stack them to process them simultaneously.
●    If batch_size is 1, get_batch selects 1 random chunk (n consecutive characters) from the dataset and returns x and y, where x holds 16 character indices and y holds the target characters for x.

The shape, value, and decoded version of a selected chunk look as follows:

```
shape x: torch.Size([1, 16])
shape y: torch.Size([1, 16])
value x: tensor([[41, 43,  6,  1, 60, 47, 50, 50, 39, 47, 52,  2,  1, 52, 43, 60]])
value y: tensor([[43,  6,  1, 60, 47, 50, 50, 39, 47, 52,  2,  1, 52, 43, 60, 43]])
decoded x: ce, villain! nev
decoded y: e, villain! neve
```

We usually process multiple chunks or sequences at once with batching in order to speed up training. For each character, we have an equivalent target, which is its next token. The target for 'c' is 'e', for 'e' is ',', for 'v' is 'i', and so on.

Let's talk a bit about the input and output shapes of tensors in a transformer model. The model receives a list of token indices like the one above (a sequence, or chunk) and maps them to their corresponding vectors.

●    The input shape is (batch_size, block_size).
●    After mapping indices to vectors, the data shape becomes (batch_size, block_size, embed_size).
●    Through the multi-head attention and feed-forward layers, the data shape does not change.
●    Finally, the data with shape (batch_size, block_size, embed_size) goes to the transformer head (a simple neural network) and the output shape becomes (batch_size, block_size, vocab_size). vocab_size is the number of tokens that can come next (for the Shakespeare dataset, 65 unique characters).
Self-attention

The communication between tokens happens in the head class. We define a scores variable to hold the similarity between vectors: the higher the score, the more two vectors have in common. We then use these scores to perform a weighted sum of all the value vectors:

```python
class head(nn.Module):
    def __init__(self, embeds_size=32, block_size=16, head_size=8):
        super().__init__()
        self.key = nn.Linear(embeds_size, head_size, bias=False)
        self.query = nn.Linear(embeds_size, head_size, bias=False)
        self.value = nn.Linear(embeds_size, head_size, bias=False)
        self.register_buffer('tril', torch.tril(torch.ones(block_size, block_size)))
        self.dropout = nn.Dropout(0.1)

    def forward(self, x):
        B, T, C = x.shape
        # What am I looking for?
        q = self.query(x)
        # What do I have?
        k = self.key(x)
        # What is the representation value of me?
        # Or: what's my personality in the group?
        # Or: what mask do I have when I'm in a group?
        v = self.value(x)
        scores = q @ k.transpose(-2, -1) * (1 / math.sqrt(C))  # (B,T,head_size) @ (B,head_size,T) --> (B,T,T)
        scores = scores.masked_fill(self.tril[:T, :T] == 0, float('-inf'))
        scores = F.softmax(scores, dim=-1)
        scores = self.dropout(scores)
        out = scores @ v
        return out
```

●    Q, K, V: We use three linear layers to project each vector into a query, key, and value with a smaller dimension (here, head_size). Q and K are used to find similar tokens: we calculate the similarity between vectors with a dot product, q @ k.transpose(-2, -1). The shape of scores is (batch_size, block_size, block_size), meaning we have similarity scores between all vectors in the block. V is used for the weighted sum.
●    Scores: Raw dot-product scores tend to have large magnitudes that are not suitable for softmax, because they make the resulting distribution overly peaked. We therefore rescale the results by (1 / math.sqrt(C)), where C is the embedding size. This is called the scaled dot product.
●    register_buffer: We use register_buffer to register a lower-triangular tensor. This way, when you save and load the model, this tensor also becomes part of it.
●    Masking: After calculating the scores, we replace the future scores with -inf to shut them off, so the vectors do not have access to future tokens. Those scores effectively become zero after applying softmax, resulting in a probability of zero for the future tokens. This process is referred to as masking. Here's an example of masked scores with a block size of 4:

```
[[-0.1710,    -inf,    -inf,    -inf],
 [ 0.2007, -0.0878,    -inf,    -inf],
 [-0.0405,  0.2913,  0.0445,    -inf],
 [ 0.1328, -0.2244,  0.0796,  0.1719]]
```

●    Softmax: Converts a vector into a probability distribution that sums to 1. Here are the scores after softmax:

```
[[1.0000, 0.0000, 0.0000, 0.0000],
 [0.5716, 0.4284, 0.0000, 0.0000],
 [0.2872, 0.4002, 0.3127, 0.0000],
 [0.2712, 0.1897, 0.2571, 0.2820]]
```

The scores for future tokens are zero; after the weighted sum, the future vectors contribute nothing, and each vector receives no data from future vectors (n * 0 = 0).

●    Dropout: Dropout is a regularization technique that randomly zeroes some of the numbers in the vectors. It helps the model generalize rather than memorize the dataset: we don't want the model to memorize the Shakespeare text, we want it to create new text that resembles it.
●    Weighted sum: The weighted sum combines the different representations based on their importance. The relevance scores are obtained by applying a scaled dot product between the query and key vectors, which are learned during training. The resulting weighted sum emphasizes the more important elements and reduces the influence of less relevant ones, allowing the model to focus on the most salient information. We dot-product the scores with the values, and the result is the outcome of self-attention.
●    Output: Since the embedding size and head size are 32 and 8 respectively, an input of shape (batch_size, block_size, 32) produces an output of shape (batch_size, block_size, 8); the small sketch below checks this with a dummy tensor.
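As a quick check of those shapes, this hypothetical snippet pushes a dummy batch through the head module defined above, using its defaults (embeds_size=32, block_size=16, head_size=8); the batch size of 4 is arbitrary.

```python
# Shape check for the self-attention head defined above.
h = head()                   # defaults: embeds_size=32, block_size=16, head_size=8
x = torch.randn(4, 16, 32)   # (batch_size, block_size, embed_size)
out = h(x)

print(out.shape)             # expected: torch.Size([4, 16, 8])
```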
So say the vectors. We transform the vectors into smaller dimensions and then run self-attention on them; we did this in the previous class. In multihead self-attention, we call the head class four times and then concatenate the smaller vectors to recover the original input shape. We call this multihead self-attention. For instance, if the shape of the input data is (1, 16, 32), we transform it into four (1, 16, 8) tensors and run self-attention on each of them. Why four times? Because 4 * 8 equals the initial embedding size. By running self-attention several times in parallel, we let the model consider different aspects of the vectors in different subspaces. That's all!
Here is the code:
class multihead(nn.Module):
    def __init__(self, num_heads=4, head_size=8):
        super().__init__()
        # Four independent heads, each attending in its own smaller subspace
        self.multihead = nn.ModuleList([head(head_size=head_size) for _ in range(num_heads)])
        self.output_linear = nn.Linear(embeds_size, embeds_size)
        self.dropout = nn.Dropout(0.1)

    def forward(self, hidden_state):
        # Run every head and concatenate the results back to the embedding size
        hidden_state = torch.cat([head(hidden_state) for head in self.multihead], dim=-1)
        hidden_state = self.output_linear(hidden_state)
        hidden_state = self.dropout(hidden_state)
        return hidden_state
●    self.multihead: Creates the four heads inside an nn.ModuleList. Note that head_size is passed by keyword so that it does not overwrite the embeds_size argument of the head class.
●    self.output_linear: Another linear layer that we apply at the end of the multihead self-attention process.
●    self.dropout: Applies dropout to the final result.
●    hidden_state 1: Concatenates the output of the heads so that we get back the same shape as the input. Each head transforms the data into a different, smaller subspace and performs self-attention there.
●    hidden_state 2: After the tokens have communicated through self-attention, the self.output_linear projection lets the model adjust the vectors further based on the gradients that flow through the layer.
●    dropout: Runs dropout on the output of the projection, with a 10% probability of turning off values (making them zero) in the vectors.
Transformer block
There are two new techniques, layer normalization and residual connections, that need to be explained:
class transformer_block(nn.Module):
    def __init__(self, embeds_size=32, num_heads=8):
        super().__init__()
        self.head_count = embeds_size // num_heads
        self.n_heads = multihead(num_heads, self.head_count)
        self.ffn = nn.Sequential(
            nn.Linear(embeds_size, 4 * embeds_size),
            nn.ReLU(),
            nn.Linear(4 * embeds_size, embeds_size),
            nn.Dropout(drop_prob),
        )
        self.ln1 = nn.LayerNorm(embeds_size)
        self.ln2 = nn.LayerNorm(embeds_size)

    def forward(self, hidden_state):
        # Pre-norm residual connections around attention and the feed-forward network
        hidden_state = hidden_state + self.n_heads(self.ln1(hidden_state))
        hidden_state = hidden_state + self.ffn(self.ln2(hidden_state))
        return hidden_state
self.head_count: Calculates the head size. The embedding size should be divisible by the number of heads so that we can concatenate the outputs of the heads.
self.n_heads: The multihead self-attention layer.
self.ffn: This is the first time we introduce non-linearity into the model. Non-linearity helps the model capture complex relationships and patterns in the data. By introducing non-linearity through an activation function such as ReLU or GELU, the model can better represent correlations in the data and the intricacies of the input. You can think of non-linearity as gating: "pass this on to the next layer", "do not pass this on", or "create y from x for the next layer". The recommended hidden layer size is four times the embedding size.
That’s why “4 * embeds_size”. You can also try SwiGLU as the activation function instead of ReLU.self.ln1 and self.ln2: Layer normalizers make the model more robust and they also help the model to converge faster. Layer normalization rescales the data in such a way that the mean is zero and the standard deviation is one. hidden_state 1: Normalize the vectors with self.ln1 and forward the vectors to the multihead attention. Next, we add the input to the output of multihead attention. It helps the model in two ways:○    First, the model has some information from the original vectors. ○    Second, when the model becomes deep, during backpropagation, the gradients will be weak for earlier layers and the model will converge too slowly. We recognize this effect as gradient vanishing. Adding the input helps to enrich the gradients and mitigate the gradient vanishing. We recognize it as a residual connection.hidden_state 2: Hidden_state 1 goes to a layer normalization and then to a nonlinear network. The output will be added to the hidden state with the aim of keeping gradients for all layers.The modelAll the necessary parts are ready, let us stack them up to make the full model:class transformer(nn.Module): def __init__(self):     super().__init__()     self.stack = nn.ModuleDict(dict(         tok_embs=nn.Embedding(vocab_size, embeds_size),         pos_embs=nn.Embedding(block_size, embeds_size),         dropout=nn.Dropout(drop_prob),         blocks=nn.Sequential(             transformer_block(),             transformer_block(),             transformer_block(),             transformer_block(),             transformer_block(),         ),         ln=nn.LayerNorm(embeds_size),         lm_head=nn.Linear(embeds_size, vocab_size),     ))●    self.stack: A list of all necessary layers.●    tok_embs: This is a learnable lookup table that receives a list of indices and returns their vectors.●    pos_embs: Just like tok_embs, it is also a learnable look-up table, but for positional embedding. It receives a list of positions and returns their vectors.●    dropout: Dropout layer.●    blocks: We create multiple transformer blocks sequentially.●    ln: A layer normalization.●    lm_heas: Transformer head receives a token and returns probabilities of the next token. To change the model to be a classifier, or a sentimental analysis model, we just need to change this layer and remove masking from the self-attention layer.The forward method of the transformer class:    def forward(self, seq, targets=None):     B, T = seq.shape     tok_emb = self.stack.tok_embs(seq) # (batch, block_size, embed_dim) (B,T,C)     pos_emb = self.stack.pos_embs(torch.arange(T, device=device))     x = tok_emb + pos_emb     x = self.stack.dropout(x)     x = self.stack.blocks(x)     x = self.stack.ln(x)     logits = self.stack.lm_head(x) # (B, block_size, vocab_size)     if targets is None:         loss = None     else:         B, T, C = logits.shape         logits = logits.view(B * T, C)         targets = targets.view(B * T)         loss = F.cross_entropy(logits, targets)     return logits, loss●  tok_emb: Convert token indices into vectors. Given the input (B, T), the output is (B, T, C), where C is the embeds_size.●  pos_emb: Given the number of tokens in the context window or block_size, it returns the positional embedding of each position.●  x 1: Add up token embeddings and position embeddings. 
A little bit lossy but it works just fine.●  x 2: Run dropout on embeddings.●  x 3: The embeddings go through all the transformer blocks, and multihead self-attention. The input is (B, T, C) and the output is (B, T, C).●  x 4: The outcome of transformer blocks goes to the layer normalization.●  logits: We usually recognize the unnormalized values extracted from the language model head as logits :)●  if-else block: Were the targets specified, we calculated cross-entropy loss. Otherwise, the loss will be None. Before calculating loss in the else block, we change the shape as the cross entropy function expects.●  Output: The method returns logits with shape (batch_size, block_size, vocab_size) and loss if any.For generating a text, add this to the transformer class:    def autocomplete(self, seq, _len=10):        for _ in range(_len):            seq_crop = seq[:, -block_size:] # crop it            logits, _ = self(seq_crop)            logits = logits[:, -1, :] # we care about the last token            probs = F.softmax(logits, dim=-1)            next_char = torch.multinomial(probs, num_samples=1)            seq = torch.cat((seq, next_char), dim=1)        return seq●  autocomplete: Given a tokenized sequence, and the number of tokens that need to be created, this method returns _len tokens.●  seq_crop: Select the last n tokens in the sequence to give it to the model. The sequence length might be larger than the block_size and it causes an error if we don’t crop it.●  logits 1: Forward the sequence into the model to receive the logits.●  logits 2: Select the last logit that will be used to select the next token.●  probs: Run the softmax on logits to get a probability distribution.●  next_char: Multinomial selects one sample from the probs. The higher the probability of a token, the higher the chance of being selected.●  seq: Add the selected character to the sequence.TrainingThe rest of the code is downstream tasks such as training loops, etc. The codes that are provided here are slightly different from the tiny-transformer repository. I trained the model with the following hyperparameters:block_size = 256 learning_rate = 9e-4 eval_interval = 300 # Every n step, we do an evaluation. iterations = 5000 # Like epochs batch_size = 64 embeds_size = 195 num_heads = 5 num_layers = 5 drop_prob = 0.15And here’s the generated text:If you need to improve the quality, increase embeds_size, num_layers, and heads.ConclusionThe article explores transformers' text generation role, detailing token preprocessing through self-attention and neural network heads. Transformers predict tokens using context length as a hyperparameter. Human context comprehension is paralleled, highlighting relevant word emergence and fading of irrelevant words for precise selection. Transformers lack human foresight and backtracking. Key components—self-attention, multihead self-attention, and transformer blocks—are explained, and supported by code snippets. Token and positional embeddings, layer normalization, and residual connections are detailed. The model's text generation is exemplified via the autocomplete method. Training parameters and text quality enhancement are addressed, showcasing transformers' potential.Author BioSaeed Dehqan trains language models from scratch. Currently, his work is centered around Language Models for text generation, and he possesses a strong understanding of the underlying concepts of neural networks. 
He is proficient in using optimizers such as genetic algorithms to fine-tune network hyperparameters and has experience with neural architecture search (NAS) by using reinforcement learning (RL). He implements models starting from data gathering to monitoring, and deployment on mobile, web, cloud, etc. 

Exploring Token Generation Strategies

Saeed Dehqan
28 Aug 2023
8 min read
Introduction
This article discusses different methods for generating sequences of tokens with language models, focusing on how to select the next token from the model's predicted probability distribution.
Language models predict the next token based on the n previous tokens, extracting as much information from them as they can. Transformer models aggregate information from all n previous tokens: tokens in a sequence communicate with one another and exchange their information. At the end of this communication process, the tokens are context-aware, and we use each one to predict its own next token. Each token separately goes through some linear/non-linear layers, and the output is unnormalized logits. We then apply softmax to the logits to convert them into probability distributions, so each token has its own probability distribution over its next token:
Exploring Methods for Token Selection
Once we have the probability distribution, it is time to pick one token as the next token. There are four methods for selecting a suitable token from the probability distribution:
●    Greedy or naive method: Simply select the token with the highest probability from the list. This is a deterministic method.
●    Beam search: It receives a parameter named beam size and, based on it, the algorithm runs the model multiple times to find a suitable sentence, not just a single token. This is a deterministic method.
●    Top-k sampling: Select the top k most probable tokens, shut off the other tokens (set their probability to -inf), and sample from those k tokens. This is a sampling method.
●    Nucleus sampling: Like top-k sampling, it keeps only the most probable tokens and shuts off the rest, but the number of kept tokens is chosen dynamically rather than being a fixed k.
Greedy method
This is a simple and fast method that needs only one prediction: just select the most probable token as the next token. Greedy decoding can be efficient on arithmetic tasks, but it tends to get stuck in a loop and repeat tokens one after another. It also kills the diversity of the model by favoring the tokens that occur most frequently in the training dataset.
Here's the code that converts unnormalized logits (simply the output of the network) into a probability distribution and selects the most probable next token:
probs = F.softmax(logits, dim=-1)
next_token = probs.argmax()
Beam search
Beam search produces better results but is slower, because it runs the model multiple times to create n sequences, where n is the beam size. The method selects the top n tokens, appends them to the current sequence, and runs the model on the resulting sequences to predict the next token. This process continues until the end of the sequence. It is computationally expensive, but the quality is higher. Based on this search, the algorithm returns two sequences:
Then, how do we select the final sequence? We sum up the loss for all predictions and select the sequence with the lowest loss.
Simple sampling
We can select tokens randomly based on their probability: the higher the probability, the higher the chance of being selected. We can achieve this with the multinomial method:
logits = logits[:, -1, :]
probs = F.softmax(logits, dim=-1)
next_idx = torch.multinomial(probs, num_samples=1)
This is part of the model we implemented in the "transformer building blocks" blog and the code can be found here.
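To see how that three-line snippet fits into a complete generation loop, here is a minimal sketch. It assumes a model that maps a (batch, sequence) tensor of token indices to logits of shape (batch, sequence, vocab_size) and a block_size context window, mirroring the autocomplete method from the previous article; the exact interface is an assumption, not code from this article:
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(model, seq, max_new_tokens=50, block_size=16):
    # seq: (batch, time) tensor of token indices
    for _ in range(max_new_tokens):
        seq_crop = seq[:, -block_size:]                     # keep only the last block_size tokens
        logits = model(seq_crop)                            # (batch, time, vocab_size) -- assumed interface
        logits = logits[:, -1, :]                           # only the last position matters for the next token
        probs = F.softmax(logits, dim=-1)                   # logits -> probability distribution
        next_idx = torch.multinomial(probs, num_samples=1)  # sample one token per sequence
        seq = torch.cat((seq, next_idx), dim=1)             # append and continue autoregressively
    return seq
Replacing torch.multinomial with probs.argmax(dim=-1, keepdim=True) turns the same loop into the greedy method described above.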
The torch.multinomial receives the probability distribution and selects n samples. Here’s an example:In [1]: import torch In [2]: probs = torch.tensor([0.3, 0.6, 0.1]) In [3]: torch.multinomial(probs, num_samples=1) Out[3]: tensor([1]) In [4]: torch.multinomial(probs, num_samples=1) Out[4]: tensor([0]) In [5]: torch.multinomial(probs, num_samples=1) Out[5]: tensor([1]) In [6]: torch.multinomial(probs, num_samples=1) Out[6]: tensor([0]) In [7]: torch.multinomial(probs, num_samples=1) Out[7]: tensor([1]) In [8]: torch.multinomial(probs, num_samples=1) Out[8]: tensor([1])We ran the method six times on probs, and as you can see it selects 0.6 four times and 0.3 two times because 0.6 is higher than 0.3.Top-k samplingIf we want to make the previous sampling method better, we need to limit the sampling space. Top-k sampling does this. K is a parameter that Top-k sampling uses to select top k tokens from the probability distribution and sample from these k tokens. Here is an example of top-k sampling:In [1]: import torch In [2]: logit = torch.randn(10) In [3]: logit Out[3]: tensor([-1.1147, 0.5769, 0.3831, -0.5841, 1.7528, -0.7718, -0.4438, 0.6529, 0.1500, 1.2592]) In [4]: topk_values, topk_indices = torch.topk(logit, 3) In [5]: topk_values Out[5]: tensor([1.7528, 1.2592, 0.6529]) In [6]: logit[logit < topk_values[-1]] = float('-inf') In [7]: logit Out[7]: tensor([ -inf, -inf, -inf, -inf, 1.7528, -inf, -inf, 0.6529, -inf, 1.2592]) In [8]: probs = logit.softmax(0) In [9]: probs Out[9]: tensor([0.0000, 0.0000, 0.0000, 0.0000, 0.5146, 0.0000, 0.0000, 0.1713, 0.0000, 0.3141]) In [10]: torch.multinomial(probs, num_samples=1) Out[10]: tensor([9]) In [11]: torch.multinomial(probs, num_samples=1) Out[11]: tensor([4]) In [12]: torch.multinomial(probs, num_samples=1) Out[12]: tensor([9])●    We first create a fake logit with torch.randn. Supposedly logit is the raw output of a network.●    We use torch.topk to select the top 3 values from logit. torch.topk returns top 3 values along with their indices. The values are sorted from top to bottom.●    We use advanced indexing to select logit values that are lower than the last top 3 values. When we say logit < topk_values[-1] we mean all the numbers in logit that are lower than topk_values[-1] (0.6529). ●    After selecting those numbers, we replace their value to float(‘-inf’), which is a negative infinite number. ●    After replacement, we run softmax over the logit to convert it into probabilities. ●    Now, we use torch.multinomial to sample from the probs.Nucleus samplingNucleus sampling is like Top-k sampling but with a dynamic selection of top tokens instead of selecting k tokens. The dynamic selection is better when we are unsure of selecting a suitable k for Top-k sampling. Nucleus sampling has a hyperparameter named p, let us say it is 0.9, and this method selects tokens from descending order and adds up their probabilities and when we reach a cumulative sum of p, we stop. What is the cumulative sum? Here’s an example of cumulative sum:In [1]: import torch In [2]: logit = torch.randn(10) In [3]: probs = logit.softmax(0) In [4]: probs Out[4]: tensor([0.0652, 0.0330, 0.0609, 0.0436, 0.2365, 0.1738, 0.0651, 0.0692, 0.0495, 0.2031]) In [5]: [probs[:x+1].sum() for x in range(probs.size(0))] Out[5]: [tensor(0.0652), tensor(0.0983), tensor(0.1592), tensor(0.2028), tensor(0.4394), tensor(0.6131), tensor(0.6782), tensor(0.7474), tensor(0.7969), tensor(1.)]I hope you understand how cumulative sum works from the code. We just add up n previous prob values. 
We can also use torch.cumsum and get the same result:In [9]: torch.cumsum(probs, dim=0) Out[9]: tensor([0.0652, 0.0983, 0.1592, 0.2028, 0.4394, 0.6131, 0.6782, 0.7474, 0.7969, 1.0000]) Okay. Here’s a nucleus sampling from scratch: In [1]: import torch In [2]: logit = torch.randn(10) In [3]: probs = logit.softmax(0) In [4]: probs Out[4]: tensor([0.7492, 0.0100, 0.0332, 0.0078, 0.0191, 0.0370, 0.0444, 0.0553, 0.0135, 0.0305]) In [5]: sprobs, indices = torch.sort(probs, dim=0, descending=True) In [6]: sprobs Out[6]: tensor([0.7492, 0.0553, 0.0444, 0.0370, 0.0332, 0.0305, 0.0191, 0.0135, 0.0100, 0.0078]) In [7]: cs_probs = torch.cumsum(sprobs, dim=0) In [8]: cs_probs Out[8]: tensor([0.7492, 0.8045, 0.8489, 0.8860, 0.9192, 0.9497, 0.9687, 0.9822, 0.9922, 1.0000]) In [9]: selected_tokens = cs_probs < 0.9 In [10]: selected_tokens Out[10]: tensor([ True, True, True, True, False, False, False, False, False, False]) In [11]: probs[indices[selected_tokens]] Out[11]: tensor([0.7492, 0.0553, 0.0444, 0.0370]) In [12]: probs = probs[indices[selected_tokens]] In [13]: torch.multinomial(probs, num_samples=1) Out[13]: tensor([0])●    Convert the logit to probabilities and sort it with descending order so that we can select them from top to bottom.●    Calculate cumulative sum.●    Using advanced indexing, we filter out values.●    Then, we sample from a limited and better space.Please note that you can use a combination of top-k and nucleus samplings. It is like selecting k tokens and doing nucleus sampling on these k tokens. You can also use top-k, nucleus, and beam search.ConclusionUnderstanding these methods is crucial for anyone working with language models, natural language processing, or text generation tasks. These techniques play a significant role in generating coherent and diverse sequences of text. Depending on the specific use case and desired outcomes, readers can choose the most appropriate method to employ. Overall, this knowledge can contribute to improving the quality of generated text and enhancing the capabilities of language models.Author BioSaeed Dehqan trains language models from scratch. Currently, his work is centered around Language Models for text generation, and he possesses a strong understanding of the underlying concepts of neural networks. He is proficient in using optimizers such as genetic algorithms to fine-tune network hyperparameters and has experience with neural architecture search (NAS) by using reinforcement learning (RL). He implements models starting from data gathering to monitoring, and deployment on mobile, web, cloud, etc. 

Text Classification with Transformers

Saeed Dehqan
28 Aug 2023
9 min read
Introduction
This blog aims to implement binary text classification using a transformer architecture. If you're new to transformers, the "Transformer Building Blocks" blog explains the architecture and its text generation implementation. Beyond text generation and translation, transformers serve classification, sentiment analysis, and speech recognition. The transformer model comprises two parts: an encoder and a decoder. The encoder extracts features, while the decoder processes them. Just as a painter with tree features can draw, describe, visualize, categorize, or write about a tree, transformers encode knowledge (encoder) and apply it (decoder). This dual-part process is pivotal for text classification with transformers, allowing them to excel in diverse tasks like sentiment analysis, illustrating their transformative role in NLP.
Deep Dive into Text Classification with Transformers
We train the model on the IMDB dataset. The dataset is ready, and no preprocessing is needed. The model is vocab-based instead of character-based so that it can converge faster. I limited the vocabulary to the 20,000 most frequent tokens and reduced the sequence length to 200 so we can train faster. I also tried to simplify the model and use torch.nn.MultiheadAttention instead of writing the multihead attention ourselves. This makes the model faster, since nn.MultiheadAttention uses scaled_dot_product_attention under the hood. If you want to know how multihead attention works, you can study the transformer building blocks blog or see the code here.
Okay, now let us add the feature extractor part:
class transformer_block(nn.Module):
    def __init__(self):
        super().__init__()
        self.attention = nn.MultiheadAttention(embeds_size, num_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(embeds_size, 4 * embeds_size),
            nn.LeakyReLU(),
            nn.Linear(4 * embeds_size, embeds_size),
        )
        self.drop1 = nn.Dropout(drop_prob)
        self.drop2 = nn.Dropout(drop_prob)
        self.ln1 = nn.LayerNorm(embeds_size, eps=1e-6)
        self.ln2 = nn.LayerNorm(embeds_size, eps=1e-6)

    def forward(self, hidden_state):
        attn, _ = self.attention(hidden_state, hidden_state, hidden_state, need_weights=False)
        attn = self.drop1(attn)
        out = self.ln1(hidden_state + attn)
        observed = self.ffn(out)
        observed = self.drop2(observed)
        return self.ln2(out + observed)
●    hidden_state: A tensor with shape (batch_size, block_size, embeds_size) goes into the transformer_block, and a tensor with the same shape comes out of it.
●    self.attention: The transformer block tries to combine the information of tokens so that each token is aware of its neighbors or of the other tokens in the context. We may call this part the communication part. That's what nn.MultiheadAttention does: it is a ready-made multihead attention layer that can be faster than implementing it from scratch, as we did in the "Transformer Building Blocks" blog. The parameters of nn.MultiheadAttention are as follows:
     ○    embeds_size: token embedding size
     ○    num_heads: multihead attention, as the name suggests, consists of multiple heads, and each head works on a different part of the token embeddings. Suppose your input data has shape (B,T,C) = (10, 32, 16), so the token embedding size is 16. If we set the num_heads parameter to 2 (16 is divisible by 2), the multihead attention splits the data into two parts with shape (10, 32, 8).
The first head works on the first part and the second head works on the second part. This is because transforming data into different subspaces can help the model see different aspects of the data. Please note that the embedding size should be divisible by num_heads so that at the end we can concatenate the split parts.
     ○    batch_first: True means the first dimension is the batch.
●    Dropout: After the attention layer, the communication between tokens is closed and computations are done on each token individually. We run dropout on the tokens. Dropout is a regularization method. Regularization helps the training process to be based on generalization, not memorization. Without regularization, the model tries to memorize the training set and performs poorly on the test set. The dropout method turns off features with a probability of drop_prob.
●    self.ln1: Layer normalization normalizes the embeddings so that they have zero mean and a standard deviation of one.
●    Residual connection: hidden_state + attn: Observe that before normalization, we added the input to the output of the multihead attention; this is called a residual connection. It has two benefits:
   ○    It helps the model keep the unchanged embedding information.
   ○    It helps prevent gradient vanishing, which is common in deep networks where we stack multiple transformer layers.
●    self.ffn: After dropout, the residual connection, and normalization, we forward the data into a simple non-linear neural network to adjust the tokens one by one for a better representation.
●    self.ln2(out + observed): Finally, another dropout, residual connection, and layer normalization.
The transformer block is ready. And here is the final piece:
class transformer(nn.Module):
    def __init__(self):
        super().__init__()
        self.tok_embs = nn.Embedding(vocab_size, embeds_size)
        self.pos_embs = nn.Embedding(block_size, embeds_size)
        self.block = transformer_block()
        self.ln1 = nn.LayerNorm(embeds_size)
        self.ln2 = nn.LayerNorm(embeds_size)
        self.classifier_head = nn.Sequential(
            nn.Linear(embeds_size, embeds_size),
            nn.LeakyReLU(),
            nn.Dropout(drop_prob),
            nn.Linear(embeds_size, embeds_size),
            nn.LeakyReLU(),
            nn.Linear(embeds_size, num_classes),
            nn.Softmax(dim=1),
        )
        print("number of parameters: %.2fM" % (self.num_params()/1e6,))

    def num_params(self):
        n_params = sum(p.numel() for p in self.parameters())
        return n_params

    def forward(self, seq):
        B, T = seq.shape
        embedded = self.tok_embs(seq)
        embedded = embedded + self.pos_embs(torch.arange(T, device=device))
        output = self.block(embedded)
        output = output.mean(dim=1)
        output = self.classifier_head(output)
        return output
●    self.tok_embs: nn.Embedding is like a lookup table that receives a sequence of indices and returns their corresponding embeddings. These embeddings will receive gradients so that the model can update them to make better predictions.
●    self.pos_embs: Just like self.tok_embs, but for positions. To comprehend a sentence, you not only need the words, you also need their order. Here, we embed positions and add them to the token embeddings, so the model has both the words and their order.
●    self.block: In this model, we only use one transformer block, but you can stack more blocks to get better results.
●    self.classifier_head: This is where we put the extracted information into action to classify the sequence. We call it the transformer head. It receives a fixed-size vector and classifies the sequence.
The softmax as the final activation function returns a probability distribution for each class.●    self.tok_embs(seq): Given a sequence of indices (batch_size, block_size), it returns (batch_size, block_size, embeds_size).●    self.pos_embs(torch.arange(T, device=device)): Given a sequence of positions, i.e. [0,1,2], it returns embeddings of each position. Then, we add them to the token embeddings.●    self.block(embedded): The embedding goes to the transformer block to extract features. Given the embedded shape (batch_size, block_size, embeds_size), the output has the same shape (batch_size, block_size, embeds_size).●    output.mean(dim=1): The purpose of using mean is to aggregate the information from the sequence into a compact representation before feeding it into self.classifier_head. It helps in reducing the spatial dimensionality and extracting the most important features from the sequence. Given the input shape (batch_size, block_size, embeds_size), the output shape is (batch_size, embeds_size). So, one fixed-size vector for each batch.●    self.classifier_head(output): And here we classify.The final code can be found here. The remaining code consists of downstream tasks such as the training loop, loading the dataset, setting the hyperparameters, and optimizer. I used RMSprop instead of Adam and AdamW. I also used BCEWithLogitsLoss instead of cross-entropy loss. BCE(Binary Cross Entropy) is for binary classification models and it combines sigmoid with cross entropy and it is numerically more stable. I also empirically got better accuracy. After 30 epochs, the final accuracy is ~84%.ConclusionThis exploration of text classification using transformers reveals their revolutionary potential. Beyond text generation, transformers excel in sentiment analysis. The encoder-decoder model, analogous to a painter interpreting tree feature, propels efficient text classification. A streamlined practical approach and the meticulously crafted transformer block enhance the architecture's robustness. Through optimization methods and loss functions, the model is honed, yielding an empirically validated 84% accuracy after 30 epochs. This journey highlights transformers' disruptive impact on reshaping AI-driven language comprehension, fundamentally altering the landscape of Natural Language Processing.Author BioSaeed Dehqan trains language models from scratch. Currently, his work is centered around Language Models for text generation, and he possesses a strong understanding of the underlying concepts of neural networks. He is proficient in using optimizers such as genetic algorithms to fine-tune network hyperparameters and has experience with neural architecture search (NAS) by using reinforcement learning (RL). He implements models starting from data gathering to monitoring, and deployment on mobile, web, cloud, etc. 

Designing Decoder-only Transformer Models like ChatGPT

Saeed Dehqan
28 Aug 2023
9 min read
IntroductionEmbark on an enlightening journey into the ChatGPT stack, a remarkable feat in AI-driven language generation. Unveiling its evolution from inception to a proficient AI assistant, we delve into decoder-only transformers, specialized for crafting Shakespearean verses and informative responses.Throughout this exploration, we dissect the four integral stages that constitute the ChatGPT stack. From exhaustive pretraining to fine-tuned supervised training, we unravel how rewards and reinforcement learning refine response generation to align with context and user intent.In this blog, we will get acquainted briefly with the ChatGPT stack and then implement a simple decoder-only transformer to train on Shakespeare.Creating ChatGPT models consists of four main stages:1.    Pretraining:2.    Supervised Fine Tuning3.    Reward modeling4.    Reinforcement learningThe Pretraining stage takes most of the computational time since we train the language model on trillions of tokens. The following table shows the Data Mixtures used for pretraining of LLaMA Meta Models [0]:The datasets come and mix together, according to the sampling proportion, to create the pretraining data. The table shows the datasets along with their corresponding sampling proportion (What portion of the pre-trained data is the dataset?), epochs (How many times do we train the model on the corresponding datasets?), and dataset size. It is obvious that the epoch of high-quality datasets such as Wikipedia, and Books is high and as a result, the model grasps high-quality datasets better.After we have our dataset ready, the next step is Tokenization before training. Tokenizing data means mapping all the text data into a large list of integers. In language modeling repositories, we usually have two dictionaries for mapping tokens (a token is a sub word. Like ‘wait’, and ‘ing’ are two tokens.) into integers and vice versa. Here is an example:In [1]: text = "it is obvious that the epoch of high .." In [2]: tokens = list(set(text.split())) In [3]: stoi = {s:i for i,s in enumerate(tokens)} In [4]: itos = {i:s for s,i in stoi.items()} In [5]: stoi['it'] Out[5]: 22 In [6]: itos[22] Out[6]: 'it'Now, we can tokenize texts with the following functions:In [7]: encode = lambda text: [stoi[x] for x in text.split()] In [8]: decode = lambda encoded: ' '.join([itos[x] for x in encoded]) In [9]: tokenized = encode(text) In [10]: tokenized Out[10]: [22, 19, 18, 5, ...] In [11]: decode(tokenized) Out[11]: 'it is obvious that the epoch of high ..'Suppose the tokenized variable contains all the tokens converted to integers (say 1 billion tokens). We select 3 chunks of the list randomly that each chunk contains 10 tokens and feed-forward them into a transformer language model to predict the next token. The model’s input has a shape (3, 10), here 3 is batch size and 5 is context length. The model tries to predict the next token for each chunk independently. We select 3 chunks and predict the next token for each chunk to speed up the training process. It is like running the model on 3 chunks of data at once. You can increase the batch size and context length depending on the requirements and resources. Here’s an example:For convenience, we wrote the token indices along with the corresponding tokens. For each chunk or sequence, the model predicts the whole sequence. Let’s see how this works:By seeing the first token (it), the model predicts the next token (is). The context token(s) is ‘it’ and the target token for the model is ‘is’. 
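In code, this context/target relationship is just a shift by one position. The following is a minimal sketch of how such a batch could be built and scored; tokenized comes from the encode function above, while block_size, batch_size, and the model interface are illustrative assumptions rather than code from this article:
import torch
import torch.nn.functional as F

block_size, batch_size = 10, 3                      # context length and number of chunks per step (illustrative)
data = torch.tensor(tokenized, dtype=torch.long)    # tokenized: output of encode() above
ix = torch.randint(len(data) - block_size, (batch_size,))
x = torch.stack([data[i:i + block_size] for i in ix])          # context tokens, shape (3, 10)
y = torch.stack([data[i + 1:i + block_size + 1] for i in ix])  # targets: the same chunks shifted by one

logits = model(x)                                   # (batch_size, block_size, vocab_size) -- assumed interface
loss = F.cross_entropy(logits.view(-1, logits.size(-1)), y.view(-1))
loss.backward()                                     # backpropagation adjusts the parameters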
If the model fails to predict the target token, we do backpropagation to adjust model parameters so the model can predict correctly.During the process, we mask out or hide the future tokens so that the model can’t have access to the future tokens. Because it is kind of cheating. We want the model itself to predict the future by only seeing the past tokens. That makes sense, right? That’s why we used a gray background for future tokens, which means the model is not able to see them.After predicting the second token, we have two tokens [it, is] as context to predict what token comes next in the sequence. Here is the third token (obvious).By using the three previous tokens [it, is, obvious], the model needs to predict the fourth token (that). And as usual, we hide the future tokens (in this case ‘the’).We give [it, is, obvious, that] to the model as the context in order to predict ‘the’. And finally, we give all the sequence as context [it, is, obvious, that, the] to predict the next token.We have five predictions for a sequence with a length of five.After training the model on a lot of randomly selected sequences from the pre-trained dataset, the model should be ready to autocomplete your sequence. Give it a sequence of tokens, and then, it predicts the next token and based on what was predicted plus previous tokens, the model predicts the next tokens one by one. We call it an autoregressive model. That’s it.But, at this stage, the model is not an AI assistant or a chatbot. It only receives a sequence and tries to complete the sequence. That’s how we trained the model. We don’t train it to answer questions and listen to the instructions. We give it context tokens and the model tries to predict the next token based on the context.You give it this:“In order to be irrational, you first need to”And the model continues the sequence:“In order to be irrational, you first need to abandon logical reasoning and disregard factual evidence.”Sometimes, you ask it an instruction:“Write a function to count from 1 to 100.”And instead of trying to write a function, the model answers with more similar instructions:“Write a program to sort an array of integers in ascending order.”“Write a script to calculate the factorial of a given number.”“Write a method to validate a user's input and ensure it meets the specified criteria.”“Write a function to check if a string is a palindrome or not.”That’s where prompt engineering came in. People tried to use some tricks to get the answer to a question out of the model.Give the model the following prompt:“London is the capital of England.Copenhagen is the capital of Denmark.Oslo is the capital of”The model answers like this:“Norway.”So, we managed to get something helpful out of it with prompt engineering. But we don’t want to provide examples every time. We want to ask it a question and receive an answer. To prepare the model to be an AI assistant, we need further training named Supervised Fine Tuning for instructional purposes.In the Supervised Fine-Tuning stage, we make the model instructional. To achieve this goal the model needs training on a high quality 15k-100K of prompt and response dataset. Here’s an example of it: { "instruction": "When was the last flight of Concorde?", "context": "", "response": "On 26 November 2003", "category": "open_qa" }This example was taken from the databricks-dolly-15k dataset that is an open-source dataset for Supervised/Instruction Fine Tuning[1]. You can download the dataset from here. 
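Before looking at the categories, here is a minimal sketch of how a single record like the one above could be flattened into one training string. The template markers and the helper name are illustrative assumptions, not the format used by ChatGPT or any particular model:
def format_example(record):
    # record: a dict with "instruction", "context", and "response" keys, as in databricks-dolly-15k
    prompt = f"### Instruction:\n{record['instruction']}\n"
    if record["context"]:
        prompt += f"### Context:\n{record['context']}\n"
    prompt += "### Response:\n"
    return prompt + record["response"]

example = {
    "instruction": "When was the last flight of Concorde?",
    "context": "",
    "response": "On 26 November 2003",
    "category": "open_qa",
}
full_text = format_example(example)
# full_text is tokenized and trained on exactly like a pretraining sequence;
# commonly the loss is computed only on the response tokens.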
Instructions have seven categorizations including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization. This is because we want to train the model in different tasks. For instance, the above instruction is open QA, meaning the question is a general one and does not require reasoning abilities. It teaches the model to answer general questions. Closed QA requires reasoning abilities. During Instruction fine-tuning, nothing will change algorithmically. We do the same process as the previous stage (Pretraining). We gave instructions as context tokens and we want the model to continue the sequence with response.We continue this process for thousands of examples and then, the model is ready to be instructional. But that’s not the end of the story of the model behind ChatGPT. OpenAI designed a supervised reward modeling that returns a reward for the sequences that were made by the base model for the same input prompt. They give the model a prompt and run the model four times, for instance, to have four different answers for the same prompt. The model produces different answers each time because of the sampling method they use. Then, the reward model receives the input prompt and the produced answers to get a reward score for each answer. The better the answer, the better the reward score is. The model requires ground-truth scores to be trained and these scores came from labelers who worked for OpenAI. Labelers were given prompt text and model responses and they ranked them from the best to the worst.At the final stage, the ChatGPT uses Reinforcement Learning with Human Feedback (RLHF) to generate responses that get the best scores from the rewarding model. RL is an architecture that tries to find the best way of achieving a goal. The goal can be checkmate in chess or creating the best answer for the input prompt. The RL learning process is like doing an action and getting a reward or penalty for the action. And we do not take actions that end up penalizing. RLHF is what made ChatGPT so good:The PPO-ptx shows the win rate of GPT + RLHF compared to SFT (Supervised Fine-Tuned model), GPT with prompt engineering, and GPT base.ConclusionIn summation, the ChatGPT stack exemplifies AI's potent fusion with language generation. From inception to proficient AI assistant, we've traversed core stages – pretraining, fine-tuning, and reinforcement learning. Decoder-only transformers have enlivened Shakespearean text and insights.Tokenization's role in enabling ChatGPT's prowess concludes our journey. This AI evolution showcases technology's synergy with creative text generation.ChatGPT's ascent highlights AI's potential to emulate human-like language understanding. With ongoing refinement, the future promises versatile conversational AI that bridges artificial intelligence and language's artistry, fostering human-AI understanding.Author BioSaeed Dehqan trains language models from scratch. Currently, his work is centered around Language Models for text generation, and he possesses a strong understanding of the underlying concepts of neural networks. He is proficient in using optimizers such as genetic algorithms to fine-tune network hyperparameters and has experience with neural architecture search (NAS) by using reinforcement learning (RL). He implements models starting from data gathering to monitoring, and deployment on mobile, web, cloud, etc. 

Generative Fill with Adobe Firefly (Part II)

Joseph Labrecque
24 Aug 2023
9 min read
Adobe Firefly OverviewAdobe Firefly is a new set of generative AI tools which can be accessed via https://firefly.adobe.com/ by anyone with an Adobe ID. To learn more about Firefly… have a look at their FAQ.  Image 1: Adobe FireflyFor more information about the usage of Firefly to generate images, text effects, and more… have a look at the previous articles in this series:      Animating Adobe Firefly Content with Adobe Animate      Exploring Text to Image with Adobe Firefly      Generating Text Effects with Adobe Firefly       Adobe Firefly Feature Deep Dive       Generative Fill with Adobe Firefly (Part I)This is the conclusion of a two-part article. You can catch up by reading Generative Fill with Adobe Firefly (Part I). In this article, we’ll continue our exploration of Firefly with the Generative fill module by looking at how to use the Insert and Replace features… and more.Generative Fill – Part I RecapIn part I of our Firefly Generative fill exploration, we uploaded a photograph of a cat, Poe, to the AI and began working with the various tools to remove the background and replace it with prompt-based generative AI content.Image 2: The original photograph of PoeNote that the original photograph includes a set of electric outlets exposed within the wall. When we remove the background, Firefly recognizes that these objects are distinct from the general background and so retains them.Image 3: A set of backgrounds is generated for us to choose fromYou can select any of the four variations that were generated from the set of preview thumbnails beneath the photograph.Again, if you’d like to view these processes in detail – check out Generative Fill with Adobe Firefly (Part I).Insert and Replace with Generative FillWe covered generating a background for our image in part I of this article. Now we will focus on other aspects of Firefly Generative fill, including the Remove and Insert tools.Consider the image above and note that the original photograph included a set of electric outlets exposed within the wall. When we removed the background in part I, Firefly recognized that they were distinct from the general background and so retained them. The AI has taken them into account when generating the new background… but we should remove them.This is where the Remove tool comes into play.Image 4: The Remove toolSwitching to the Remove tool will allow you to brush over an area of the photograph you’d like to remove. It fills in the removed area with pixels generated by the AI to create seamless removal.1.               Select the Remove tool now. Note that when switching between the Insert and Remove tools, you will often encounter a save prompt as seen below. If there are no changes to save, this prompt will not appear!Image 5: When you switch tools… you may be asked to save your work2.               Simply click the Save button to continue – as choosing the Cancel button will halt the tool selection.3.               With the Remove tool selected, you can adjust the Brush Settings from the toolbar below the image, at the bottom of the screen.Image 6: The Brush Settings overlay4.               Zoom in closer to the wall outlet and brush over the area by clicking and dragging with your mouse. The size of your brush, depending upon brush settings, will appear as a circular outline. You can change the size of the brush by tapping the [ or] keys on your keyboard.Image 7: Brushing over the wall outlet with the Remove tool5.               
Once you are happy with the selection you’ve made, click the Remove button within the toolbar at the bottom of the screen.Image 8: The Remove button appears within the toolbar6.               The Firefly AI uses Generative fill to replace the brushed-over area with new content based upon the surrounding pixels. A set of four variations appears below the photograph. Click on each one to preview – as they can vary quite a bit.Image 9: Selecting a fill variant7.               Klick the Keep button in the toolbar to save your selection and continue editing. Remember – if you attempt to switch tools before saving… Firefly will prompt you to save your edits via a small overlay prompt.The outlet has now been removed and the wall is all patched up.Aside from the removal of objects through Generative fill, we can also perform insertions based on text prompts. Let’s add some additional elements to our photograph using these methods.  1.               Select the Insert tool from the left-hand toolbar.2.               Use it in a similar way as we did the Remove tool to brush in a selection of the image. In this case, we’ll add a crown to Poe’s head – so brush in an area that contains the top of his head and some space above it. Try and visualize a crown shape as you do this.3.               In the prompt input that appears beneath the photograph, type in a descriptive text prompt similar to the following: “regal crown with many jewels”Image 10: A selection is made, and a text prompt inserted4.               Click the Generate button to have the Firefly AI perform a Generative fill insertion based upon our text prompt as part of the selected area.Image 11: Poe is a regal cat5.               A crown is generated in accordance with our text prompt and the surrounding area. A set of four variations to choose from appears as well. Note how integrated they appear against the original photographic content.6.               Click the Keep button to commit and save your crown selection.7.               Let’s add a scepter as well. Brush the general form of a scepter across Poe’s body extending from his paws to his shoulder.8.               Type in the text prompt: “royal scepter”Image 12: Brushing in a scepter shape9.               Click the Generate button to have the Firefly AI perform a Generative fill insertion based upon our text prompt as part of the selected area.Image 13: Poe now holds a regal scepter in addition to his crown10.            Remember to choose a scepter variant and click the Keep button to commit and save your scepter selection.Okay! That should be enough regalia to satisfy Poe. 
Let’s download our creation for distribution or use in other software.Downloading your ImageClick the Download button in the upper right of the screen to begin the download process for your image.Image 14: The Download buttonAs Firefly begins preparing the image for download, a small overlay dialog appears.Image 15: Content credentials are applied to the image as it is downloadedFirefly applies metadata to any generated image in the form of content credentials and the image download process begins.Once the image is downloaded, it can be viewed and shared just like any other image file.Image 16: The final image from our exploration of Generative fillAlong with content credentials, a small badge is placed upon the lower right of the image which visually identifies the image as having been produced with Adobe Firefly.That concludes our set of articles on using Generative fill to remove and insert objects into your images using the Adobe Firefly AI. We have a number of additional articles on Firefly procedures on the way… including Generative recolor for vector artwork!Author BioJoseph Labrecque is a Teaching Assistant Professor, Instructor of Technology, University of Colorado Boulder / Adobe Education Leader / Partner by DesignJoseph is a creative developer, designer, and educator with nearly two decades of experience creating expressive web, desktop, and mobile solutions. He joined the University of Colorado Boulder College of Media, Communication, and Information as faculty with the Department of Advertising, Public Relations, and Media Design in Autumn 2019. His teaching focuses on creative software, digital workflows, user interaction, and design principles and concepts. Before joining the faculty at CU Boulder, he was associated with the University of Denver as adjunct faculty and as a senior interactive software engineer, user interface developer, and digital media designer.Labrecque has authored a number of books and video course publications on design and development technologies, tools, and concepts through publishers which include LinkedIn Learning (Lynda.com), Peachpit Press, and Adobe. He has spoken at large design and technology conferences such as Adobe MAX and for a variety of smaller creative communities. He is also the founder of Fractured Vision Media, LLC, a digital media production studio and distribution vehicle for a variety of creative works.Joseph is an Adobe Education Leader and member of Adobe Partners by Design. He holds a bachelor’s degree in communication from Worcester State University and a master’s degree in digital media studies from the University of Denver.Author of the book: Mastering Adobe Animate 2023

Generative Fill with Adobe Firefly (Part I)

Joseph Labrecque
24 Aug 2023
8 min read
Adobe Firefly AI Overview Adobe Firefly is a new set of generative AI tools that can be accessed via https://firefly.adobe.com/ by anyone with an Adobe ID. To learn more about Firefly… have a look at their FAQ.    Image 1: Adobe Firefly For more information about the usage of Firefly to generate images, text effects, and more… have a look at the previous articles in this series:  Animating Adobe Firefly Content with Adobe Animate  Exploring Text to Image with Adobe Firefly  Generating Text Effects with Adobe Firefly  Adobe Firefly Feature Deep Dive In the next two articles, we’ll continue our exploration of Firefly with the Generative fill module. We’ll begin with an overview of accessing Generative fill from a generated image and then explore how to use the module on our own personal images.  Recall from a previous article Exploring Text to Image with Adobe Firefly that when you hover your mouse cursor over a generated image – overlay controls will appear.  Image 2: Generative fill overlay control from Text to image  One of the controls in the upper right of the image frame will invoke the Generative fill module and pass the generated image into that view.   Image 3: The generated image is sent to the Generative fill module Within the Generative fill module, you can use any of the tools and workflows that are available when invoking Generative fill from the Firefly website. The only difference is that you are passing in a generated image rather than uploading an image from your local hard drive.  Keep this in mind as we continue to explore the basics of Generative fill in Firefly – as we’ll begin the process from scratch. Generative Fill When you first enter the Firefly web experience, you will be presented with the various workflows available.  These appear as UI cards and present a sample image, the name of the procedure, a procedure description, and either a button to begin the process or a label stating that it is “in exploration”. Those which are in exploration are not yet available to general users. We want to locate the Generative fill module and click the Generate button to enter the experience.   Image 4: The Generative fill module card From there, you’ll be taken to a view that prompts you to upload an image into the module. Firefly also presents a set of sample images you can load into the experience.    Image 5: Generative fill getting started promptly Clicking the Upload image button summons a file browser for you to locate the file you want to use Generative fill on. In my example, I’ll be using a photograph of my cat, Poe. You can download the photograph of Poe [[ NOTE – LINK TO FILE Poe.jpg ]] to work with as well.   Image 6: The photograph of Poe, a cat Once the image file has been uploaded into Firefly, you will be taken to the Generative fill user experience and the photograph will be visible. Note that this is exactly the same experience as when entering Generative fill from a prompt-generated image as we saw above. The only real difference is how we get to this point.   Image 7: The photograph is loaded into Generative fill You will note that there are two sets of tools available within the experience. One set is along the left side of the screen and includes Insert, Remove, and Pan tools.   Image 8: Insert, Remove, and Pan Switching between the Insert and Remove tools changes the function of the current process. The Pan tool allows you to pan the image around the view.  Along the bottom of the screen is the second set of tools – which are focused on selections. 
This set contains the Add and Subtract tools, access to Brush Settings, a Background removal process, and a selection Invert toggle.   Image 9: Add, Subtract, Brush Settings, Background removal, and selection Invert Let’s perform some Generative fill work on the photograph of Poe.  In the larger overlay along the bottom of the view, locate and click the Background option. This is an automated process that will detect and remove the background from the image loaded into Firefly.   Image 10: The background is removed from the selected photograph 2. A prompt input appears directly beneath the photograph. Type in the following prompt: “a quiet jungle at night with lots of mist and moonlight”  Image 11: Entering a prompt into the prompt input control 3. If desired, you can view and adjust the settings for the generative AI by clicking the Settings icon in the prompt input control. This summons the Settings overlay.  Image 12: The generative AI Settings overlay Within the Settings overlay, you will find there are three items that can be adjusted to influence the AI:  Match shape: You have two choices here – freeform or conform.  Preserve content: A slider that can be set to include more of the original content or produce new content. Guidance strength: A slider that can be set to provide more strength to the original image or the given prompt. I suggest leaving these at the default setting for now. 4. Click the Settings icon again to dismiss the overlay. 5. Click the Generate button to generate a background based upon the entered prompt. A new background is generated from our prompt, and it now appears as though Poe is visiting a lush jungle at night.   Image 13: Poe enjoying the jungle at night Note that the original photograph included a set of electric outlets exposed within the wall. When we removed the background, Firefly recognized that they were distinct from the general background and so retained them. The AI has taken them into account when generating the new background and has interestingly propped them up with a couple of sticks. It also has gone through and rendered a realistic shadow cast by Poe.  Before moving on, click the Cancel button to bring the transparent background back. Clicking the Keep button will commit the changes – and we do not want that as we wish to continue exploring other options. Clear out the prompt you previously wrote within the prompt input control so that there is no longer any prompt present.   Image 14: Click the Generate button with no prompt present 3. Click the Generate button without a text prompt in place. The photograph receives a different background from the one generated with a text prompt. When clicking the Generate button with no text prompt, you are basically allowing the Firefly AI to make all the decisions based solely on the visual properties of the image.   Image 15: A set of backgrounds is generated based on the remaining pixels present You can select any of the four variations that were generated from the set of preview thumbnails beneath the photograph. If you’d like Firefly to generate more variations – click the More button. Select the one you like best and click the Keep button. Okay! That’s pretty good but we are not done with Generative fill yet. We haven’t even touched the Insert and Remove functions… and there are Brush Settings to manipulate… and much more. In the next article, we’ll explore the remaining Generative fill tools and options to further manipulate the photograph of Poe.  
Author BioJoseph Labrecque is a Teaching Assistant Professor, Instructor of Technology, University of Colorado Boulder / Adobe Education Leader / Partner by DesignJoseph is a creative developer, designer, and educator with nearly two decades of experience creating expressive web, desktop, and mobile solutions. He joined the University of Colorado Boulder College of Media, Communication, and Information as faculty with the Department of Advertising, Public Relations, and Media Design in Autumn 2019. His teaching focuses on creative software, digital workflows, user interaction, and design principles and concepts. Before joining the faculty at CU Boulder, he was associated with the University of Denver as adjunct faculty and as a senior interactive software engineer, user interface developer, and digital media designer.Labrecque has authored a number of books and video course publications on design and development technologies, tools, and concepts through publishers which include LinkedIn Learning (Lynda.com), Peachpit Press, and Adobe. He has spoken at large design and technology conferences such as Adobe MAX and for a variety of smaller creative communities. He is also the founder of Fractured Vision Media, LLC; a digital media production studio and distribution vehicle for a variety of creative works.Joseph is an Adobe Education Leader and member of Adobe Partners by Design. He holds a bachelor’s degree in communication from Worcester State University and a master’s degree in digital media studies from the University of Denver.Author of the book: Mastering Adobe Animate 2023
Adobe Firefly Feature Deep Dive

Joseph Labrecque
23 Aug 2023
9 min read
Adobe FireflyAdobe Firefly is a new set of generative AI tools which can be accessed via https://firefly.adobe.com/ by anyone with an Adobe ID. To learn more about Firefly… have a look at their FAQ.  Image 1: Adobe FireflyFor more information about Firefly, have a look at the previous articles in this series:       Animating Adobe Firefly Content with Adobe Animate       Exploring Text to Image with Adobe Firefly       Generating Text Effects with Adobe FireflyIn this article, we’ll be exploring some of the more detailed features of Firefly in general. While we will be doing so from the perspective of the text-to-image module, much of what we cover will be applicable to other modules and procedures as well.Before moving on to the visual controls and options… let’s consider accessibility. Here is what Adobe has to say about accessibility within Firefly:Firefly is committed to providing accessible and inclusive features to all individuals, including users working with assistive devices such as speech recognition software and screen readers. Firefly is continuously enhanced to strive to meet the needs of all types of users, including individuals with visual, hearing, cognitive, motor, or other impairments, and is designed to conform to worldwide accessibility standards. -- AdobeYou can use the following keyboard shortcuts across the Firefly interface to navigate and control the software in a non-visual way:       Tab: navigates between user interface controls.       Space/Enter: activates buttons.       Enter: activates links.       Arrow Keys: navigates between options.       Space: selects options.As with most accessibility concerns and practices, these additional controls within Firefly can benefit those users who are not otherwise impaired as well – similar to sight-enabled users making use of captions when watching video-based content.For our exploration of the various additional controls and options within Firefly, we’ll start off with a generated set of images based on a prompt. To review how to achieve this, have a look at the article “Exploring Text to Image with Adobe Firefly”.Choose one of the generated images to work with and hover your mouse across the image to reveal a set of controls.Image 2: Image Overlay OptionsWe will explore each of these options one by one as we continue along with this article.Rating and Feedback OptionsAdobe is very open to feedback with Firefly. One reason is to get general user feedback to improve the experience of using the product… and the other is to influence the generative models so that users receive the output that is expected.Giving a simple thumbs-up or thumbs-down is the most basic level of feedback and is meant to rate the results of your prompt.Image 3: Rating the generated resultsOnce you provide a thumbs-up or thumbs-down… the overlay changes to request additional feedback. You don’t necessarily need to provide more feedback – but clicking on the Feedback button will allow you to go more in-depth in terms of why you provided the initial rating.Image 4: Additional feedback promptClicking the Feedback button will summon a much larger overlay where you can make choices via a checkbox as to why you rated the results the way you did. You also have the option to put a little note in here as well.Image 5: Additional feedback formClicking the Submit Feedback button or the Cancel button will close the overlay and bring you back to the experience.Additionally, there is an option to Report the image to Adobe. 
This is always a negative action – meaning that you find the results offensive or inappropriate in some way.Image 6: Report promptClicking on the Report option will summon a similar form to that of additional feedback, but the options will, of course, be different.Image 7: Report feedback formHere, you can report via a checkbox and add an optional note as part of the report. Adobe has committed to making sure that violence and things like copyrighted or trademarked characters and such are not generated by Firefly.For instance, if you use a prompt such as “Micky Mouse murdering a construction worker with a chainsaw”… you will receive a message like the following:Image 8: Firefly will not render trademarked characters or violenceWith Adobe is being massively careful in filtering certain words right now… I do hope in the future that users will be able to selectively choose exclusions in place of a general list of censored terms as exists now. While the prompt above is meant to be absurd – there are legitimate artistic reasons for many of the word categories which are currently banned.General Image ControlsThe controls in this section include some of the most used in Firefly at the moment – including the ability to download your generated image.Image 9: Image optionsWe have the following controls exposed, from left to right they are named:       Options       Download       FavoriteOptionsStarting at the left-hand side of this group of controls, we begin with an ellipse that represents Options which, when clicked, will summon a small overlay with additional choices.Image 10: Expanded optionsThe menu that appears includes the following items:1.     Submit to Firefly gallery2.     Use as a reference image3.     Copy to the clipboardLet’s examine each of these in detail.You may have noticed that the main navigation of the Firefly website includes a number of options: Home, Gallery, Favorites, About, and FAQ. The Gallery section contains generated images that users have submitted to be featured on this page.Clicking the Submit to Firefly gallery option will summon a submission overlay through which you can request that your image is included in the Gallery.Image 11: Firefly Gallery submissionSimply read over the details and click Continue or Cancel to return.The second item, Use as reference image, brings up a small overlay that includes the selected image to use as a reference along with a strength slider.Image 12: Reference image sliderMoving the slider to the left will favor the reference image and moving it to the right will favor the raw prompt instead. You must click the Generate button after adjusting the slider to see its effect.The final option is Copy to clipboard – which does exactly as you’d expect. Note that Content Credentials are applied in this case just the same as they are when downloading an image. You can read more about this feature in the Firefly FAQ.DownloadBack up to the set of three controls, the middle option allows you to initiate a Download of the selected image. As Firefly begins preparing the image for download, a small overlay dialog appears.Image 13: Download applies content credentials – similar to the Copy to clipboard optionFirefly applies metadata to any generated image in the form of content credentials and the image download process begins. We’ve covered exactly what this means in previous articles. 
The image is then downloaded to your local file system.FavoriteClicking the Favorite control will add the generated image to your Firefly Favorites so that you can return to the generated set of images for further manipulation or to download later on.Image 14: Adding a favoriteThe Favorite control works as a toggle. Once you declare a favorite, the heart icon will appear filled and the control will allow you to un-favorite the selected image instead.That covers the main set of controls which overlay the right of your image – but there is a smaller set of controls on the left that we must explore as well.Additional Manipulation OptionsThe alternative set of controls numbers only two – but they are both very powerful. To the left is the Show similar control and to the right is Generative fill.Image 15: Show similar and Generative fill controlsClicking upon the Show similar control will retain the particular, chosen image while regenerating the other three to be more in conformity with the image specified.Image 16: Show similar will refresh the other three imagesAs you can see when comparing the sets of images in the figures above and below… you can have great influence over your set of generated images through this control.Image 17: The original image stays the sameThe final control we will examine in this article is Generative fill. It is located right next to the Show similar control.The generative fill view presents us with a separate view and a number of all-new tools for making selections in order to add or remove content from our images.Image 18: Generative fill brings you to a different view altogetherGenerative fill is actually its own proper procedure in Adobe Firefly… and we’ll explore how to use this feature in full - in the next article! Author BioJoseph Labrecque is a Teaching Assistant Professor, Instructor of Technology, University of Colorado Boulder / Adobe Education Leader / Partner by DesignJoseph is a creative developer, designer, and educator with nearly two decades of experience creating expressive web, desktop, and mobile solutions. He joined the University of Colorado Boulder College of Media, Communication, and Information as faculty with the Department of Advertising, Public Relations, and Media Design in Autumn 2019. His teaching focuses on creative software, digital workflows, user interaction, and design principles and concepts. Before joining the faculty at CU Boulder, he was associated with the University of Denver as adjunct faculty and as a senior interactive software engineer, user interface developer, and digital media designer.Labrecque has authored a number of books and video course publications on design and development technologies, tools, and concepts through publishers which include LinkedIn Learning (Lynda.com), Peachpit Press, and Adobe. He has spoken at large design and technology conferences such as Adobe MAX and for a variety of smaller creative communities. He is also the founder of Fractured Vision Media, LLC; a digital media production studio and distribution vehicle for a variety of creative works.Joseph is an Adobe Education Leader and member of Adobe Partners by Design. He holds a bachelor’s degree in communication from Worcester State University and a master’s degree in digital media studies from the University of Denver.Author of the book: Mastering Adobe Animate 2023 
Generative Recolor with Adobe Firefly

Joseph Labrecque
23 Aug 2023
10 min read
Adobe Firefly Overview

Adobe Firefly is a new set of generative AI tools which can be accessed via https://firefly.adobe.com/ by anyone with an Adobe ID. To learn more about Firefly… have a look at their FAQ:

Image 1: Adobe Firefly

For more information about the usage of Firefly to generate images, text effects, and more… have a look at the previous articles in this series:

Animating Adobe Firefly Content with Adobe Animate
Exploring Text to Image with Adobe Firefly
Generating Text Effects with Adobe Firefly
Adobe Firefly Feature Deep Dive
Generative Fill with Adobe Firefly (Part I)
Generative Fill with Adobe Firefly (Part II)

This current Firefly article will focus on a unique use of AI prompts via the Generative recolor module.

Generative Recolor and SVG

While most procedures in Firefly are focused on generating imagery through text prompts, the service also includes modules that use prompt-driven AI a bit differently. The subject of this article, Generative recolor, is a perfect example of this.

Generative recolor works with vector artwork in the form of SVG files. If you are unfamiliar with SVG, it stands for Scalable Vector Graphics and is an XML-based format, so it uses text-based nodes similar to HTML:

Image 2: An SVG file is composed of vector information defining points, paths, and colors

As the name indicates, we are dealing with vector graphics here and not photographic, pixel-based bitmap images. Vectors are often used for artwork, logos, and such, as they can be infinitely scaled and easily recolored.

One of the best ways of generating SVG files is by designing them in a vector-based design tool like Adobe Illustrator. Once you have finished designing your artwork, you'll save it as SVG for use in Firefly:

Image 3: Cat artwork designed in Adobe Illustrator

To convert your Illustrator artwork to SVG, perform the following steps:

1. Choose File > Save As to open the Save As dialog.
2. Choose SVG (svg) for the file format:

Image 3: Selecting SVG (svg) as the file format

3. Browse to the location on your computer you would like to save the file to.
4. Click the Save button.

You now have an SVG file ready to recolor within Firefly. If you desire, you can download the provided cat.svg file that we will work on in this article.

Recolor Vector Artwork with Generative Recolor

Generative recolor, like all Firefly modules, can be found directly at https://firefly.adobe.com/ so long as you are logged in with your Adobe ID. From the main Firefly page, you will find a number of modules for different AI-driven tasks:

Image 4: Locate the Generative recolor Firefly module

Let's explore Generative recolor in Firefly:

1. You'll want to locate the module named Generative recolor.
2. Click the Generate button to get started.

You are taken to an intermediate view where you are able to upload your chosen SVG file for the purposes of vector recolor based upon a descriptive text prompt:

Image 5: The Upload SVG button prompt appears, along with sample files

3. Click the Upload SVG button and choose cat.svg from your file system. Of course, you can use any SVG file you want if you have another in mind.
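A brief aside before continuing: because an SVG file is plain XML, the recoloring idea that Firefly automates can also be sketched programmatically. The snippet below is a minimal, hypothetical Python example that parses a file such as cat.svg, collects the fill colors it finds, and swaps them for a new palette. The palette values and file names are assumptions for illustration, the script only handles fills set as element attributes (not styles defined via CSS), and it says nothing about how Firefly itself interprets prompts.

```python
import xml.etree.ElementTree as ET

# A hypothetical palette, loosely inspired by a "northern lights" prompt.
palette = ["#0b3d2e", "#1b998b", "#7b2cbf", "#c7f9cc"]

tree = ET.parse("cat.svg")  # assumes cat.svg sits next to this script
root = tree.getroot()

# Collect each distinct fill color used as an attribute in the artwork...
fills = []
for node in root.iter():
    fill = node.get("fill")
    if fill and fill != "none" and fill not in fills:
        fills.append(fill)

# ...and map every original color onto a color from the new palette.
mapping = {old: palette[i % len(palette)] for i, old in enumerate(fills)}

for node in root.iter():
    fill = node.get("fill")
    if fill in mapping:
        node.set("fill", mapping[fill])

tree.write("cat_recolored.svg")
print(f"Recolored {len(mapping)} fill colors.")
```

With that detour out of the way, back to the Firefly workflow.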
If you do not have an SVG file you’d like to use, you can click on any of the samples presented below the Upload SVG button to load one up into the module.The SVG is uploaded and a control appears which displays a preview of your file along with a text input where you can write a short text prompt describing the color palette you’d like to generate:Image 6: The Generative recolor input requests a text prompt4.     Think of some descriptive words for an interesting color palette and type them into the text input. I’ll input the following simple prompt for this demonstration: “northern lights”.5.     Click the Generate button when ready.You are taken into the primary Generative recolor user interface experience and a set of four color variants is immediately available for preview:Image 7: The Firefly Generative recolor user interfaceThe interface appears similar to what you might have seen in other Firefly modules – but there are some key differences here, since we are dealing with recoloring vector artwork.The larger, left-most area contains a set of four recolor variants to choose from. Below this is the prompt input area which displays the current text prompt and a Refresh button that allows the generation of additional variants when the prompt is updated. To the right of this area are presented various additional options within a clean user interface that scrolls vertically. Let’s explore these from top to bottom.The first thing you’ll see is a thumbnail of your original artwork with the ability to replace the present SVG with a new file:Image 8: You can replace your artwork by uploading a new SVG fileDirectly below this, you will find a set of sample prompts that can be applied to your artwork:Image 9: Sample prompts can provide immediate resultsClicking upon any of these thumbnails will immediately overwrite the existing prompt and cause a refresh – generating a new set of four recolor variants.Next, is a dropdown selection which allows the choice of color harmony:Image 10: A number of color harmonies are availableChoosing to align the recolor prompt with a color harmony will impact which colors are being used based off a combination of the raw prompt – guided by harmonization rules. An indicator will be added along with the text prompt.For more information about color and color harmonies, check out Understanding color: A visual guide – from Adobe.Below is a set of eighteen color swatches to choose from:Image 11: Color chips can add bias to your text promptClicking on any of these swatches will add that color to the options below your text prompt to help guide the recolor process. You can select one or many of these swatches to use.Finally, at the very bottom of this area is a toggle switch that allows you to either preserve black and white colors in your artwork or to recolor them just like any other color:Image 12: You can choose to preserve black and white during a recolor session or notThat is everything along the right-hand side of the interface. We’ll return to this area shortly – but for now… let’s see the options that appear when hovering the mouse cursor over any of the four recolor variants:Image 13: The Generative recolor overlayHovering over a recolor variant will reveal a number of options:       Prominent colors: Displays the colors used in this recolor variant.       Shuffle colors: Will use the same colors… but distribute them differently across the vector artwork.       Options: Copy to clipboard is the only option that is available via this menu.       
Download: Enables the download of this particular recolor variant.       Rate this result: Provide a positive or negative rating of this result.We’ll make use of the Download option in a bit – but first… let’s make use of some of the choices present in the right side panel to modify and guide our recolor.Modifying the PromptYou can always change the text prompt however you wish and click the Refresh button to generate a different set of variants. Let’s instead keep this same text prompt but see how various choices can impact how it affects the recolor results:Image 14: A modified prompt box with options addedFocus again on the right side of the user interface and make the following selections:1.     Select a color harmony: Complementary2.     Choose a couple of colors to weight the prompt: Green and Blue violet3.     Disable the Preserve black and white toggle4.     Click the Refresh button to see the results of these optionsA new set of four recolor variants is produced. This set of variants is guided by the extra choices we made and is vastly different from the original set which was recolored solely based upon the text prompt:Image 15: A new set of recolor variations is generatedPlay with the various options on your own to see what kind of variations you can achieve in the artwork.Downloading your Recolored ArtworkOnce you are happy with one of the generated recolored variants, you’ll want to download it for use elsewhere. Click the Download button in the upper right of the selected variant to begin the download process for your recolored SVG file.The recolored SVG file is immediately downloaded to your computer. Note that unlike other content generated with Firefly, files created with Generative recolor do not contain a Firefly watermark or badge:Image 17: The resulting recolored SVG fileThat’s all there is to it! You can continue creating more recolor variants and freely download any that you find particularly interesting.Before we conclude… note that another good use for Generative recolor – similar to most applications of AI – is for ideation. If you are stuck with a creative block when trying to decide on a color palette for something you are designing… Firefly can help kick-start that process for you.Author BioJoseph is a Teaching Assistant Professor, Instructor of Technology, University of Colorado Boulder / Adobe Education Leader / Partner by DesignJoseph Labrecque is a creative developer, designer, and educator with nearly two decades of experience creating expressive web, desktop, and mobile solutions. He joined the University of Colorado Boulder College of Media, Communication, and Information as faculty with the Department of Advertising, Public Relations, and Media Design in Autumn 2019. His teaching focuses on creative software, digital workflows, user interaction, and design principles and concepts. Before joining the faculty at CU Boulder, he was associated with the University of Denver as adjunct faculty and as a senior interactive software engineer, user interface developer, and digital media designer.Labrecque has authored a number of books and video course publications on design and development technologies, tools, and concepts through publishers which include LinkedIn Learning (Lynda.com), Peachpit Press, and Adobe. He has spoken at large design and technology conferences such as Adobe MAX and for a variety of smaller creative communities. 
He is also the founder of Fractured Vision Media, LLC; a digital media production studio and distribution vehicle for a variety of creative works.Joseph is an Adobe Education Leader and member of Adobe Partners by Design. He holds a bachelor’s degree in communication from Worcester State University and a master’s degree in digital media studies from the University of Denver.Author of the book: Mastering Adobe Animate 2023 
Adobe Firefly Integrations in Illustrator and Photoshop

Joseph Labrecque
23 Aug 2023
12 min read
Adobe Firefly OverviewAdobe Firefly is a new set of generative AI tools which can be accessed via https://firefly.adobe.com/ by anyone with an Adobe ID. To learn more about Firefly… have a look at their FAQ:Image 1: Adobe FireflyFor more information around the usage of Firefly to generate images, text effects, and more… have a look at the previous articles in this series:       Animating Adobe Firefly Content with Adobe Animate       Exploring Text to Image with Adobe Firefly       Generating Text Effects with Adobe Firefly       Adobe Firefly Feature Deep Dive      Generative Fill with Adobe Firefly (Part I)      Generative Fill with Adobe Firefly (Part II)       Generative Recolor with Adobe Firefly       Adobe Firefly and Express (beta) IntegrationThis current Firefly article will focus on Firefly integrations within the release version of Adobe Illustrator and the public beta version of Adobe Photoshop.Firefly in Adobe IllustratorVersion 27.7 is the most current release of Illustrator at the writing of this article and this version contains Firefly integrations in the form of Generative Recolor (Beta).To access this, design any vector artwork within Illustrator or open existing artwork to get started. I’m using the cat.ai file that was used to generate the cat.svg file used in the Generative Recolor with Adobe Firefly article:Image 2: The cat vector artwork with original colors1.     Select the artwork you would like to recolor. Artwork must be selected for this to work.2.     Look to the Properties panel and locate the Quick Actions at the bottom of the panel. Click the Recolor quick action:Image 3: Choosing the Recolor Quick action3.     By default, the Recolor overlay will open with the Recolor tab active. Switch to the Generative Recolor (Beta) tab to activate it instead:Image 4: The Generative Recolor (Beta) view4.     You are invited to enter a prompt. I’ve written “northern lights green and vivid neon” as my prompt that describes colors I’d like to see. There are also sample prompts you can click on below the prompt input box.5.     Click the Generate button once a prompt has been entered:Image 5: Selecting a Recolor variantA set of recolor variants is presented within the overlay. Clicking on any of these will recolor your existing artwork according to the variant look:Image 6: Adding a specific color swatchIf you would like to provide even more guidance, you can modify the prompt and even add specific color swatches you’d like to see included in the recolored artwork.That’s it for Illustrator – very straightforward and easy to use!Firefly in Adobe Photoshop (beta)Generative Fill through Firefly is also making its way into Photoshop. While within Illustrator – we have Firefly as part of the current version of the software, albeit with a beta label on the feature, with Photoshop things are a bit different:Image 7: Generative Fill is only available in the Photoshop public betaTo make use of Firefly within Photoshop, the current release version will not cut it. You will need to install the public beta from the Creative Cloud Desktop application in order to access these features.With that in mind, let’s use Generative Fill in the Photoshop public beta to expand a photograph beyond its bounds and add in additional objects.1.     First, open a photograph in the Photoshop public beta. I’m using the Poe.jpg photograph that we previously used in the articles Generative Fill with Adobe Firefly (Parts I & II):Image 8: The original photograph in Photoshop2.     
With the photograph open, we’ll add some extra space to the canvas to generate additional content and expand the image beyond its bounds. Summon the Canvas Size dialog by choosing Image > Canvas Size… from the application menu.3.     Change both the width and height values to 200 Percent:Image 9: Expanding the size of the canvas4.     Click the OK button to close the dialog and apply the change.The original canvas is expanded to 200 percent of its original size while the image itself remains exactly the same:Image 10: The photograph with an expanded canvasGenerative Fill, when used in this manner to expand an image, works best by selecting portions to expand bit by bit rather than all the expanded areas at once. It is also beneficial to select parts of the original image you want to expand from. This feeds and directs the Firefly AI.5.     Using the Rectangular Marquee tool, make such a selection across either the top, bottom, left, or right portions of the document:Image 11: Making a selection for Generative Fill6.     With a selection established, click Generative Fill within the contextual toolbar:Image 12: Leaving the prompt blank allows Photoshop to make all the decisions7.     The contextual toolbar will now display a text input where you can enter a prompt to guide the process. However, in this case, we want to simply expand the image based upon the original pixels selected – so we will leave this blank with no prompt whatsoever. Click Generate to continue.8.     The AI processes the image and displays a set of variants to choose from within the Properties panel. Click the one that conforms closest to the imagery you are looking to produce and that is what will be used upon the canvas:Image 13: Choosing a Generative Fill variantNote that if you look to the Layers panel, you will find a new layer type has been created and added to the document layer stack:Image 14: Generative Layers are a new layer type in PhotoshopThe Generative Layer retains both the given prompt and variants so that you can continue to make changes and adjustments as needed – even following this specific point in time.The resulting expansion of the original image as performed by Generative Fill can be very convincing! As mentioned before, this often works best by performing fills in a piece-by-piece patchwork manner:Image 15: The photograph with a variant applied across the selectionContinue selecting portions of the image using the Rectangular Marquee tool (or any selection tools, really) and generate new content the same way we have done so already – without supplying any text prompt to the AI:Image 16: The photograph with all expanded areas filled via generative AIEventually, you will complete the expansion of the original image and produce a very convincing deception.Of course, you can also guide the AI with actual text prompts. Let’s add in an object to the image as a demonstration.1.     Using the Lasso tool (or again… any selection tool), make a selection across the image in the form of what might hold a standing lamp of some sort:Image 17: Making an additional selection2.     With a selection established, click Generative Fill within the contextual toolbar.3.     Type in a prompt that describes the object you want to generate. I will use the prompt “tall rustic wooden and metal lamp”.4.     Click the Generate button to process the Generative Fill request:Image 18: A lamp is generated from our selection and text promptA set of generated lamp variants are established within the Properties panel. 
Choose the one you like the most and it will be applied within the image.You will want to be careful with how many Generative Layers are produced as you work on any single document. Keep an eye on the Layers panel as you work:Image 19: Each Generative Fill process produces a new layerEach time you use Generative Fill within Photoshop, a new Generative Layer is produced.Depending upon the resources and capabilities of your computer… this might become burdensome as everything becomes more and more complex. You can always flatten your layers to a single pixel layer if this occurs to free up additional resources.That concludes our overview of Generative Fill in the Photoshop public beta!Ethical Concerns with Generative AII want to make one additional note before concluding this series and that has to do with the ethics of generative AI. This concern goes beyond Adobe Firefly specifically – as it could be argued that Firefly is the least problematic and most ethical implementation of generative AI that is available today.See https://firefly.adobe.com/faq for additional details on steps Adobe has taken to ensure responsible AI through their use of Adobe Stock content to train their models, through the use of Content Credentials, and more...Like all our AI capabilities, Firefly is developed and deployed around our AI ethics principles of accountability, responsibility, and transparency.Data collection: We train our model by collecting diverse image datasets, which have been curated and preprocessed to mitigate against harmful or biased content. We also recognize and respect artists’ ownership and intellectual property rights. This helps us build datasets that are diverse, ethical, and respectful toward our customers and our community.Addressing bias and testing for safety and harm: It’s important to us to create a model that respects our customers and aligns with our company values. In addition to training on inclusive datasets, we continually test our model to mitigate against perpetuating harmful stereotypes. We use a range of techniques, including ongoing automated testing and human evaluation.Regular updates and improvements: This is an ongoing process. We will regularly update Firefly to improve its performance and mitigate harmful bias in its output. We also provide feedback mechanisms for our users to report potentially biased outputs or provide suggestions into our testing and development processes. We are committed to working together with our customers to continue to make our model better.-- AdobeI have had discussions with a number of fellow educators about the ethical use of generative AI and Firefly in general. 
Here are some paraphrased takeaways to consider as we conclude this article series:      “We must train the new generations in the respect and proper use of images or all kinds of creative work.”      “I don't think Ai can capture that sensitive world that we carry as human beings.”      “As dire as some aspects of all of this are, I see opportunities.”      “Thousands of working artists had their life's work unknowingly used to create these images.”       “Professionals will be challenged, truly, by all of this, but somewhere in that process I believe we will find our space.”      “AI data expropriations are a form of digital colonialism.”      “For many students, the notion of developing genuine skill seems pointless now.”     “Even for masters of the craft, it’s dispiriting to see someone type 10 words and get something akin to what took them 10 years.”I’ve been using generative AI for a few years now and can appreciate and understand the concerns expressed above - but also recognize that this technology is not going away. We must do what we can to address the ethical concerns brought up here and make sure to use our awareness of these problematic issues to further guide the direction of these technologies as we rapidly advance forward. These are very challenging times, right now. Author BioJoseph Labrecque is a Teaching Assistant Professor, Instructor of Technology, University of Colorado Boulder / Adobe Education Leader / Partner by DesignJoseph Labrecque is a creative developer, designer, and educator with nearly two decades of experience creating expressive web, desktop, and mobile solutions. He joined the University of Colorado Boulder College of Media, Communication, and Information as faculty with the Department of Advertising, Public Relations, and Media Design in Autumn 2019. His teaching focuses on creative software, digital workflows, user interaction, and design principles and concepts. Before joining the faculty at CU Boulder, he was associated with the University of Denver as adjunct faculty and as a senior interactive software engineer, user interface developer, and digital media designer.Labrecque has authored a number of books and video course publications on design and development technologies, tools, and concepts through publishers which include LinkedIn Learning (Lynda.com), Peachpit Press, and Adobe. He has spoken at large design and technology conferences such as Adobe MAX and for a variety of smaller creative communities. He is also the founder of Fractured Vision Media, LLC; a digital media production studio and distribution vehicle for a variety of creative works.Joseph is an Adobe Education Leader and member of Adobe Partners by Design. He holds a bachelor’s degree in communication from Worcester State University and a master’s degree in digital media studies from the University of Denver.Author of the book: Mastering Adobe Animate 2023 
Getting Started with AWS CodeWhisperer

Rohan Chikorde
23 Aug 2023
11 min read
IntroductionEfficiently writing secure, high-quality code within tight deadlines remains a constant challenge in today's fast-paced software development landscape. Developers often face repetitive tasks, code snippet searches, and the need to adhere to best practices across various programming languages and frameworks. However, AWS CodeWhisperer, an innovative AI-powered coding companion, aims to transform the way developers work. In this blog, we will explore the extensive features, benefits, and setup process of AWS CodeWhisperer, providing detailed insights and examples for technical professionals.At its core, CodeWhisperer leverages machine learning and natural language processing to deliver real-time code suggestions and streamline the development workflow. Seamlessly integrated with popular IDEs such as Visual Studio Code, IntelliJ IDEA, and AWS Cloud9, CodeWhisperer enables developers to remain focused and productive within their preferred coding environment. By eliminating the need to switch between tools and external resources, CodeWhisperer accelerates coding tasks and enhances overall productivity.A standout feature of CodeWhisperer is its ability to generate code from natural language comments. Developers can now write plain English comments describing a specific task, and CodeWhisperer automatically analyses the comment, identifies relevant cloud services and libraries, and generates code snippets directly within the IDE. This not only saves time but also allows developers to concentrate on solving business problems rather than getting entangled in mundane coding tasks.In addition to code generation, CodeWhisperer offers advanced features such as real-time code completion, intelligent refactoring suggestions, and error detection. By analyzing code patterns, industry best practices, and a vast code repository, CodeWhisperer provides contextually relevant and intelligent suggestions. Its versatility extends to multiple programming languages, including Python, Java, JavaScript, TypeScript, C#, Go, Rust, PHP, Ruby, Kotlin, C, C++, Shell scripting, SQL, and Scala, making it a valuable tool for developers across various language stacks.AWS CodeWhisperer addresses the need for developer productivity tools by streamlining the coding process and enhancing efficiency. With its AI-driven capabilities, CodeWhisperer empowers developers to write clean, efficient, and high-quality code. By supporting a wide range of programming languages and integrating with popular IDEs, CodeWhisperer caters to diverse development scenarios and enables developers to unlock their full potential. Embrace the power of AWS CodeWhisperer and experience a new level of productivity and coding efficiency in your development journey.Key Features and Benefits of CodeWhisperer A. Real-time code suggestions and completionCodeWhisperer provides developers with real-time code suggestions and completion, significantly enhancing their coding experience. As developers write code, CodeWhisperer's AI-powered engine analyzes the context and provides intelligent suggestions for function names, variable declarations, method invocations, and more. This feature helps developers write code faster, with fewer errors, and improves overall code quality. By eliminating the need to constantly refer to documentation or search for code examples, CodeWhisperer streamlines the coding process and boosts productivity.B. 
Intelligent code generation from natural language commentsOne of the standout features of CodeWhisperer is its ability to generate code snippets from natural language comments. Developers can simply write plain English comments describing a specific task, and CodeWhisperer automatically understands the intent and generates the corresponding code. This powerful capability saves developers time and effort, as they can focus on articulating their requirements in natural language rather than diving into the details of code implementation. With CodeWhisperer, developers can easily translate their high-level concepts into working code, making the development process more intuitive and efficient.C. Streamlining routine or time-consuming tasksCodeWhisperer excels at automating routine or time-consuming tasks that developers often encounter during the development process. From file manipulation and data processing to API integrations and unit test creation, CodeWhisperer provides ready-to-use code snippets that accelerate these tasks. By leveraging CodeWhisperer's automated code generation capabilities, developers can focus on higher-level problem-solving and innovation, rather than getting caught up in repetitive coding tasks. This streamlining of routine tasks allows developers to work more efficiently and deliver results faster.D. Leveraging AWS APIs and best practicesAs an AWS service, CodeWhisperer is specifically designed to assist developers in leveraging the power of AWS services and best practices. It provides code recommendations tailored to AWS application programming interfaces (APIs), allowing developers to efficiently interact with services such as Amazon EC2, Lambda, and Amazon S3. CodeWhisperer ensures that developers follow AWS best practices by providing code snippets that adhere to security measures, performance optimizations, and scalability considerations. By integrating AWS expertise directly into the coding process, CodeWhisperer empowers developers to build robust and reliable applications on the AWS platform.E. Enhanced security scanning and vulnerability detectionSecurity is a top priority in software development, and CodeWhisperer offers enhanced security scanning and vulnerability detection capabilities. It automatically scans both generated and developer-written code to identify potential security vulnerabilities. By leveraging industry-standard security guidelines and knowledge, CodeWhisperer helps developers identify and remediate security issues early in the development process. This proactive approach to security ensures that code is written with security in mind, reducing the risk of vulnerabilities and strengthening the overall security posture of applications.F. Responsible AI practices to address bias and open-source usageAWS CodeWhisperer is committed to responsible AI practices and addresses potential bias and open-source usage concerns. The AI models behind CodeWhisperer are trained on vast amounts of publicly available code, ensuring accuracy and relevance in code recommendations. However, CodeWhisperer goes beyond accuracy and actively filters out biased or unfair code recommendations, promoting inclusive coding practices. Additionally, it provides reference tracking to identify code recommendations that resemble specific open source training data, allowing developers to make informed decisions and attribute sources appropriately. 
By focusing on responsible AI practices, CodeWhisperer ensures that developers can trust the code suggestions and recommendations it provides.Setting up CodeWhisperer for individual developersIf you are an individual developer who has acquired CodeWhisperer independently and will be using AWS Builder ID for login, follow these steps to access CodeWhisperer from your JetBrains IDE:1.      Ensure that the AWS Toolkit for JetBrains is installed. If it is not already installed, you can install it from the JetBrains plugin marketplace.2.      In your JetBrains IDE, navigate to the edge of the window and click on the AWS Toolkit icon. This will open the AWS Toolkit for the JetBrains panel:3. Within the AWS Toolkit for JetBrains panel, click on the Developer Tools tab. This will open the Developer Tools Explorer.4. In the Developer Tools Explorer, locate the CodeWhisperer section and expand it. Then, select the "Start" option:5. A pop-up window titled "CodeWhisperer: Add a Connection to AWS" will appear. In this window, choose the "Use a personal email to sign up" option to sign in with your AWS Builder ID.6. Once you have entered your personal email associated with your AWS Builder ID, click on the "Connect" button to establish the connection and access CodeWhisperer within your JetBrains IDE:7.      A pop-up titled "Sign in with AWS Builder ID" will appear. Select the "Open and Copy Code" option.8.      A new browser tab will open, displaying the "Authorize request" window. The copied code should already be in your clipboard. Paste the code into the appropriate field and click "Next."9.      Another browser tab will open, directing you to the "Create AWS Builder ID" page. Enter your email address and click "Next." A field for your name will appear. Enter your name and click "Next." AWS will send a confirmation code to the email address you provided.10.   On the email verification screen, enter the code and click "Verify." On the "Choose your password" screen, enter a password, confirm it, and click "Create AWS Builder ID." A new browser tab will open, asking for your permission to allow JetBrains to access your data. Click "Allow."11.   Another browser tab will open, asking if you want to grant access to the AWS Toolkit for JetBrains to access your data. If you agree, click "Allow."12.   Return to your JetBrains IDE to continue the setup process. CodeWhisperer in ActionExample Use Case: Automating Unit Test Generation with CodeWhisperer in Python (Credits: aws-solutions-library-samples):One of the powerful use cases of CodeWhisperer is its ability to automate the generation of unit test code. By leveraging natural language comments, CodeWhisperer can recommend unit test code that aligns with your implementation code. This feature significantly simplifies the process of writing repetitive unit test code and improves overall code coverage.To demonstrate this capability, let's walk through an example using Python in Visual Studio Code:        Begin by opening an empty directory in your Visual Studio Code IDE.        (Optional) In the terminal, create a new Python virtual environment:python3 -m venv .venvsource .venv/bin/activate        Set up your Python environment and ensure that the necessary dependencies are installed.pip install pytest pytest-cov               Create a new file in your preferred Python editor or IDE and name it "calculator.py".       
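Before moving on, it may help to see roughly where this walkthrough is headed. The sketch below is a hypothetical example of the kind of calculator class, and a matching pytest test, that this comment-driven flow tends to produce. CodeWhisperer's suggestions are generated on the fly, so the code you accept will almost certainly differ in its details; the divide-by-zero check and the test names here are illustrative assumptions, not guaranteed output.

```python
# calculator.py -- a hypothetical end state for the walkthrough below.
class Calculator:
    """Simple calculator class of the sort CodeWhisperer tends to suggest."""

    def add(self, a, b):
        return a + b

    def subtract(self, a, b):
        return a - b

    def multiply(self, a, b):
        return a * b

    def divide(self, a, b):
        if b == 0:
            raise ValueError("Cannot divide by zero")
        return a / b

    def square(self, a):
        return a ** 2

    def cube(self, a):
        return a ** 3

    def square_root(self, a):
        return a ** 0.5


# test_calculator.py -- save this part as a separate file and run `pytest --cov`.
import pytest
from calculator import Calculator

def test_add():
    assert Calculator().add(2, 3) == 5

def test_divide_by_zero():
    with pytest.raises(ValueError):
        Calculator().divide(1, 0)
```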
Add the following comment at the beginning of the file to indicate your intention to create a simple calculator class:

# example Python class for a simple calculator

Once you've added the comment, press the "Enter" key to proceed. CodeWhisperer will analyze your comment and start generating code suggestions based on the desired functionality. To accept the suggested code, simply press the "Tab" key in your editor or IDE.

Picture Credit: aws-solutions-library-samples

In case CodeWhisperer does not provide automatic suggestions, you can manually trigger it to generate recommendations using the following keyboard shortcuts:

For Windows/Linux users, press "Alt + C".
For macOS users, press "Option + C".

If you want to view additional suggestions, you can navigate through them by pressing the Right arrow key. To access previous suggestions, press the Left arrow key. If you wish to reject a recommendation, you can either press the ESC key or use the backspace/delete key.

To continue building the calculator class, proceed by pressing the Enter key and accepting CodeWhisperer's suggestions, whether they are provided automatically or triggered manually. CodeWhisperer will propose basic functions for the calculator class, including add(), subtract(), multiply(), and divide(). In addition to these fundamental operations, it can also suggest more advanced functions like square(), cube(), and square_root().

By following these steps, you can leverage CodeWhisperer to enhance your coding workflow and efficiently develop the calculator class, benefiting from a range of pre-generated functions tailored to your specific needs.

Conclusion

AWS CodeWhisperer is a groundbreaking tool that has the potential to revolutionize the way developers work. By harnessing the power of AI, CodeWhisperer provides real-time code suggestions and automates repetitive tasks, enabling developers to focus on solving core business problems. With seamless integration into popular IDEs and support for multiple programming languages, CodeWhisperer offers a comprehensive solution for developers across different domains. By leveraging CodeWhisperer's advanced features, developers can enhance their productivity, reduce errors, and ensure the delivery of high-quality code. As CodeWhisperer continues to evolve, it holds the promise of driving accelerated software development and fostering innovation in the developer community.

Author Bio

Rohan Chikorde is an accomplished AI Architect with a postgraduate degree in Machine Learning and Artificial Intelligence. With almost a decade of experience, he has successfully developed deep learning and machine learning models for various business applications. Rohan's expertise spans multiple domains, and he excels in programming languages such as R and Python, as well as analytics techniques like regression analysis and data mining. In addition to his technical prowess, he is an effective communicator, mentor, and team leader. Rohan's passion lies in machine learning, deep learning, and computer vision.

LinkedIn
Co-Pilot & Microsoft Fabric for Power BI

Sagar Lad
23 Aug 2023
8 min read
IntroductionMicrosoft's data platform solution for the modern era is called Fabric. Microsoft's three primary data analytics tools:  Power BI, Azure Data Factory, and Azure Synapse all covered under Fabric. Advanced artificial intelligence capabilities built on machine learning and natural language processing (NLP) are made available to Power BI customers through Copilot. In this article, we will deep dive into how co-pilot and Microsoft Fabric will transform the way we develop and work with Power BI.Co-Pilot and Fabric with Power BIThe urgent requirement for businesses to turn their data into value is something that both Microsoft Fabric and Copilot aspire to address. Big Data continues to fall short of its initial promises even after years have passed. Every year, businesses generate more data, yet a recent IBM study found that 90% of this data is never successfully exploited for any kind of strategic purpose. So, more data does not mean more value or business insight. Data fragmentation and poor data quality are the key obstacles to releasing the value of data. These problems are what Microsoft hopes to address with Microsoft Fabric, a human-centric, end-to-end analytics product that brings together all of an organization's data and analytics in one place. Copilot has now been integrated into Power BI. Large multi-modal artificial intelligence models based on natural language processing have gained attention since the publication of ChatGPT. Beyond casuistry, Microsoft Fabric and Copilot share a trait in that they each want to transform the Power BI user interface.●       Microsoft Fabric and Power BIMicrosoft Fabric is just Synapse and Power BI together. By combining the benefits of the Power BI SaaS platform with the various Synapse workload types, Microsoft Fabric creates an environment that is more cohesive, integrated, and easier to use for all of the associated profiles. However, Power BI Premium users will get access to new opportunities for data science, data engineering, etc. Power BI will continue to function as it does right now. Data analysts and Power BI developers are not required to begin using Synapse Data Warehouse if they do not want to. Microsoft wants to combine all of its data offerings into one, called Fabric, just like it did with Office 365:Image 1: Microsoft Fabric (Source: Microsoft)Let’s understand in detail how Microsoft Fabric will make life easier for Power BI developers.1.     Data IngestionThere are various methods by which we can connect to data sources in Fabric in order to consume data. For example, utilising Spark notebooks or pipelines, for instance. This may be unknown to the Power BI realm, though.                                                       Image 2: Data Transformation in Power BI Instead, we can ingest the data using dataflows gen2, which will save it on OneLake in the proper format.2.     Ad Hoc Query One or more dataflows successfully published and refreshed will show in the workspace along with a number of other artifacts. The SQL Endpoint artifact is one of them. We can begin creating on-demand SQL queries and saving them as views after you open them. As an alternative, we can also create visual queries which will enable us to familiarise ourselves with the data flow diagram view. Above all, however, is the fact that this interface shares many characteristics with Power BI Data Marts, making it a familiar environment for those familiar with Power BI:   Image 3: Power BI - One Data Lake Hub3.     
Data ModellingWith the introduction of web modelling for Power BI, we can introduce new metrics and start establishing linkages between different tables right away in this interface. The default workspace where the default dataset is kept will automatically contain the data model. The new storage option Direct Lake is advantageous for the datasets created in this manner via the cloud interface. By having just one copy of data in OneLake, this storage style prevents data duplication and unnecessary data refreshes.●       Co-Pilot and Power BI Copilot, a new artificial intelligence framework for Power BI is an offering from Microsoft. CoPilot is Power BI's expensive multimodal artificial intelligence model that is built on natural language processing. It might be compared to the ChatGPT of Power BI. Users will be able to ask inquiries about data, generate graphics, and DAX measures by providing a brief description of what they need thanks to the addition of Copilot to Power BI. For instance, it demonstrates how a brief statement of the user's preferences for the report:"Add a table of the top 500 MNC IT Companies by total sales to my model”.The DAX code required to generate measures and tables is generated automatically by the algorithm.Copilot enables:●       Power BI reports can be created and customized to provide insights.●       Create and improve DAX computations.●       Inquire about your data.●       Publish narrative summaries.●       Ease of Use●       Faster Time to Market Key Features of the Power BI Copilot are as follows: ●       Automated report generationCopilot can create well-designed dashboards, data narratives, and interactive components automatically, saving time and effort compared to manually creating reports.●       Conversational language interfaceWe can use everyday language to express data requests and inquiries, making it simpler to connect with your data and gain insights. ●        Real-time analyticsCopilot's real-time analytics capabilities can be used by Power BI customers to view data and react swiftly to shifts and trends. Let’s look at the step-by-step process on how to use Copilot for Power BI:Step 1: Open Power BI and go to the Copilot tab screen,Step 2:  Type a query pertaining to the data for example to produce a financial report or pick from a list of suggestions that Copilot has automatically prepared for you.Step 3: Copilot sorts through and analyses data to provide the information.Step 4: Copilot compiles a visually stunning report, successfully converting complex data into easily comprehended, practical information.Step 5: Investigate data even more by posing queries, writing summaries to present to stakeholders, and more. There are also a few limitations to using the Copilot features with Power BI: ●       Reliability for the recommendationsAll programming languages that are available in public sources have been taught to Copilot, ensuring the quality of its proposals. The quantity of the training dataset that is accessible for that language, however, may have an impact on the quality of the suggestions. 
APL, Erlang, and other specialized programming languages' suggestions won't be as useful as those for more widely used ones like Python, Java, etc.●       Privacy and security issuesThere are worries that the model, which was trained on publicly accessible code, can unintentionally recommend code fragments that have security flaws or were intended to be private.●       Dependence on comments and namingThe user is responsible for accuracy because the AI can provide suggestions that are more accurate when given specific comments and descriptive variable names.●       Lack of original solutions and inability to creatively solve problems. Unlike a human developer, the tool is unable to do either. It can only make code suggestions based on the training data.●       Inefficient codebaseThe tool is not designed for going through and comprehending big codebases. It works best when recommending code for straightforward tasks.ConclusionThe combination of Microsoft Copilot and Fabric with Power BI has the ability to completely alter the data modelling field. It blends sophisticated generative AI with data to speed up the discovery and sharing of insights by everyone. By enabling both data engineers and non-technical people to examine data using AI models, it is transforming Power BI into a human-centered analytics platform. Author Bio: Sagar Lad is a Cloud Data Solution Architect with a leading organization and has deep expertise in designing and building Enterprise-grade Intelligent Azure Data and Analytics Solutions. He is a published author, content writer, Microsoft Certified Trainer, and C# Corner MVP.Medium , Amazon , LinkedIn   