
How-To Tutorials


Building an API for Language Model Inference using Rust and Hyper - Part 2

Alan Bernardo Palacio
31 Aug 2023
10 min read
Introduction

In our previous exploration, we delved into the world of Large Language Models (LLMs) in Rust. Through the lens of the llm crate and the transformative potential of LLMs, we painted a picture of the current state of AI integrations within the Rust ecosystem. But knowledge is only as valuable as its application, so we now move from understanding the 'how' of LLMs to applying that knowledge in real-world scenarios.

Welcome to the second part of our Rust LLM series. In this article, we roll up our sleeves to architect and deploy an inference server using Rust. Leveraging the fast and efficient Hyper HTTP library, our server will not just respond to incoming requests but will infer and communicate like a human. We'll guide you through the step-by-step process of setting up, routing, and serving inferences from the server, all while staying anchored to the foundational insights from our last discussion.

For developers eager to see Rust, Hyper, and LLMs working together, this guide promises to be a rewarding endeavor. By the end, you'll be equipped to set up a server that can converse intelligently, understand prompts, and provide insightful responses. So, as we progress from the intricacies of the llm crate to building a real-world application, join us in taking a step toward making AI-powered interactions an everyday reality.

Imports and Data Structures

Let's start by looking at the import statements and data structures used in the code:

```rust
use hyper::service::{make_service_fn, service_fn};
use hyper::{Body, Request, Response, Server};
use std::net::SocketAddr;
use serde::{Deserialize, Serialize};
use std::{convert::Infallible, io::Write, path::PathBuf};
```

●    hyper: Hyper is a fast and efficient HTTP library for Rust.
●    SocketAddr: Used to specify the socket address (IP and port) for the server.
●    serde: Serde is a powerful serialization/deserialization framework in Rust.
●    Deserialize, Serialize: Serde traits for automatic serialization and deserialization.

Next, we have the data structures used for deserializing JSON request data and serializing response data:

```rust
#[derive(Debug, Deserialize)]
struct ChatRequest {
    prompt: String,
}

#[derive(Debug, Serialize)]
struct ChatResponse {
    response: String,
}
```

1. ChatRequest: A struct representing the incoming JSON request, containing a prompt field.
2. ChatResponse: A struct representing the JSON response, containing a response field.

Inference Function

The infer function is responsible for performing language model inference:

```rust
fn infer(prompt: String) -> String {
    let tokenizer_source = llm::TokenizerSource::Embedded;
    let model_architecture = llm::ModelArchitecture::Llama;
    let model_path = PathBuf::from("/path/to/model");
    let prompt = prompt.to_string();
    let now = std::time::Instant::now();

    let model = llm::load_dynamic(
        Some(model_architecture),
        &model_path,
        tokenizer_source,
        Default::default(),
        llm::load_progress_callback_stdout,
    )
    .unwrap_or_else(|err| {
        panic!("Failed to load {} model from {:?}: {}", model_architecture, model_path, err);
    });

    println!(
        "Model fully loaded! Elapsed: {}ms",
        now.elapsed().as_millis()
    );

    let mut session = model.start_session(Default::default());
    let mut generated_tokens = String::new(); // Accumulate generated tokens here

    let res = session.infer::<Infallible>(
        model.as_ref(),
        &mut rand::thread_rng(),
        &llm::InferenceRequest {
            prompt: (&prompt).into(),
            parameters: &llm::InferenceParameters::default(),
            play_back_previous_tokens: false,
            maximum_token_count: Some(140),
        },
        // OutputRequest
        &mut Default::default(),
        |r| match r {
            llm::InferenceResponse::PromptToken(t) | llm::InferenceResponse::InferredToken(t) => {
                print!("{t}");
                std::io::stdout().flush().unwrap();
                // Accumulate generated tokens
                generated_tokens.push_str(&t);
                Ok(llm::InferenceFeedback::Continue)
            }
            _ => Ok(llm::InferenceFeedback::Continue),
        },
    );

    // Return the accumulated generated tokens
    match res {
        Ok(_) => generated_tokens,
        Err(err) => format!("Error: {}", err),
    }
}
```

●    The infer function takes a prompt as input and returns a string containing the generated tokens.
●    It loads a language model, sets up an inference session, and accumulates generated tokens.
●    The res variable holds the result of the inference, and a closure handles each inference response.
●    The function returns the accumulated generated tokens or an error message.

Request Handler

The chat_handler function handles incoming HTTP requests:

```rust
async fn chat_handler(req: Request<Body>) -> Result<Response<Body>, Infallible> {
    let body_bytes = hyper::body::to_bytes(req.into_body()).await.unwrap();
    let chat_request: ChatRequest = serde_json::from_slice(&body_bytes).unwrap();

    // Call the `infer` function with the received prompt
    let inference_result = infer(chat_request.prompt);

    // Prepare the response message
    let response_message = format!("Inference result: {}", inference_result);
    let chat_response = ChatResponse {
        response: response_message,
    };

    // Serialize the response and send it back
    let response = Response::new(Body::from(serde_json::to_string(&chat_response).unwrap()));
    Ok(response)
}
```

●    chat_handler asynchronously handles incoming requests by deserializing the JSON payload.
●    It calls the infer function with the received prompt and constructs a response message.
●    The response is serialized as JSON and sent back in the HTTP response.

Router and Not Found Handler

The router function maps incoming requests to the appropriate handlers:

```rust
async fn router(req: Request<Body>) -> Result<Response<Body>, Infallible> {
    match (req.uri().path(), req.method()) {
        ("/api/chat", &hyper::Method::POST) => chat_handler(req).await,
        _ => not_found(),
    }
}
```

●    router matches incoming requests based on the path and HTTP method.
●    If the path is "/api/chat" and the method is POST, it calls chat_handler.
●    If no match is found, it calls the not_found function (not shown here), which simply returns a "Not Found" response.

Main Function

The main function initializes the server and starts listening for incoming connections:

```rust
#[tokio::main]
async fn main() {
    println!("Server listening on port 8083...");
    let addr = SocketAddr::from(([0, 0, 0, 0], 8083));

    let make_svc = make_service_fn(|_conn| {
        async { Ok::<_, Infallible>(service_fn(router)) }
    });

    let server = Server::bind(&addr).serve(make_svc);

    if let Err(e) = server.await {
        eprintln!("server error: {}", e);
    }
}
```

Building and Running the Server

In this section, we'll walk through the steps to build and run the server that performs language model inference using Rust and the Hyper framework. We'll also demonstrate how to make a POST request to the server using Postman.

1. Install Rust: If you haven't already, install Rust on your machine. You can download it from the official website: https://www.rust-lang.org/tools/install

2. Create a New Rust Project: Create a new directory for your project, navigate to it in the terminal, and run the following command to create a new Rust project:

```
cargo new language_model_server
```

This command creates a new directory named language_model_server containing the basic structure of a Rust project.

3. Add Dependencies: Open the Cargo.toml file in the language_model_server directory and add the required dependencies for Hyper and the other libraries. Your Cargo.toml file should look something like this:

```toml
[package]
name = "llm_handler"
version = "0.1.0"
edition = "2018"

[dependencies]
hyper = { version = "0.13" }
tokio = { version = "0.2", features = ["macros", "rt-threaded"] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
llm = { git = "https://github.com/rustformers/llm.git" }
rand = "0.8.5"
```

Make sure to adjust the version numbers according to the latest versions available.

4. Replace Code: Replace the content of the src/main.rs file in your project directory with the code from the earlier sections.

5. Build the Server: In the terminal, navigate to your project directory and run the following command to build the server:

```
cargo build --release
```

This compiles your code and produces an executable binary in the target/release directory.

Running the Server

1. Running the Server: After building the server, you can run it using the following command:

```
cargo run --release
```

Your server will start listening on port 8083.

2. Accessing the Server: Open a web browser and navigate to http://localhost:8083. You should see the message "Not Found", indicating that the server is up and running.

Making a POST Request Using Postman

1. Install Postman: If you don't have Postman installed, you can download it from the official website: https://www.postman.com/downloads/

2. Create a POST Request:
   ●    Open Postman and create a new request.
   ●    Set the request type to "POST".
   ●    Enter the URL: http://localhost:8083/api/chat
   ●    In the "Body" tab, select "raw" and set the content type to "JSON (application/json)".
   ●    Enter the following JSON request body:

```json
{ "prompt": "Rust is an amazing programming language because" }
```

3. Send the Request: Click the "Send" button to make the POST request to your server.

4. View the Response: You should receive a response from the server containing the inference result generated by the language model. If you prefer scripting the request instead of using Postman, see the short snippet below.
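As a scripted alternative to Postman, the following minimal sketch sends the same request with Python's requests library. It assumes the server from this article is running locally on port 8083 and exposes the /api/chat route described above.

```python
# Minimal sketch: POST a prompt to the inference server described above.
# Assumes the server is running locally on port 8083 and exposes /api/chat.
import requests

payload = {"prompt": "Rust is an amazing programming language because"}
resp = requests.post("http://localhost:8083/api/chat", json=payload, timeout=600)
resp.raise_for_status()

# The handler returns a JSON object with a single "response" field.
print(resp.json()["response"])
```

The long timeout is deliberate: CPU inference on a multi-billion-parameter model can take a while per request.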
Conclusion

In the previous article, we introduced the foundational concepts, setting the stage for the hands-on application we delved into this time. In this article, our main goal was to bridge theory with practice. Using the llm crate alongside the Hyper library, we set out to create a server capable of understanding and executing language model inference. But our work was about more than just setting up a server; it illustrated the synergy between Rust, a language famed for its safety and concurrency features, and the vast world of AI.

What's especially encouraging is how this project can serve as a springboard for many more innovations. With the foundation laid out, there are numerous avenues to explore, from refining the server's performance to integrating more advanced features or scaling it for larger audiences.

If there's one key takeaway from our journey, it's the importance of continuous learning and experimentation. The tech landscape is ever-evolving, and the confluence of AI and programming offers fertile ground for innovation. As we conclude this series, our hope is that the knowledge shared acts as both a source of inspiration and a practical guide. Whether you're a seasoned developer or a curious enthusiast, the tools and techniques we've discussed can pave the way for your own unique creations. So, as you move forward, keep experimenting, iterating, and pushing the boundaries of what's possible. Here's to many more coding adventures ahead!

Author Bio

Alan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young and Globant, and now holds a data engineer position at Ebiquity Media, helping the company create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later earned a Master's degree from the Faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.

LinkedIn


Building an API for Language Model Inference using Rust and Hyper - Part 1

Alan Bernardo Palacio
31 Aug 2023
7 min read
Introduction

In the landscape of artificial intelligence, the capacity to bring sophisticated Large Language Models (LLMs) to commonplace applications has always been a sought-after goal. Enter llm, a Rust library crafted by Rustformers and designed to make this a tangible reality. Built on the foundational GGML project, this toolset enables AI enthusiasts to harness the power of LLMs on conventional CPUs. That shift owes much to GGML's approach to model quantization, which streamlines computational requirements without sacrificing too much performance.

In this guide, we'll start with understanding the essence of the llm crate and its interaction with a variety of LLMs. We'll look at how to integrate, interact, and infer using these models. The journey won't conclude here: in the subsequent installment, we'll build a web server in Rust that runs inference directly on a CPU, making the capabilities of AI not just accessible but an integral part of everyday digital experiences.

This is a two-part article. In the first part we discuss basic interaction with the library, and in the second we build a server in Rust that lets us create web applications using state-of-the-art LLMs. Let's begin.

Harnessing the Power of Large Language Models

At the core of llm's architecture resides the GGML project, a tensor library written in C. GGML serves as the bedrock of llm, enabling the orchestration of large language models. Its essence lies in a potent technique known as model quantization.

Model quantization, a pivotal process employed by GGML, involves reducing the numerical precision within a machine-learning model. This means transforming the conventional 32-bit floating-point numbers frequently used for calculations into more compact representations such as 16-bit or even 8-bit integers.

Quantization can be thought of as chiseling away unnecessary complexity while sculpting a model: it streamlines resource utilization without inordinate compromises on performance. By default, models rely on 32-bit floating-point numbers for their arithmetic operations. With quantization, this precision is distilled into more frugal formats, such as 16-bit or 8-bit integers. It's an equilibrium between computational efficiency and model quality.

GGML's versatility shows through a spectrum of quantization strategies, spanning 4-, 5-, and 8-bit quantization. Each strategy trades off efficiency and accuracy in a different way. For instance, 4-bit quantization excels in memory and computational frugality, although it can induce a performance decrease compared to the broader 8-bit quantization. The toy example below illustrates the basic idea.
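To make the idea of quantization concrete, here is a small illustration in Python using NumPy. It uses a simplified symmetric 8-bit scheme invented purely for demonstration; it is not GGML's actual quantization format, and the weight values are made up.

```python
import numpy as np

# A tiny tensor of 32-bit weights (illustrative values only).
weights = np.array([0.03, -1.25, 0.87, 2.41, -0.56], dtype=np.float32)

# Symmetric 8-bit quantization: map the range [-max, max] onto [-127, 127].
scale = np.abs(weights).max() / 127.0
q = np.round(weights / scale).astype(np.int8)       # stored as 1 byte per value
dequantized = q.astype(np.float32) * scale          # approximate reconstruction

print("int8 weights: ", q)
print("reconstructed:", dequantized)
print("max abs error:", np.abs(weights - dequantized).max())
```

The storage drops from 4 bytes to 1 byte per weight, at the cost of a small reconstruction error; production schemes such as GGML's are more elaborate (per-block scales, 4- and 5-bit packing), but the trade-off is the same in spirit.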
The Rustformers library supports integration of different language models, including Bloom, GPT-2, GPT-J, GPT-NeoX, Llama, and MPT. To use these models within the Rustformers library, they undergo a transformation to align with GGML's format, and the authors have provided pre-converted models on the Hugging Face platform, facilitating seamless integration. In the next sections, we will use the llm crate to run inference on LLM models like Llama.

Getting Started with LLM-CLI

The Rustformers group has the mission of amplifying access to large language models at the forefront of AI evolution. The group focuses on harmonizing with the rapidly advancing GGML ecosystem, a C library used for quantization that enables the execution of LLMs on CPUs. The roadmap extends to supporting diverse backends, embracing GPUs, Wasm environments, and more.

For Rust developers venturing into the realm of LLMs, the key to unlocking this potential is the llm crate. Through this crate, Rust developers interface with LLMs effortlessly. The llm project also offers a streamlined CLI for interacting with LLMs and examples showcasing its integration into Rust projects. More insights can be gained from the GitHub repository or its official documentation for released versions.

To embark on your LLM journey, start by installing the llm-cli package. This package brings the model to your console, allowing for direct inference.

Getting started is a streamlined process:

1. Clone the repository and install the llm-cli tool from it.
2. Download your chosen models from Hugging Face. In our illustration, we employ an open Llama model.
3. Run inference on the model using the CLI tool, referencing the model file and the architecture of the model downloaded previously.

So let's start. First, install llm-cli using this command:

```
cargo install llm-cli --git https://github.com/rustformers/llm
```

Next, fetch your desired model from Hugging Face:

```
curl -LO https://huggingface.co/rustformers/open-llama-ggml/resolve/main/open_llama_3b-f16.bin
```

Finally, initiate a dialogue with the model using a command akin to:

```
llm infer -a llama -m open_llama_3b-f16.bin -p "Rust is a cool programming language because"
```

We can see how the llm crate facilitates seamless interactions with LLMs. The project empowers developers with streamlined CLI tools and exemplifies LLM integration into Rust projects. With installation and model preparation explained, the journey toward LLM proficiency can begin.

Conclusion: The Dawn of Accessible AI with Rust and LLM

In this exploration, we've delved into the Rust library llm and its potential to bring Large Language Models to the masses. No longer is the prowess of advanced AI models locked behind high-end GPU architectures. Through the relationship between the llm library and the underlying GGML tensor library, we can run language models on standard CPUs. This is made possible largely by the technique of model quantization, which GGML incorporates. By balancing computational efficiency against accuracy, models can now run in environments that were previously deemed infeasible.

The Rustformers' dedication to the cause shines through their comprehensive toolset. Their offerings extend from pre-converted models on Hugging Face, ensuring ease of integration, to a CLI tool that simplifies interaction with these models. For Rust developers, the horizon of AI integration has never seemed clearer or more accessible.

As we wrap up this segment, it's evident that the paradigm of AI integration is rapidly shifting. With tools like the llm crate, developers are equipped with everything they need to harness the full might of LLMs in their Rust projects. But the journey doesn't stop here. In the next part of this series, we venture beyond the basics and into the realm of practical application, constructing a web server in Rust that leverages the llm crate.

Author Bio

Alan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young and Globant, and now holds a data engineer position at Ebiquity Media, helping the company create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later earned a Master's degree from the Faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.

LinkedIn


Spark and LangChain for Data Analysis

Alan Bernardo Palacio
31 Aug 2023
12 min read
Introduction

In today's data-driven world, the demand for extracting insights from large datasets has led to the development of powerful tools and libraries. Apache Spark, a fast and general-purpose cluster computing system, has revolutionized big data processing. Coupled with LangChain, a library built atop modern language models, you can combine the analytical capabilities of Spark with the natural language interaction facilitated by LangChain. This article introduces Spark, explores the features of LangChain, and provides practical examples of using Spark with LangChain for data analysis.

Understanding Apache Spark

The processing and analysis of large datasets have become crucial for organizations and individuals alike. Apache Spark has emerged as a powerful framework that changes the way we handle big data. Spark is designed for speed, ease of use, and sophisticated analytics. It provides a unified platform for various data processing tasks, such as batch processing, interactive querying, machine learning, and real-time stream processing.

At its core, Apache Spark is an open-source, distributed computing system that excels at processing and analyzing large datasets in parallel. Unlike traditional MapReduce systems, Spark introduces the concept of Resilient Distributed Datasets (RDDs), which are immutable distributed collections of data. RDDs can be transformed and operated upon using a wide range of high-level APIs provided by Spark, making it possible to perform complex data manipulations with ease.

Key Components of Spark

Spark consists of several components that contribute to its versatility and efficiency:

●    Spark Core: The foundation of Spark, responsible for tasks such as task scheduling, memory management, and fault recovery. It also provides APIs for creating and manipulating RDDs.
●    Spark SQL: A module that allows Spark to work seamlessly with structured data using SQL-like queries. It enables users to interact with structured data through the familiar SQL language.
●    Spark Streaming: Enables real-time stream processing, making it possible to process and analyze data in near real time as it arrives in the system.
●    MLlib (Machine Learning Library): A scalable machine learning library built on top of Spark, offering a wide range of machine learning algorithms and tools.
●    GraphX: A graph processing library that provides abstractions for efficiently manipulating graph-structured data.
●    Spark DataFrame: A higher-level abstraction on top of RDDs, providing a structured and more optimized way to work with data. DataFrames offer optimization opportunities, enabling Spark's Catalyst optimizer to perform query optimization and code generation.

Spark's distributed computing architecture enables it to achieve high performance and scalability. It employs a master/worker architecture in which a central driver program coordinates tasks across multiple worker nodes. Data is distributed across these nodes, and tasks are executed in parallel on the distributed data.

We will be looking at two different ways of interacting with Spark: Spark SQL and the Spark DataFrame API. Apache Spark is a distributed computing framework, and Spark SQL is one of its modules for structured data processing. A Spark DataFrame is a distributed collection of data organized into named columns, offering a programming abstraction similar to data frames in R or Python but optimized for distributed processing. It provides a functional API with operations like select(), filter(), and groupBy(). Spark SQL, on the other hand, lets users run unmodified SQL queries on Spark data, integrating seamlessly with DataFrames and offering a bridge to BI tools through JDBC/ODBC.

Both the DataFrame API and Spark SQL leverage the Catalyst optimizer for efficient query execution. DataFrames are preferred for programmatic, functional-style code, while Spark SQL is ideal for ad-hoc querying and users already familiar with SQL. The choice between them often hinges on the specific use case and the user's familiarity with either SQL or functional programming; the short example below shows the same query expressed both ways.
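As a quick, hypothetical illustration of that choice, the snippet below computes the same aggregation with the DataFrame API and with Spark SQL. The people.csv file and its columns (name, age, city) are made up for the example and are not part of the tutorial's dataset.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical dataset with columns: name, age, city.
df = spark.read.csv("people.csv", header=True, inferSchema=True)

# DataFrame API: functional-style filter/groupBy/agg.
adults_by_city = (
    df.filter(F.col("age") >= 18)
      .groupBy("city")
      .agg(F.avg("age").alias("avg_age"))
)
adults_by_city.show()

# Spark SQL: the same query expressed as plain SQL over a temporary view.
df.createOrReplaceTempView("people")
spark.sql("""
    SELECT city, AVG(age) AS avg_age
    FROM people
    WHERE age >= 18
    GROUP BY city
""").show()
```

Both paths compile to the same optimized plan through Catalyst, so the choice is largely one of ergonomics.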
In the next sections, we will explore how LangChain complements Spark's capabilities by introducing natural language interactions through agents.

Introducing the Spark Agent to LangChain

LangChain, a library built upon modern Large Language Model (LLM) technologies, is a pivotal addition to the world of data analysis. It bridges the gap between the power of Spark and the ease of human language interaction.

LangChain harnesses the capabilities of advanced LLMs like ChatGPT and Hugging Face-hosted models. These language models have proven their prowess in understanding and generating human-like text. LangChain capitalizes on this potential to enable users to interact with data and code through natural language queries.

Empowering Data Analysis

The introduction of the Spark agent to LangChain brings a transformative shift in data analysis workflows. Users can tap into the analytical capabilities of Spark using everyday language. This opens doors for professionals from various domains to explore datasets, uncover insights, and derive value without deep technical expertise.

LangChain acts as a bridge, connecting the technical realm of data processing with the non-technical world of language understanding. It empowers individuals who may not be well versed in coding or data manipulation to engage with data-driven tasks effectively. This accessibility democratizes data analysis and makes it inclusive for a broader audience.

The integration of LangChain with Spark involves an orchestration of components that work together to bring human-language interaction to data analysis. At the heart of this integration lies the collaboration between ChatGPT, a sophisticated language model, and PythonREPL, a Python read-evaluate-print loop. The workflow is as follows:

1. ChatGPT receives a user query in natural language and generates a Python command as a solution.
2. The generated Python command is sent to PythonREPL for execution.
3. PythonREPL executes the command and produces a result.
4. ChatGPT takes the result from PythonREPL and translates it into a final answer in natural language.

This collaborative process can repeat multiple times, allowing users to engage in iterative conversations and deep dives into data analysis.

Several key points ensure a seamless interaction between the language model and the code execution environment:

●    Initial prompt setup: The initial prompt given to ChatGPT defines its behavior and available tooling. This prompt guides ChatGPT on the desired actions and toolkits to employ.
●    Connection between ChatGPT and PythonREPL: Predefined prompts establish the format of the answer, and regular expressions (regex) are used to extract the specific command to execute from ChatGPT's response. This establishes a clear flow of communication between ChatGPT and PythonREPL.
●    Memory and conversation history: ChatGPT does not possess a memory of past interactions. As a result, maintaining the conversation history locally and passing it with each new question is essential for keeping context and coherence in the interaction.

In the upcoming sections, we'll explore practical use cases that illustrate how this integration manifests in the real world, including interactions with Spark SQL and Spark DataFrames.

The Spark SQL Agent

In this section, we will walk through how to interact with Spark SQL using natural language, unleashing the power of Spark for querying structured data. Let's walk through a few hands-on examples to illustrate the capabilities of the integration.

Exploring data with the Spark SQL agent:
●    Querying the dataset to understand its structure and metadata.
●    Calculating statistical metrics like average age and fare.
●    Extracting specific information, such as the name of the oldest survivor.

Analyzing DataFrames with the Spark DataFrame agent:
●    Counting rows to understand the dataset size.
●    Analyzing the distribution of passengers with siblings.
●    Computing descriptive statistics like the square root of the average age.

By interacting with the agents and experimenting with natural language queries, you'll witness firsthand the fusion of advanced data processing with user-friendly language interactions. These examples demonstrate how Spark and LangChain can amplify your data analysis efforts, making insights more accessible and actionable.

Before diving into Spark SQL interactions, let's set up the necessary environment. We'll use LangChain's SparkSQLToolkit to bridge between Spark and natural language interactions. First, make sure you have your OpenAI API key ready; you'll need it to integrate the language model.

```python
from langchain.agents import create_spark_sql_agent
from langchain.agents.agent_toolkits import SparkSQLToolkit
from langchain.chat_models import ChatOpenAI
from langchain.utilities.spark_sql import SparkSQL
import os

# Set up environment variables for API keys
os.environ['OPENAI_API_KEY'] = 'your-key'
```

Now, let's get hands-on with Spark SQL. We'll work with a Titanic dataset, but you can replace it with your own data. First, create a Spark session, define a schema for the database, and load your data into a Spark DataFrame. We'll then create a table in Spark SQL to enable querying.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

schema = "langchain_example"
spark.sql(f"CREATE DATABASE IF NOT EXISTS {schema}")
spark.sql(f"USE {schema}")

csv_file_path = "titanic.csv"
table = "titanic"
spark.read.csv(csv_file_path, header=True, inferSchema=True).write.saveAsTable(table)
spark.table(table).show()
```

Now, let's initialize the Spark SQL agent. This agent acts as your interactive companion, enabling you to query Spark SQL tables using natural language. We'll create a toolkit that connects LangChain, the SparkSQL instance, and the chosen language model (in this case, ChatOpenAI).

```python
from langchain.agents import AgentType

spark_sql = SparkSQL(schema=schema)
llm = ChatOpenAI(temperature=0, model="gpt-4-0613")
toolkit = SparkSQLToolkit(db=spark_sql, llm=llm, handle_parsing_errors="Check your output and make sure it conforms!")

agent_executor = create_spark_sql_agent(
    llm=llm,
    toolkit=toolkit,
    agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_parsing_errors=True)
```

Now comes the exciting part: querying Spark SQL tables using natural language. With your Spark SQL agent ready, you can ask questions about your data and receive insightful answers. Let's try a few examples:

```python
# Describe the Titanic table
agent_executor.run("Describe the titanic table")

# Calculate the square root of the average age
agent_executor.run("whats the square root of the average age?")

# Find the name of the oldest survived passenger
agent_executor.run("What's the name of the oldest survived passenger?")
```

With these simple commands, you've tapped into the power of Spark SQL using natural language. The Spark SQL agent makes data exploration and querying more intuitive and accessible than ever before.

The Spark DataFrame Agent

In this section, we'll dive into another facet of LangChain's integration with Spark: the Spark DataFrame agent. This agent leverages the power of Spark DataFrames and natural language interactions to provide an engaging and insightful way to analyze data.

Before we begin, make sure you have a Spark session set up and your data loaded into a DataFrame. For this example, we'll use the Titanic dataset. Replace csv_file_path with the path to your own data if needed.

```python
from langchain.llms import OpenAI
from pyspark.sql import SparkSession
from langchain.agents import create_spark_dataframe_agent

spark = SparkSession.builder.getOrCreate()
csv_file_path = "titanic.csv"
df = spark.read.csv(csv_file_path, header=True, inferSchema=True)
df.show()
```

Initializing the Spark DataFrame Agent

Now, let's unleash the power of the Spark DataFrame agent. This agent allows you to interact with Spark DataFrames using natural language queries. We initialize it by specifying the language model and the DataFrame we want to work with.

```python
# Initialize the Spark DataFrame agent
agent = create_spark_dataframe_agent(llm=OpenAI(temperature=0), df=df, verbose=True)
```

With the agent ready, you can explore your data using natural language queries. Let's dive into a few examples:

```python
# Count the number of rows in the DataFrame
agent.run("how many rows are there?")

# Find the number of people with more than 3 siblings
agent.run("how many people have more than 3 siblings")

# Calculate the square root of the average age
agent.run("whats the square root of the average age?")
```

Remember that the Spark DataFrame agent uses generated Python code under the hood to interact with Spark. While it's a powerful tool for interactive analysis, make sure the generated code is safe to execute, especially in a sensitive environment.

In this final section, we've tied everything together and shown how Spark and LangChain work in harmony to unlock insights from data. We've covered the Spark SQL agent and the Spark DataFrame agent, putting theory into practice.

The combination of Spark and LangChain transcends the traditional boundaries of technical expertise, enabling data enthusiasts of all backgrounds to engage with data-driven tasks effectively. Through the Spark SQL agent and Spark DataFrame agent, LangChain empowers users to interact with, explore, and analyze data using the simplicity and familiarity of natural language. So why wait? Dive in and unlock the full potential of your data analysis journey with the synergy of Spark and LangChain.

Conclusion

In this article, we've delved into the world of Apache Spark and LangChain, two technologies that combine to transform how we interact with and analyze data. By bridging the gap between technical data processing and human language understanding, Spark and LangChain enable users to derive meaningful insights from complex datasets through simple, natural language queries. The Spark SQL agent and Spark DataFrame agent presented here demonstrate the potential of this integration, making data analysis more accessible to a wider audience. As both technologies continue to evolve, we can expect even more powerful capabilities for unlocking the true potential of data-driven decision-making. So whether you're a data scientist, an analyst, or a curious learner, harnessing the power of Spark and LangChain opens up a world of possibilities for exploring and understanding data in an intuitive and efficient manner.

Author Bio

Alan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young and Globant, and now holds a data engineer position at Ebiquity Media, helping the company create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later earned a Master's degree from the Faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.

LinkedIn


Unleashing the Power of Wolfram Alpha API with Python and ChatGPT

Alan Bernardo Palacio
31 Aug 2023
6 min read
Introduction

In the ever-evolving landscape of artificial intelligence, a noteworthy collaboration has emerged between Wolfram Alpha and ChatGPT in the form of a plugin. This partnership bridges the gap between ChatGPT's proficiency in natural language processing and Wolfram Alpha's computational prowess, unlocking an array of new possibilities for how we interact with AI. In this hands-on tutorial, we explore the power of the Wolfram Alpha API, demonstrate its integration with Python and ChatGPT, and show how to tap into this combination for tasks ranging from complex calculations to real-time data retrieval.

Understanding the Wolfram Alpha API

Imagine having an intelligent assistant at your fingertips, capable not only of understanding your questions but also of providing detailed computational insights. That's where Wolfram Alpha shines. It's more than a search engine; it's a computational knowledge engine. Whether you need to solve a math problem, retrieve real-time data, or generate visual content, Wolfram Alpha has you covered. Its ability to compute answers from structured data sets it apart from traditional search engines.

So, how can you tap into this trove of computational knowledge? Enter the Wolfram Alpha API. This API exposes Wolfram Alpha's capabilities for developers to harness in their applications. Whether you're building a chatbot, a data analysis tool, or an educational resource, the Wolfram Alpha API provides instant access to accurate and in-depth information. The API supports a wide range of queries, from straightforward calculations to complex data retrievals, making it a versatile tool for various use cases; a small example of calling it directly from Python follows below.
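Before wiring Wolfram Alpha into ChatGPT, you can also query the API directly. The sketch below uses the wolframalpha Python package (which is also installed as part of the setup later in this tutorial) and assumes you have created an App ID in the Wolfram developer portal; this direct-client usage is an illustrative aside, not code from the original walkthrough.

```python
# Minimal sketch of querying the Wolfram Alpha API directly with the
# `wolframalpha` package (pip install wolframalpha). Assumes the
# WOLFRAM_ALPHA_APPID environment variable holds a valid App ID.
import os
import wolframalpha

client = wolframalpha.Client(os.environ["WOLFRAM_ALPHA_APPID"])
res = client.query("integrate x^2 from 0 to 5")

# Print the plain-text answer from the first result pod.
print(next(res.results).text)
```

This is essentially what the LangChain wolfram-alpha tool does on your behalf once the agent decides a question needs computation.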
Integrating the Wolfram Alpha API with ChatGPT

ChatGPT's strength lies in its ability to understand and generate human-like text based on input. However, when it comes to intricate calculations or pulling real-time data, it benefits from a partner like Wolfram Alpha. By integrating the two, you create a dynamic synergy in which ChatGPT can tap into Wolfram Alpha's computational engine to provide accurate and data-driven responses. This collaboration bridges the gap between language understanding and computation, resulting in a well-rounded AI interaction.

Before we dive into the technical implementation, let's get set up to take advantage of the Wolfram Alpha plugin for ChatGPT. First, ensure you have access to ChatGPT+. To enable the Wolfram plugin, follow these steps:

1. Open the ChatGPT interface.
2. Navigate to "Settings".
3. Look for the "Beta Features" section.
4. Enable "Plugins" under the GPT-4 options.
5. Once "Plugins" is enabled, locate and activate the Wolfram plugin.

With the plugin enabled, you're ready to harness the combined capabilities of ChatGPT and the Wolfram Alpha API, making your AI interactions more robust and informative. In the next sections, we'll dive into practical applications and walk through implementing the integration using Python and ChatGPT.

Practical Applications with Code Examples

Let's start by exploring how the Wolfram Alpha API can assist with mathematical tasks. The code examples below demonstrate the integration between ChatGPT and Wolfram Alpha to solve math problems. In these scenarios, ChatGPT serves as the bridge between you and Wolfram Alpha, delivering accurate solutions.

Before diving into the code, make sure your environment is ready by installing the required Python packages:

```
pip install langchain openai wolframalpha
```

The following snippets use the agent_chain object, which is initialized in the complete setup block at the end of this section. Wolfram Alpha can solve simple algebraic queries:

```python
# User input
question = "Solve for x: 2x + 5 = 15"

# Let ChatGPT interact with Wolfram Alpha
response = agent_chain.run(input=question)

# Display the result
print("Solution:", response)
```

Or more complex ones, like calculating integrals:

```python
# User input
question = "Calculate the integral of x^2 from 0 to 5"

# Let ChatGPT interact with Wolfram Alpha
response = agent_chain.run(input=question)

# Display the result
print("Integral:", response)
```

Real-Time Data Retrieval

Incorporating real-time data into conversations can greatly enhance the value of AI interactions. The examples below show how to retrieve up-to-date information using the Serper API and integrate it seamlessly into the conversation:

```python
# User input
question = "What's the current exchange rate between USD and EUR?"

# Let the agent answer using its tools
response = agent_chain.run(input=question)

# Display the result
print("Exchange Rate:", response)
```

We can also ask for the current weather forecast:

```python
# User input
question = "What's the weather forecast for London tomorrow?"

# Let the agent answer using its tools
response = agent_chain.run(input=question)

# Display the result
print("Weather Forecast:", response)
```

Now we can put everything together into a single block, including all the required library imports, and use both real-time data via Serper and the reasoning skills of Wolfram Alpha:

```python
# Import required libraries
from langchain.agents import load_tools, initialize_agent
from langchain.llms import OpenAI
from langchain.memory import ConversationBufferMemory
from langchain.chat_models import ChatOpenAI

# Set environment variables
import os
os.environ['OPENAI_API_KEY'] = 'your-key'
os.environ['WOLFRAM_ALPHA_APPID'] = 'your-key'
os.environ["SERPER_API_KEY"] = 'your-key'

# Initialize the ChatGPT model
llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo")

# Load tools and set up memory
tools = load_tools(["google-serper", "wolfram-alpha"], llm=llm)
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

# Initialize the agent
agent_chain = initialize_agent(tools, llm, handle_parsing_errors=True, verbose=True, memory=memory)

# Interact with the agent
response_weather = agent_chain.run(input="what is the weather in Amsterdam right now in celcius? Don't make assumptions.")
response_flight = agent_chain.run(input="What's a good price for a flight from JFK to AMS this weekend? Express the price in Euros. Don't make assumptions.")
```

Conclusion

In this tutorial, we've delved into integrating the Wolfram Alpha API with Python and ChatGPT. We've explored how this collaboration empowers you to tackle complex mathematical tasks and retrieve real-time data seamlessly. By harnessing the capabilities of both Wolfram Alpha and ChatGPT, you've unlocked a powerful synergy capable of transforming your AI interactions. As you continue to explore and experiment with this integration, you'll discover new ways to enhance your interactions and leverage the strengths of each tool. So, why wait? Start your journey toward more informative and engaging AI interactions today.

Author Bio

Alan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young and Globant, and now holds a data engineer position at Ebiquity Media, helping the company create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later earned a Master's degree from the Faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.

LinkedIn


Transformer Building Blocks

Saeed Dehqan
29 Aug 2023
22 min read
Introduction

Transformers employ potent techniques to preprocess tokens before sequentially feeding them into a neural network that selects the next token. At the transformer's apex sits a basic neural network, the transformer head. The text generator model processes input tokens and generates a probability distribution for the subsequent token. The number of input tokens, termed context length or block size, is a hyperparameter. The model's primary aim is to predict the next token based on the input tokens (referred to as context tokens or the context window): given n tokens, we want to predict the token that best follows them.

As humans, we try to grasp a conversation's context: where we are and a loose foresight of where it is heading. Having gathered the pertinent insights, relevant words come to mind while irrelevant ones fade, enabling us to choose the next word with precision. We occasionally err but can backtrack, a luxury transformers lack. If they predict incorrectly (an irrelevant token), they keep going, with a few exceptions such as beam search. Unlike us, transformers cannot look ahead and revise.

Revisiting the n prior tokens, our human assessment involves inspecting them individually and discerning relationships from diverse angles. By prioritizing pivotal tokens and disregarding superfluous ones, we evaluate tokens within various contexts. This embodies the essence of the multi-head attention mechanism in transformers. Consider a context window with 5 tokens; each wears a distinct mask and predicts its respective next token.

"To discern the void amidst, we must first grasp the fullness within." To understand which token is lacking, we must first know what we are and what we possess. We need communication between tokens: tokens don't know each other yet, and in order to predict their own next token, they first need to know each other well and pair up so that tokens with similar characteristics stay near each other (technically, have similar vectors). Each token has three vectors that represent:

●    What tokens they are looking for (known as the query)
●    What they really have (known as the key)
●    What they are (known as the value)

Each token uses its query to look for similar keys; tokens find one another and get to know each other by adding up their values. Similar tokens find each other, and if a token is somehow dissimilar, the other tokens don't consider it much. Note, though, that every token has some effect, large or small, on the others. Also, in self-attention all tokens ask all other tokens, via their queries and keys, to find familiar tokens, but never the future tokens; this is called masked self-attention. We prohibit tokens from communicating with future tokens.

After exchanging information between tokens and mixing up their values, similar tokens become more similar: their vectors become more alike. Since the tokens in the group wear masks, we cannot access the true tokens' values; we only know and distinguish them by their masks (values). This is because every token has different characteristics in different contexts, and they don't show their true essence.

So far so good; we have finished the self-attention process, and now the group is ready to predict its next tokens. Because the tokens are now well aware of each other, they can guess the next token better. Each token separately goes to a nonlinear network and then to the transformer head to predict its own next token. We ask each token separately for its opinion about the probability of what comes next, and we collect the probability distributions of all tokens in the context window. A probability distribution sums to 1 and assigns a probability to every token in the model's vocabulary. The simplest method to extract the next token from a probability distribution is to select the one with the highest probability. Each token goes to the neural network and the network returns a probability distribution; for our 5-token example, picking the most likely continuation at each position might produce the sentence "It looks like a bug". Voila! We walked through a simple transformer model, at least intuitively. A tiny PyTorch sketch of this selection step follows below.
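To make that last step concrete, here is a small, hypothetical PyTorch sketch of turning a model's output scores (logits) into a next-token choice, both greedily and by sampling; the logits tensor is made up for illustration and is not produced by the model built later in this article.

```python
import torch
import torch.nn.functional as F

# Hypothetical logits for one position over a 65-character vocabulary,
# like the character-level model below would produce.
logits = torch.randn(65)

# Softmax turns raw scores into a probability distribution that sums to 1.
probs = F.softmax(logits, dim=-1)

# Greedy decoding: pick the single most likely token.
next_token_greedy = torch.argmax(probs).item()

# Sampling: draw a token at random according to the distribution,
# which gives more varied generations.
next_token_sampled = torch.multinomial(probs, num_samples=1).item()

print(next_token_greedy, next_token_sampled)
```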
Let's recap. A transformer receives n tokens as input, applies some operations (such as self-attention and layer normalization), and feeds them into a neural network to get probability distributions over the next token. Each token goes through the neural network separately; if the number of tokens is 10, there are 10 probability distributions.

At this point, you know intuitively how the main building blocks of a transformer work. Now let's understand them better by implementing a transformer model.

Clone the tiny-transformer repository:

```
git clone https://github.com/saeeddhqan/tiny-transformer
```

Execute simple_model.py in the repository if you simply want to run the model for training.

Create a new file and import the necessary modules:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F
```

Load the dataset and write the tokenizer:

```python
with open('shakespeare.txt') as fp:
    text = fp.read()

chars = sorted(list(set(text)))
vocab_size = len(chars)
stoi = {c: i for i, c in enumerate(chars)}
itos = {i: c for c, i in stoi.items()}
encode = lambda s: [stoi[x] for x in s]
decode = lambda e: ''.join([itos[x] for x in e])
```

●    Open the dataset and define a variable that holds all unique characters in the text.
●    The set function splits the text character by character and removes duplicates, just like sets in set theory; list(set(myvar)) is a way of removing duplicates in a list or string.
●    vocab_size is the number of unique characters (here, 65).
●    stoi is a dictionary whose keys are characters and whose values are their indices.
●    itos is used to convert indices back to characters.
●    The encode function receives a string and returns the indices of its characters.
●    decode receives a list of indices and returns a string, as the quick round-trip check below demonstrates.
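As a quick sanity check of the tokenizer, the round trip below shows what encode and decode do. The exact index values depend on the character set of the Shakespeare file, so the numbers in the comment are only illustrative.

```python
# Character-level round trip with the encode/decode helpers defined above.
sample = "First Citizen"
ids = encode(sample)        # e.g. [18, 47, 56, ...]; actual values depend on the dataset
restored = decode(ids)

print(ids)
assert restored == sample   # decoding the encoded indices recovers the original string
```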
Split the dataset into train and test sets and write a function that returns batches for training:

```python
device = 'cuda' if torch.cuda.is_available() else 'cpu'
torch.manual_seed(1234)

data = torch.tensor(encode(text), dtype=torch.long).to(device)
train_split = int(0.9 * len(data))
train_data = data[:train_split]
test_data = data[train_split:]

def get_batch(split='train', block_size=16, batch_size=1) -> 'Create a random batch and returns batch along with targets':
    data = train_data if split == 'train' else test_data
    ix = torch.randint(len(data) - block_size, (batch_size,))
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i+1:i + block_size + 1] for i in ix])
    return x, y
```

●    Choose a suitable device.
●    Set a seed to make the training reproducible.
●    Convert the text into a large list of indices with the encode function.
●    Since the character indices are integers, we use the torch.long data type to make the data suitable for the model.
●    90% of the data is used for training and 10% for testing.
●    If batch_size is 10, we select 10 chunks (sequences) from the dataset and stack them to process them simultaneously.
●    If batch_size is 1, get_batch selects 1 random chunk (n consecutive characters) from the dataset and returns x and y, where x holds 16 character indices and y holds the target characters for x.

The shape, value, and decoded version of a selected chunk look as follows:

```
shape x: torch.Size([1, 16])
shape y: torch.Size([1, 16])
value x: tensor([[41, 43,  6,  1, 60, 47, 50, 50, 39, 47, 52,  2,  1, 52, 43, 60]])
value y: tensor([[43,  6,  1, 60, 47, 50, 50, 39, 47, 52,  2,  1, 52, 43, 60, 43]])
decoded x: ce, villain! nev
decoded y: e, villain! neve
```

We usually process multiple chunks or sequences at once with batching in order to speed up training. For each character, we have an equivalent target, which is its next token. The target for 'c' is 'e', for 'e' is ',', for 'v' is 'i', and so on.

Let's talk a bit about the input and output shapes of tensors in a transformer model. The model receives a list of token indices like the one above (a sequence, or chunk) and maps them to their corresponding vectors.

●    The input shape is (batch_size, block_size).
●    After mapping indices to vectors, the data shape becomes (batch_size, block_size, embed_size).
●    Through the multi-head attention and feed-forward layers, the data shape does not change.
●    Finally, the data with shape (batch_size, block_size, embed_size) goes to the transformer head (a simple neural network) and the output shape becomes (batch_size, block_size, vocab_size). vocab_size is the number of tokens that can come next (for the Shakespeare dataset, 65 unique characters).
Self-attention

The communication between tokens happens in the head class. We define a scores variable to hold the similarity between vectors: the higher the score, the more two vectors have in common. We then use these scores to perform a weighted sum of all the value vectors:

```python
class head(nn.Module):
    def __init__(self, embeds_size=32, block_size=16, head_size=8):
        super().__init__()
        self.key = nn.Linear(embeds_size, head_size, bias=False)
        self.query = nn.Linear(embeds_size, head_size, bias=False)
        self.value = nn.Linear(embeds_size, head_size, bias=False)
        self.register_buffer('tril', torch.tril(torch.ones(block_size, block_size)))
        self.dropout = nn.Dropout(0.1)

    def forward(self, x):
        B, T, C = x.shape
        # What am I looking for?
        q = self.query(x)
        # What do I have?
        k = self.key(x)
        # What is the representation value of me?
        # Or: what's my personality in the group?
        # Or: what mask do I have when I'm in a group?
        v = self.value(x)
        scores = q @ k.transpose(-2, -1) * (1 / math.sqrt(C))  # (B,T,head_size) @ (B,head_size,T) --> (B,T,T)
        scores = scores.masked_fill(self.tril[:T, :T] == 0, float('-inf'))
        scores = F.softmax(scores, dim=-1)
        scores = self.dropout(scores)
        out = scores @ v
        return out
```

●    Q, K, V: We use three linear layers to project each vector into a query, key, and value with a smaller dimension (here, head_size). Q and K are used to find similar tokens: we calculate the similarity between vectors with a dot product, q @ k.transpose(-2, -1). The shape of scores is (batch_size, block_size, block_size), meaning we have similarity scores between all vectors in the block. V is used for the weighted sum.
●    Scores: Raw dot-product scores tend to have large magnitudes that are not suitable for softmax, because they make the resulting distribution overly peaked. We therefore rescale the results by (1 / math.sqrt(C)), where C is the embedding size. This is called the scaled dot product.
●    register_buffer: We use register_buffer to register a lower-triangular tensor. This way, when you save and load the model, this tensor also becomes part of it.
●    Masking: After calculating the scores, we replace the future scores with -inf to shut them off, so the vectors do not have access to future tokens. Those scores effectively become zero after applying softmax, resulting in a probability of zero for the future tokens. This process is referred to as masking. Here's an example of masked scores with a block size of 4:

```
[[-0.1710,    -inf,    -inf,    -inf],
 [ 0.2007, -0.0878,    -inf,    -inf],
 [-0.0405,  0.2913,  0.0445,    -inf],
 [ 0.1328, -0.2244,  0.0796,  0.1719]]
```

●    Softmax: Converts a vector into a probability distribution that sums to 1. Here are the scores after softmax:

```
[[1.0000, 0.0000, 0.0000, 0.0000],
 [0.5716, 0.4284, 0.0000, 0.0000],
 [0.2872, 0.4002, 0.3127, 0.0000],
 [0.2712, 0.1897, 0.2571, 0.2820]]
```

The scores for future tokens are zero; after the weighted sum, the future vectors contribute nothing, and each vector receives no data from future vectors (n * 0 = 0).

●    Dropout: Dropout is a regularization technique that randomly zeroes some of the numbers in the vectors. It helps the model generalize rather than memorize the dataset: we don't want the model to memorize the Shakespeare text, we want it to create new text that resembles it.
●    Weighted sum: The weighted sum combines the different representations based on their importance. The relevance scores are obtained by applying a scaled dot product between the query and key vectors, which are learned during training. The resulting weighted sum emphasizes the more important elements and reduces the influence of less relevant ones, allowing the model to focus on the most salient information. We dot-product the scores with the values, and the result is the outcome of self-attention.
●    Output: Since the embedding size and head size are 32 and 8 respectively, an input of shape (batch_size, block_size, 32) produces an output of shape (batch_size, block_size, 8); the small sketch below checks this with a dummy tensor.
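As a quick check of those shapes, this hypothetical snippet pushes a dummy batch through the head module defined above, using its defaults (embeds_size=32, block_size=16, head_size=8); the batch size of 4 is arbitrary.

```python
# Shape check for the self-attention head defined above.
h = head()                   # defaults: embeds_size=32, block_size=16, head_size=8
x = torch.randn(4, 16, 32)   # (batch_size, block_size, embed_size)
out = h(x)

print(out.shape)             # expected: torch.Size([4, 16, 8])
```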
So say the vectors. We transform the vectors into smaller dimensions and then run self-attention on them; we did this in the previous class. In multihead self-attention, we call the head class four times and then concatenate the smaller vectors to recover the original input shape. We call this multihead self-attention. For instance, if the shape of the input data is (1, 16, 32), we transform it into four (1, 16, 8) tensors and run self-attention on each of them. Why four times? Because 4 * 8 equals the initial embedding size. By running self-attention several times in parallel, we let the model consider different aspects of the vectors in different subspaces. That's all!
Here is the code:
class multihead(nn.Module):
    def __init__(self, num_heads=4, head_size=8):
        super().__init__()
        # Four independent heads, each attending in its own smaller subspace
        self.multihead = nn.ModuleList([head(head_size=head_size) for _ in range(num_heads)])
        self.output_linear = nn.Linear(embeds_size, embeds_size)
        self.dropout = nn.Dropout(0.1)

    def forward(self, hidden_state):
        # Run every head and concatenate the results back to the embedding size
        hidden_state = torch.cat([head(hidden_state) for head in self.multihead], dim=-1)
        hidden_state = self.output_linear(hidden_state)
        hidden_state = self.dropout(hidden_state)
        return hidden_state
●    self.multihead: Creates the four heads inside an nn.ModuleList. Note that head_size is passed by keyword so that it does not overwrite the embeds_size argument of the head class.
●    self.output_linear: Another linear layer that we apply at the end of the multihead self-attention process.
●    self.dropout: Applies dropout to the final result.
●    hidden_state 1: Concatenates the output of the heads so that we get back the same shape as the input. Each head transforms the data into a different, smaller subspace and performs self-attention there.
●    hidden_state 2: After the tokens have communicated through self-attention, the self.output_linear projection lets the model adjust the vectors further based on the gradients that flow through the layer.
●    dropout: Runs dropout on the output of the projection, with a 10% probability of turning off values (making them zero) in the vectors.
Transformer block
There are two new techniques, layer normalization and residual connections, that need to be explained:
class transformer_block(nn.Module):
    def __init__(self, embeds_size=32, num_heads=8):
        super().__init__()
        self.head_count = embeds_size // num_heads
        self.n_heads = multihead(num_heads, self.head_count)
        self.ffn = nn.Sequential(
            nn.Linear(embeds_size, 4 * embeds_size),
            nn.ReLU(),
            nn.Linear(4 * embeds_size, embeds_size),
            nn.Dropout(drop_prob),
        )
        self.ln1 = nn.LayerNorm(embeds_size)
        self.ln2 = nn.LayerNorm(embeds_size)

    def forward(self, hidden_state):
        # Pre-norm residual connections around attention and the feed-forward network
        hidden_state = hidden_state + self.n_heads(self.ln1(hidden_state))
        hidden_state = hidden_state + self.ffn(self.ln2(hidden_state))
        return hidden_state
self.head_count: Calculates the head size. The embedding size should be divisible by the number of heads so that we can concatenate the outputs of the heads.
self.n_heads: The multihead self-attention layer.
self.ffn: This is the first time we introduce non-linearity into the model. Non-linearity helps the model capture complex relationships and patterns in the data. By introducing non-linearity through an activation function such as ReLU or GELU, the model can better represent correlations in the data and the intricacies of the input. You can think of non-linearity as gating: "pass this on to the next layer", "do not pass this on", or "create y from x for the next layer". The recommended hidden layer size is four times the embedding size.
That’s why “4 * embeds_size”. You can also try SwiGLU as the activation function instead of ReLU.self.ln1 and self.ln2: Layer normalizers make the model more robust and they also help the model to converge faster. Layer normalization rescales the data in such a way that the mean is zero and the standard deviation is one. hidden_state 1: Normalize the vectors with self.ln1 and forward the vectors to the multihead attention. Next, we add the input to the output of multihead attention. It helps the model in two ways:○    First, the model has some information from the original vectors. ○    Second, when the model becomes deep, during backpropagation, the gradients will be weak for earlier layers and the model will converge too slowly. We recognize this effect as gradient vanishing. Adding the input helps to enrich the gradients and mitigate the gradient vanishing. We recognize it as a residual connection.hidden_state 2: Hidden_state 1 goes to a layer normalization and then to a nonlinear network. The output will be added to the hidden state with the aim of keeping gradients for all layers.The modelAll the necessary parts are ready, let us stack them up to make the full model:class transformer(nn.Module): def __init__(self):     super().__init__()     self.stack = nn.ModuleDict(dict(         tok_embs=nn.Embedding(vocab_size, embeds_size),         pos_embs=nn.Embedding(block_size, embeds_size),         dropout=nn.Dropout(drop_prob),         blocks=nn.Sequential(             transformer_block(),             transformer_block(),             transformer_block(),             transformer_block(),             transformer_block(),         ),         ln=nn.LayerNorm(embeds_size),         lm_head=nn.Linear(embeds_size, vocab_size),     ))●    self.stack: A list of all necessary layers.●    tok_embs: This is a learnable lookup table that receives a list of indices and returns their vectors.●    pos_embs: Just like tok_embs, it is also a learnable look-up table, but for positional embedding. It receives a list of positions and returns their vectors.●    dropout: Dropout layer.●    blocks: We create multiple transformer blocks sequentially.●    ln: A layer normalization.●    lm_heas: Transformer head receives a token and returns probabilities of the next token. To change the model to be a classifier, or a sentimental analysis model, we just need to change this layer and remove masking from the self-attention layer.The forward method of the transformer class:    def forward(self, seq, targets=None):     B, T = seq.shape     tok_emb = self.stack.tok_embs(seq) # (batch, block_size, embed_dim) (B,T,C)     pos_emb = self.stack.pos_embs(torch.arange(T, device=device))     x = tok_emb + pos_emb     x = self.stack.dropout(x)     x = self.stack.blocks(x)     x = self.stack.ln(x)     logits = self.stack.lm_head(x) # (B, block_size, vocab_size)     if targets is None:         loss = None     else:         B, T, C = logits.shape         logits = logits.view(B * T, C)         targets = targets.view(B * T)         loss = F.cross_entropy(logits, targets)     return logits, loss●  tok_emb: Convert token indices into vectors. Given the input (B, T), the output is (B, T, C), where C is the embeds_size.●  pos_emb: Given the number of tokens in the context window or block_size, it returns the positional embedding of each position.●  x 1: Add up token embeddings and position embeddings. 
A little bit lossy but it works just fine.●  x 2: Run dropout on embeddings.●  x 3: The embeddings go through all the transformer blocks, and multihead self-attention. The input is (B, T, C) and the output is (B, T, C).●  x 4: The outcome of transformer blocks goes to the layer normalization.●  logits: We usually recognize the unnormalized values extracted from the language model head as logits :)●  if-else block: Were the targets specified, we calculated cross-entropy loss. Otherwise, the loss will be None. Before calculating loss in the else block, we change the shape as the cross entropy function expects.●  Output: The method returns logits with shape (batch_size, block_size, vocab_size) and loss if any.For generating a text, add this to the transformer class:    def autocomplete(self, seq, _len=10):        for _ in range(_len):            seq_crop = seq[:, -block_size:] # crop it            logits, _ = self(seq_crop)            logits = logits[:, -1, :] # we care about the last token            probs = F.softmax(logits, dim=-1)            next_char = torch.multinomial(probs, num_samples=1)            seq = torch.cat((seq, next_char), dim=1)        return seq●  autocomplete: Given a tokenized sequence, and the number of tokens that need to be created, this method returns _len tokens.●  seq_crop: Select the last n tokens in the sequence to give it to the model. The sequence length might be larger than the block_size and it causes an error if we don’t crop it.●  logits 1: Forward the sequence into the model to receive the logits.●  logits 2: Select the last logit that will be used to select the next token.●  probs: Run the softmax on logits to get a probability distribution.●  next_char: Multinomial selects one sample from the probs. The higher the probability of a token, the higher the chance of being selected.●  seq: Add the selected character to the sequence.TrainingThe rest of the code is downstream tasks such as training loops, etc. The codes that are provided here are slightly different from the tiny-transformer repository. I trained the model with the following hyperparameters:block_size = 256 learning_rate = 9e-4 eval_interval = 300 # Every n step, we do an evaluation. iterations = 5000 # Like epochs batch_size = 64 embeds_size = 195 num_heads = 5 num_layers = 5 drop_prob = 0.15And here’s the generated text:If you need to improve the quality, increase embeds_size, num_layers, and heads.ConclusionThe article explores transformers' text generation role, detailing token preprocessing through self-attention and neural network heads. Transformers predict tokens using context length as a hyperparameter. Human context comprehension is paralleled, highlighting relevant word emergence and fading of irrelevant words for precise selection. Transformers lack human foresight and backtracking. Key components—self-attention, multihead self-attention, and transformer blocks—are explained, and supported by code snippets. Token and positional embeddings, layer normalization, and residual connections are detailed. The model's text generation is exemplified via the autocomplete method. Training parameters and text quality enhancement are addressed, showcasing transformers' potential.Author BioSaeed Dehqan trains language models from scratch. Currently, his work is centered around Language Models for text generation, and he possesses a strong understanding of the underlying concepts of neural networks. 
He is proficient in using optimizers such as genetic algorithms to fine-tune network hyperparameters and has experience with neural architecture search (NAS) by using reinforcement learning (RL). He implements models starting from data gathering to monitoring, and deployment on mobile, web, cloud, etc. 

Exploring Token Generation Strategies

Saeed Dehqan
28 Aug 2023
8 min read
Introduction
This article discusses different methods for generating sequences of tokens with language models, focusing on how to select the next token from the model's predicted probability distribution.
Language models predict the next token based on the n previous tokens, extracting as much information from them as they can. Transformer models aggregate information from all n previous tokens: tokens in a sequence communicate with one another and exchange their information. At the end of this communication process, the tokens are context-aware, and we use each one to predict its own next token. Each token separately goes through some linear/non-linear layers, and the output is unnormalized logits. We then apply softmax to the logits to convert them into probability distributions, so each token has its own probability distribution over its next token:
Exploring Methods for Token Selection
Once we have the probability distribution, it is time to pick one token as the next token. There are four methods for selecting a suitable token from the probability distribution:
●    Greedy or naive method: Simply select the token with the highest probability from the list. This is a deterministic method.
●    Beam search: It receives a parameter named beam size and, based on it, the algorithm runs the model multiple times to find a suitable sentence, not just a single token. This is a deterministic method.
●    Top-k sampling: Select the top k most probable tokens, shut off the other tokens (set their probability to -inf), and sample from those k tokens. This is a sampling method.
●    Nucleus sampling: Like top-k sampling, it keeps only the most probable tokens and shuts off the rest, but the number of kept tokens is chosen dynamically rather than being a fixed k.
Greedy method
This is a simple and fast method that needs only one prediction: just select the most probable token as the next token. Greedy decoding can be efficient on arithmetic tasks, but it tends to get stuck in a loop and repeat tokens one after another. It also kills the diversity of the model by favoring the tokens that occur most frequently in the training dataset.
Here's the code that converts unnormalized logits (simply the output of the network) into a probability distribution and selects the most probable next token:
probs = F.softmax(logits, dim=-1)
next_token = probs.argmax()
Beam search
Beam search produces better results but is slower, because it runs the model multiple times to create n sequences, where n is the beam size. The method selects the top n tokens, appends them to the current sequence, and runs the model on the resulting sequences to predict the next token. This process continues until the end of the sequence. It is computationally expensive, but the quality is higher. Based on this search, the algorithm returns two sequences:
Then, how do we select the final sequence? We sum up the loss for all predictions and select the sequence with the lowest loss.
Simple sampling
We can select tokens randomly based on their probability: the higher the probability, the higher the chance of being selected. We can achieve this with the multinomial method:
logits = logits[:, -1, :]
probs = F.softmax(logits, dim=-1)
next_idx = torch.multinomial(probs, num_samples=1)
This is part of the model we implemented in the "transformer building blocks" blog and the code can be found here.
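To see how that three-line snippet fits into a complete generation loop, here is a minimal sketch. It assumes a model that maps a (batch, sequence) tensor of token indices to logits of shape (batch, sequence, vocab_size) and a block_size context window, mirroring the autocomplete method from the previous article; the exact interface is an assumption, not code from this article:
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(model, seq, max_new_tokens=50, block_size=16):
    # seq: (batch, time) tensor of token indices
    for _ in range(max_new_tokens):
        seq_crop = seq[:, -block_size:]                     # keep only the last block_size tokens
        logits = model(seq_crop)                            # (batch, time, vocab_size) -- assumed interface
        logits = logits[:, -1, :]                           # only the last position matters for the next token
        probs = F.softmax(logits, dim=-1)                   # logits -> probability distribution
        next_idx = torch.multinomial(probs, num_samples=1)  # sample one token per sequence
        seq = torch.cat((seq, next_idx), dim=1)             # append and continue autoregressively
    return seq
Replacing torch.multinomial with probs.argmax(dim=-1, keepdim=True) turns the same loop into the greedy method described above.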
The torch.multinomial receives the probability distribution and selects n samples. Here’s an example:In [1]: import torch In [2]: probs = torch.tensor([0.3, 0.6, 0.1]) In [3]: torch.multinomial(probs, num_samples=1) Out[3]: tensor([1]) In [4]: torch.multinomial(probs, num_samples=1) Out[4]: tensor([0]) In [5]: torch.multinomial(probs, num_samples=1) Out[5]: tensor([1]) In [6]: torch.multinomial(probs, num_samples=1) Out[6]: tensor([0]) In [7]: torch.multinomial(probs, num_samples=1) Out[7]: tensor([1]) In [8]: torch.multinomial(probs, num_samples=1) Out[8]: tensor([1])We ran the method six times on probs, and as you can see it selects 0.6 four times and 0.3 two times because 0.6 is higher than 0.3.Top-k samplingIf we want to make the previous sampling method better, we need to limit the sampling space. Top-k sampling does this. K is a parameter that Top-k sampling uses to select top k tokens from the probability distribution and sample from these k tokens. Here is an example of top-k sampling:In [1]: import torch In [2]: logit = torch.randn(10) In [3]: logit Out[3]: tensor([-1.1147, 0.5769, 0.3831, -0.5841, 1.7528, -0.7718, -0.4438, 0.6529, 0.1500, 1.2592]) In [4]: topk_values, topk_indices = torch.topk(logit, 3) In [5]: topk_values Out[5]: tensor([1.7528, 1.2592, 0.6529]) In [6]: logit[logit < topk_values[-1]] = float('-inf') In [7]: logit Out[7]: tensor([ -inf, -inf, -inf, -inf, 1.7528, -inf, -inf, 0.6529, -inf, 1.2592]) In [8]: probs = logit.softmax(0) In [9]: probs Out[9]: tensor([0.0000, 0.0000, 0.0000, 0.0000, 0.5146, 0.0000, 0.0000, 0.1713, 0.0000, 0.3141]) In [10]: torch.multinomial(probs, num_samples=1) Out[10]: tensor([9]) In [11]: torch.multinomial(probs, num_samples=1) Out[11]: tensor([4]) In [12]: torch.multinomial(probs, num_samples=1) Out[12]: tensor([9])●    We first create a fake logit with torch.randn. Supposedly logit is the raw output of a network.●    We use torch.topk to select the top 3 values from logit. torch.topk returns top 3 values along with their indices. The values are sorted from top to bottom.●    We use advanced indexing to select logit values that are lower than the last top 3 values. When we say logit < topk_values[-1] we mean all the numbers in logit that are lower than topk_values[-1] (0.6529). ●    After selecting those numbers, we replace their value to float(‘-inf’), which is a negative infinite number. ●    After replacement, we run softmax over the logit to convert it into probabilities. ●    Now, we use torch.multinomial to sample from the probs.Nucleus samplingNucleus sampling is like Top-k sampling but with a dynamic selection of top tokens instead of selecting k tokens. The dynamic selection is better when we are unsure of selecting a suitable k for Top-k sampling. Nucleus sampling has a hyperparameter named p, let us say it is 0.9, and this method selects tokens from descending order and adds up their probabilities and when we reach a cumulative sum of p, we stop. What is the cumulative sum? Here’s an example of cumulative sum:In [1]: import torch In [2]: logit = torch.randn(10) In [3]: probs = logit.softmax(0) In [4]: probs Out[4]: tensor([0.0652, 0.0330, 0.0609, 0.0436, 0.2365, 0.1738, 0.0651, 0.0692, 0.0495, 0.2031]) In [5]: [probs[:x+1].sum() for x in range(probs.size(0))] Out[5]: [tensor(0.0652), tensor(0.0983), tensor(0.1592), tensor(0.2028), tensor(0.4394), tensor(0.6131), tensor(0.6782), tensor(0.7474), tensor(0.7969), tensor(1.)]I hope you understand how cumulative sum works from the code. We just add up n previous prob values. 
We can also use torch.cumsum and get the same result:In [9]: torch.cumsum(probs, dim=0) Out[9]: tensor([0.0652, 0.0983, 0.1592, 0.2028, 0.4394, 0.6131, 0.6782, 0.7474, 0.7969, 1.0000]) Okay. Here’s a nucleus sampling from scratch: In [1]: import torch In [2]: logit = torch.randn(10) In [3]: probs = logit.softmax(0) In [4]: probs Out[4]: tensor([0.7492, 0.0100, 0.0332, 0.0078, 0.0191, 0.0370, 0.0444, 0.0553, 0.0135, 0.0305]) In [5]: sprobs, indices = torch.sort(probs, dim=0, descending=True) In [6]: sprobs Out[6]: tensor([0.7492, 0.0553, 0.0444, 0.0370, 0.0332, 0.0305, 0.0191, 0.0135, 0.0100, 0.0078]) In [7]: cs_probs = torch.cumsum(sprobs, dim=0) In [8]: cs_probs Out[8]: tensor([0.7492, 0.8045, 0.8489, 0.8860, 0.9192, 0.9497, 0.9687, 0.9822, 0.9922, 1.0000]) In [9]: selected_tokens = cs_probs < 0.9 In [10]: selected_tokens Out[10]: tensor([ True, True, True, True, False, False, False, False, False, False]) In [11]: probs[indices[selected_tokens]] Out[11]: tensor([0.7492, 0.0553, 0.0444, 0.0370]) In [12]: probs = probs[indices[selected_tokens]] In [13]: torch.multinomial(probs, num_samples=1) Out[13]: tensor([0])●    Convert the logit to probabilities and sort it with descending order so that we can select them from top to bottom.●    Calculate cumulative sum.●    Using advanced indexing, we filter out values.●    Then, we sample from a limited and better space.Please note that you can use a combination of top-k and nucleus samplings. It is like selecting k tokens and doing nucleus sampling on these k tokens. You can also use top-k, nucleus, and beam search.ConclusionUnderstanding these methods is crucial for anyone working with language models, natural language processing, or text generation tasks. These techniques play a significant role in generating coherent and diverse sequences of text. Depending on the specific use case and desired outcomes, readers can choose the most appropriate method to employ. Overall, this knowledge can contribute to improving the quality of generated text and enhancing the capabilities of language models.Author BioSaeed Dehqan trains language models from scratch. Currently, his work is centered around Language Models for text generation, and he possesses a strong understanding of the underlying concepts of neural networks. He is proficient in using optimizers such as genetic algorithms to fine-tune network hyperparameters and has experience with neural architecture search (NAS) by using reinforcement learning (RL). He implements models starting from data gathering to monitoring, and deployment on mobile, web, cloud, etc. 

Text Classification with Transformers

Saeed Dehqan
28 Aug 2023
9 min read
Introduction
This blog aims to implement binary text classification using a transformer architecture. If you're new to transformers, the "Transformer Building Blocks" blog explains the architecture and its text generation implementation. Beyond text generation and translation, transformers serve classification, sentiment analysis, and speech recognition. The transformer model comprises two parts: an encoder and a decoder. The encoder extracts features, while the decoder processes them. Just as a painter with tree features can draw, describe, visualize, categorize, or write about a tree, transformers encode knowledge (encoder) and apply it (decoder). This dual-part process is pivotal for text classification with transformers, allowing them to excel in diverse tasks like sentiment analysis, illustrating their transformative role in NLP.
Deep Dive into Text Classification with Transformers
We train the model on the IMDB dataset. The dataset is ready, and no preprocessing is needed. The model is vocab-based instead of character-based so that it can converge faster. I limited the vocabulary to the 20,000 most frequent tokens and reduced the sequence length to 200 so we can train faster. I also tried to simplify the model and use torch.nn.MultiheadAttention instead of writing the multihead attention ourselves. This makes the model faster, since nn.MultiheadAttention uses scaled_dot_product_attention under the hood. If you want to know how multihead attention works, you can study the transformer building blocks blog or see the code here.
Okay, now let us add the feature extractor part:
class transformer_block(nn.Module):
    def __init__(self):
        super().__init__()
        self.attention = nn.MultiheadAttention(embeds_size, num_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(embeds_size, 4 * embeds_size),
            nn.LeakyReLU(),
            nn.Linear(4 * embeds_size, embeds_size),
        )
        self.drop1 = nn.Dropout(drop_prob)
        self.drop2 = nn.Dropout(drop_prob)
        self.ln1 = nn.LayerNorm(embeds_size, eps=1e-6)
        self.ln2 = nn.LayerNorm(embeds_size, eps=1e-6)

    def forward(self, hidden_state):
        attn, _ = self.attention(hidden_state, hidden_state, hidden_state, need_weights=False)
        attn = self.drop1(attn)
        out = self.ln1(hidden_state + attn)
        observed = self.ffn(out)
        observed = self.drop2(observed)
        return self.ln2(out + observed)
●    hidden_state: A tensor with shape (batch_size, block_size, embeds_size) goes into the transformer_block, and a tensor with the same shape comes out of it.
●    self.attention: The transformer block tries to combine the information of tokens so that each token is aware of its neighbors or of the other tokens in the context. We may call this part the communication part. That's what nn.MultiheadAttention does: it is a ready-made multihead attention layer that can be faster than implementing it from scratch, as we did in the "Transformer Building Blocks" blog. The parameters of nn.MultiheadAttention are as follows:
     ○    embeds_size: token embedding size
     ○    num_heads: multihead attention, as the name suggests, consists of multiple heads, and each head works on a different part of the token embeddings. Suppose your input data has shape (B,T,C) = (10, 32, 16), so the token embedding size is 16. If we set the num_heads parameter to 2 (16 is divisible by 2), the multihead attention splits the data into two parts with shape (10, 32, 8).
The first head works on the first part and the second head works on the second part. This is because transforming data into different subspaces can help the model see different aspects of the data. Please note that the embedding size should be divisible by num_heads so that at the end we can concatenate the split parts.
     ○    batch_first: True means the first dimension is the batch.
●    Dropout: After the attention layer, the communication between tokens is closed and computations are done on each token individually. We run dropout on the tokens. Dropout is a regularization method. Regularization helps the training process to be based on generalization, not memorization. Without regularization, the model tries to memorize the training set and performs poorly on the test set. The dropout method turns off features with a probability of drop_prob.
●    self.ln1: Layer normalization normalizes the embeddings so that they have zero mean and a standard deviation of one.
●    Residual connection: hidden_state + attn: Observe that before normalization, we added the input to the output of the multihead attention; this is called a residual connection. It has two benefits:
   ○    It helps the model keep the unchanged embedding information.
   ○    It helps prevent gradient vanishing, which is common in deep networks where we stack multiple transformer layers.
●    self.ffn: After dropout, the residual connection, and normalization, we forward the data into a simple non-linear neural network to adjust the tokens one by one for a better representation.
●    self.ln2(out + observed): Finally, another dropout, residual connection, and layer normalization.
The transformer block is ready. And here is the final piece:
class transformer(nn.Module):
    def __init__(self):
        super().__init__()
        self.tok_embs = nn.Embedding(vocab_size, embeds_size)
        self.pos_embs = nn.Embedding(block_size, embeds_size)
        self.block = transformer_block()
        self.ln1 = nn.LayerNorm(embeds_size)
        self.ln2 = nn.LayerNorm(embeds_size)
        self.classifier_head = nn.Sequential(
            nn.Linear(embeds_size, embeds_size),
            nn.LeakyReLU(),
            nn.Dropout(drop_prob),
            nn.Linear(embeds_size, embeds_size),
            nn.LeakyReLU(),
            nn.Linear(embeds_size, num_classes),
            nn.Softmax(dim=1),
        )
        print("number of parameters: %.2fM" % (self.num_params()/1e6,))

    def num_params(self):
        n_params = sum(p.numel() for p in self.parameters())
        return n_params

    def forward(self, seq):
        B, T = seq.shape
        embedded = self.tok_embs(seq)
        embedded = embedded + self.pos_embs(torch.arange(T, device=device))
        output = self.block(embedded)
        output = output.mean(dim=1)
        output = self.classifier_head(output)
        return output
●    self.tok_embs: nn.Embedding is like a lookup table that receives a sequence of indices and returns their corresponding embeddings. These embeddings will receive gradients so that the model can update them to make better predictions.
●    self.pos_embs: Just like self.tok_embs, but for positions. To comprehend a sentence, you not only need the words, you also need their order. Here, we embed positions and add them to the token embeddings, so the model has both the words and their order.
●    self.block: In this model, we only use one transformer block, but you can stack more blocks to get better results.
●    self.classifier_head: This is where we put the extracted information into action to classify the sequence. We call it the transformer head. It receives a fixed-size vector and classifies the sequence.
The softmax as the final activation function returns a probability distribution for each class.●    self.tok_embs(seq): Given a sequence of indices (batch_size, block_size), it returns (batch_size, block_size, embeds_size).●    self.pos_embs(torch.arange(T, device=device)): Given a sequence of positions, i.e. [0,1,2], it returns embeddings of each position. Then, we add them to the token embeddings.●    self.block(embedded): The embedding goes to the transformer block to extract features. Given the embedded shape (batch_size, block_size, embeds_size), the output has the same shape (batch_size, block_size, embeds_size).●    output.mean(dim=1): The purpose of using mean is to aggregate the information from the sequence into a compact representation before feeding it into self.classifier_head. It helps in reducing the spatial dimensionality and extracting the most important features from the sequence. Given the input shape (batch_size, block_size, embeds_size), the output shape is (batch_size, embeds_size). So, one fixed-size vector for each batch.●    self.classifier_head(output): And here we classify.The final code can be found here. The remaining code consists of downstream tasks such as the training loop, loading the dataset, setting the hyperparameters, and optimizer. I used RMSprop instead of Adam and AdamW. I also used BCEWithLogitsLoss instead of cross-entropy loss. BCE(Binary Cross Entropy) is for binary classification models and it combines sigmoid with cross entropy and it is numerically more stable. I also empirically got better accuracy. After 30 epochs, the final accuracy is ~84%.ConclusionThis exploration of text classification using transformers reveals their revolutionary potential. Beyond text generation, transformers excel in sentiment analysis. The encoder-decoder model, analogous to a painter interpreting tree feature, propels efficient text classification. A streamlined practical approach and the meticulously crafted transformer block enhance the architecture's robustness. Through optimization methods and loss functions, the model is honed, yielding an empirically validated 84% accuracy after 30 epochs. This journey highlights transformers' disruptive impact on reshaping AI-driven language comprehension, fundamentally altering the landscape of Natural Language Processing.Author BioSaeed Dehqan trains language models from scratch. Currently, his work is centered around Language Models for text generation, and he possesses a strong understanding of the underlying concepts of neural networks. He is proficient in using optimizers such as genetic algorithms to fine-tune network hyperparameters and has experience with neural architecture search (NAS) by using reinforcement learning (RL). He implements models starting from data gathering to monitoring, and deployment on mobile, web, cloud, etc. 

Designing Decoder-only Transformer Models like ChatGPT

Saeed Dehqan
28 Aug 2023
9 min read
IntroductionEmbark on an enlightening journey into the ChatGPT stack, a remarkable feat in AI-driven language generation. Unveiling its evolution from inception to a proficient AI assistant, we delve into decoder-only transformers, specialized for crafting Shakespearean verses and informative responses.Throughout this exploration, we dissect the four integral stages that constitute the ChatGPT stack. From exhaustive pretraining to fine-tuned supervised training, we unravel how rewards and reinforcement learning refine response generation to align with context and user intent.In this blog, we will get acquainted briefly with the ChatGPT stack and then implement a simple decoder-only transformer to train on Shakespeare.Creating ChatGPT models consists of four main stages:1.    Pretraining:2.    Supervised Fine Tuning3.    Reward modeling4.    Reinforcement learningThe Pretraining stage takes most of the computational time since we train the language model on trillions of tokens. The following table shows the Data Mixtures used for pretraining of LLaMA Meta Models [0]:The datasets come and mix together, according to the sampling proportion, to create the pretraining data. The table shows the datasets along with their corresponding sampling proportion (What portion of the pre-trained data is the dataset?), epochs (How many times do we train the model on the corresponding datasets?), and dataset size. It is obvious that the epoch of high-quality datasets such as Wikipedia, and Books is high and as a result, the model grasps high-quality datasets better.After we have our dataset ready, the next step is Tokenization before training. Tokenizing data means mapping all the text data into a large list of integers. In language modeling repositories, we usually have two dictionaries for mapping tokens (a token is a sub word. Like ‘wait’, and ‘ing’ are two tokens.) into integers and vice versa. Here is an example:In [1]: text = "it is obvious that the epoch of high .." In [2]: tokens = list(set(text.split())) In [3]: stoi = {s:i for i,s in enumerate(tokens)} In [4]: itos = {i:s for s,i in stoi.items()} In [5]: stoi['it'] Out[5]: 22 In [6]: itos[22] Out[6]: 'it'Now, we can tokenize texts with the following functions:In [7]: encode = lambda text: [stoi[x] for x in text.split()] In [8]: decode = lambda encoded: ' '.join([itos[x] for x in encoded]) In [9]: tokenized = encode(text) In [10]: tokenized Out[10]: [22, 19, 18, 5, ...] In [11]: decode(tokenized) Out[11]: 'it is obvious that the epoch of high ..'Suppose the tokenized variable contains all the tokens converted to integers (say 1 billion tokens). We select 3 chunks of the list randomly that each chunk contains 10 tokens and feed-forward them into a transformer language model to predict the next token. The model’s input has a shape (3, 10), here 3 is batch size and 5 is context length. The model tries to predict the next token for each chunk independently. We select 3 chunks and predict the next token for each chunk to speed up the training process. It is like running the model on 3 chunks of data at once. You can increase the batch size and context length depending on the requirements and resources. Here’s an example:For convenience, we wrote the token indices along with the corresponding tokens. For each chunk or sequence, the model predicts the whole sequence. Let’s see how this works:By seeing the first token (it), the model predicts the next token (is). The context token(s) is ‘it’ and the target token for the model is ‘is’. 
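In code, this context/target relationship is just a shift by one position. The following is a minimal sketch of how such a batch could be built and scored; tokenized comes from the encode function above, while block_size, batch_size, and the model interface are illustrative assumptions rather than code from this article:
import torch
import torch.nn.functional as F

block_size, batch_size = 10, 3                      # context length and number of chunks per step (illustrative)
data = torch.tensor(tokenized, dtype=torch.long)    # tokenized: output of encode() above
ix = torch.randint(len(data) - block_size, (batch_size,))
x = torch.stack([data[i:i + block_size] for i in ix])          # context tokens, shape (3, 10)
y = torch.stack([data[i + 1:i + block_size + 1] for i in ix])  # targets: the same chunks shifted by one

logits = model(x)                                   # (batch_size, block_size, vocab_size) -- assumed interface
loss = F.cross_entropy(logits.view(-1, logits.size(-1)), y.view(-1))
loss.backward()                                     # backpropagation adjusts the parameters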
If the model fails to predict the target token, we do backpropagation to adjust model parameters so the model can predict correctly.During the process, we mask out or hide the future tokens so that the model can’t have access to the future tokens. Because it is kind of cheating. We want the model itself to predict the future by only seeing the past tokens. That makes sense, right? That’s why we used a gray background for future tokens, which means the model is not able to see them.After predicting the second token, we have two tokens [it, is] as context to predict what token comes next in the sequence. Here is the third token (obvious).By using the three previous tokens [it, is, obvious], the model needs to predict the fourth token (that). And as usual, we hide the future tokens (in this case ‘the’).We give [it, is, obvious, that] to the model as the context in order to predict ‘the’. And finally, we give all the sequence as context [it, is, obvious, that, the] to predict the next token.We have five predictions for a sequence with a length of five.After training the model on a lot of randomly selected sequences from the pre-trained dataset, the model should be ready to autocomplete your sequence. Give it a sequence of tokens, and then, it predicts the next token and based on what was predicted plus previous tokens, the model predicts the next tokens one by one. We call it an autoregressive model. That’s it.But, at this stage, the model is not an AI assistant or a chatbot. It only receives a sequence and tries to complete the sequence. That’s how we trained the model. We don’t train it to answer questions and listen to the instructions. We give it context tokens and the model tries to predict the next token based on the context.You give it this:“In order to be irrational, you first need to”And the model continues the sequence:“In order to be irrational, you first need to abandon logical reasoning and disregard factual evidence.”Sometimes, you ask it an instruction:“Write a function to count from 1 to 100.”And instead of trying to write a function, the model answers with more similar instructions:“Write a program to sort an array of integers in ascending order.”“Write a script to calculate the factorial of a given number.”“Write a method to validate a user's input and ensure it meets the specified criteria.”“Write a function to check if a string is a palindrome or not.”That’s where prompt engineering came in. People tried to use some tricks to get the answer to a question out of the model.Give the model the following prompt:“London is the capital of England.Copenhagen is the capital of Denmark.Oslo is the capital of”The model answers like this:“Norway.”So, we managed to get something helpful out of it with prompt engineering. But we don’t want to provide examples every time. We want to ask it a question and receive an answer. To prepare the model to be an AI assistant, we need further training named Supervised Fine Tuning for instructional purposes.In the Supervised Fine-Tuning stage, we make the model instructional. To achieve this goal the model needs training on a high quality 15k-100K of prompt and response dataset. Here’s an example of it: { "instruction": "When was the last flight of Concorde?", "context": "", "response": "On 26 November 2003", "category": "open_qa" }This example was taken from the databricks-dolly-15k dataset that is an open-source dataset for Supervised/Instruction Fine Tuning[1]. You can download the dataset from here. 
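Before looking at the categories, here is a minimal sketch of how a single record like the one above could be flattened into one training string. The template markers and the helper name are illustrative assumptions, not the format used by ChatGPT or any particular model:
def format_example(record):
    # record: a dict with "instruction", "context", and "response" keys, as in databricks-dolly-15k
    prompt = f"### Instruction:\n{record['instruction']}\n"
    if record["context"]:
        prompt += f"### Context:\n{record['context']}\n"
    prompt += "### Response:\n"
    return prompt + record["response"]

example = {
    "instruction": "When was the last flight of Concorde?",
    "context": "",
    "response": "On 26 November 2003",
    "category": "open_qa",
}
full_text = format_example(example)
# full_text is tokenized and trained on exactly like a pretraining sequence;
# commonly the loss is computed only on the response tokens.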
Instructions have seven categorizations including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization. This is because we want to train the model in different tasks. For instance, the above instruction is open QA, meaning the question is a general one and does not require reasoning abilities. It teaches the model to answer general questions. Closed QA requires reasoning abilities. During Instruction fine-tuning, nothing will change algorithmically. We do the same process as the previous stage (Pretraining). We gave instructions as context tokens and we want the model to continue the sequence with response.We continue this process for thousands of examples and then, the model is ready to be instructional. But that’s not the end of the story of the model behind ChatGPT. OpenAI designed a supervised reward modeling that returns a reward for the sequences that were made by the base model for the same input prompt. They give the model a prompt and run the model four times, for instance, to have four different answers for the same prompt. The model produces different answers each time because of the sampling method they use. Then, the reward model receives the input prompt and the produced answers to get a reward score for each answer. The better the answer, the better the reward score is. The model requires ground-truth scores to be trained and these scores came from labelers who worked for OpenAI. Labelers were given prompt text and model responses and they ranked them from the best to the worst.At the final stage, the ChatGPT uses Reinforcement Learning with Human Feedback (RLHF) to generate responses that get the best scores from the rewarding model. RL is an architecture that tries to find the best way of achieving a goal. The goal can be checkmate in chess or creating the best answer for the input prompt. The RL learning process is like doing an action and getting a reward or penalty for the action. And we do not take actions that end up penalizing. RLHF is what made ChatGPT so good:The PPO-ptx shows the win rate of GPT + RLHF compared to SFT (Supervised Fine-Tuned model), GPT with prompt engineering, and GPT base.ConclusionIn summation, the ChatGPT stack exemplifies AI's potent fusion with language generation. From inception to proficient AI assistant, we've traversed core stages – pretraining, fine-tuning, and reinforcement learning. Decoder-only transformers have enlivened Shakespearean text and insights.Tokenization's role in enabling ChatGPT's prowess concludes our journey. This AI evolution showcases technology's synergy with creative text generation.ChatGPT's ascent highlights AI's potential to emulate human-like language understanding. With ongoing refinement, the future promises versatile conversational AI that bridges artificial intelligence and language's artistry, fostering human-AI understanding.Author BioSaeed Dehqan trains language models from scratch. Currently, his work is centered around Language Models for text generation, and he possesses a strong understanding of the underlying concepts of neural networks. He is proficient in using optimizers such as genetic algorithms to fine-tune network hyperparameters and has experience with neural architecture search (NAS) by using reinforcement learning (RL). He implements models starting from data gathering to monitoring, and deployment on mobile, web, cloud, etc. 

Generative Fill with Adobe Firefly (Part II)

Joseph Labrecque
24 Aug 2023
9 min read
Adobe Firefly OverviewAdobe Firefly is a new set of generative AI tools which can be accessed via https://firefly.adobe.com/ by anyone with an Adobe ID. To learn more about Firefly… have a look at their FAQ.  Image 1: Adobe FireflyFor more information about the usage of Firefly to generate images, text effects, and more… have a look at the previous articles in this series:      Animating Adobe Firefly Content with Adobe Animate      Exploring Text to Image with Adobe Firefly      Generating Text Effects with Adobe Firefly       Adobe Firefly Feature Deep Dive       Generative Fill with Adobe Firefly (Part I)This is the conclusion of a two-part article. You can catch up by reading Generative Fill with Adobe Firefly (Part I). In this article, we’ll continue our exploration of Firefly with the Generative fill module by looking at how to use the Insert and Replace features… and more.Generative Fill – Part I RecapIn part I of our Firefly Generative fill exploration, we uploaded a photograph of a cat, Poe, to the AI and began working with the various tools to remove the background and replace it with prompt-based generative AI content.Image 2: The original photograph of PoeNote that the original photograph includes a set of electric outlets exposed within the wall. When we remove the background, Firefly recognizes that these objects are distinct from the general background and so retains them.Image 3: A set of backgrounds is generated for us to choose fromYou can select any of the four variations that were generated from the set of preview thumbnails beneath the photograph.Again, if you’d like to view these processes in detail – check out Generative Fill with Adobe Firefly (Part I).Insert and Replace with Generative FillWe covered generating a background for our image in part I of this article. Now we will focus on other aspects of Firefly Generative fill, including the Remove and Insert tools.Consider the image above and note that the original photograph included a set of electric outlets exposed within the wall. When we removed the background in part I, Firefly recognized that they were distinct from the general background and so retained them. The AI has taken them into account when generating the new background… but we should remove them.This is where the Remove tool comes into play.Image 4: The Remove toolSwitching to the Remove tool will allow you to brush over an area of the photograph you’d like to remove. It fills in the removed area with pixels generated by the AI to create seamless removal.1.               Select the Remove tool now. Note that when switching between the Insert and Remove tools, you will often encounter a save prompt as seen below. If there are no changes to save, this prompt will not appear!Image 5: When you switch tools… you may be asked to save your work2.               Simply click the Save button to continue – as choosing the Cancel button will halt the tool selection.3.               With the Remove tool selected, you can adjust the Brush Settings from the toolbar below the image, at the bottom of the screen.Image 6: The Brush Settings overlay4.               Zoom in closer to the wall outlet and brush over the area by clicking and dragging with your mouse. The size of your brush, depending upon brush settings, will appear as a circular outline. You can change the size of the brush by tapping the [ or] keys on your keyboard.Image 7: Brushing over the wall outlet with the Remove tool5.               
Once you are happy with the selection you’ve made, click the Remove button within the toolbar at the bottom of the screen.Image 8: The Remove button appears within the toolbar6.               The Firefly AI uses Generative fill to replace the brushed-over area with new content based upon the surrounding pixels. A set of four variations appears below the photograph. Click on each one to preview – as they can vary quite a bit.Image 9: Selecting a fill variant7.               Klick the Keep button in the toolbar to save your selection and continue editing. Remember – if you attempt to switch tools before saving… Firefly will prompt you to save your edits via a small overlay prompt.The outlet has now been removed and the wall is all patched up.Aside from the removal of objects through Generative fill, we can also perform insertions based on text prompts. Let’s add some additional elements to our photograph using these methods.  1.               Select the Insert tool from the left-hand toolbar.2.               Use it in a similar way as we did the Remove tool to brush in a selection of the image. In this case, we’ll add a crown to Poe’s head – so brush in an area that contains the top of his head and some space above it. Try and visualize a crown shape as you do this.3.               In the prompt input that appears beneath the photograph, type in a descriptive text prompt similar to the following: “regal crown with many jewels”Image 10: A selection is made, and a text prompt inserted4.               Click the Generate button to have the Firefly AI perform a Generative fill insertion based upon our text prompt as part of the selected area.Image 11: Poe is a regal cat5.               A crown is generated in accordance with our text prompt and the surrounding area. A set of four variations to choose from appears as well. Note how integrated they appear against the original photographic content.6.               Click the Keep button to commit and save your crown selection.7.               Let’s add a scepter as well. Brush the general form of a scepter across Poe’s body extending from his paws to his shoulder.8.               Type in the text prompt: “royal scepter”Image 12: Brushing in a scepter shape9.               Click the Generate button to have the Firefly AI perform a Generative fill insertion based upon our text prompt as part of the selected area.Image 13: Poe now holds a regal scepter in addition to his crown10.            Remember to choose a scepter variant and click the Keep button to commit and save your scepter selection.Okay! That should be enough regalia to satisfy Poe. 
Let’s download our creation for distribution or use in other software.Downloading your ImageClick the Download button in the upper right of the screen to begin the download process for your image.Image 14: The Download buttonAs Firefly begins preparing the image for download, a small overlay dialog appears.Image 15: Content credentials are applied to the image as it is downloadedFirefly applies metadata to any generated image in the form of content credentials and the image download process begins.Once the image is downloaded, it can be viewed and shared just like any other image file.Image 16: The final image from our exploration of Generative fillAlong with content credentials, a small badge is placed upon the lower right of the image which visually identifies the image as having been produced with Adobe Firefly.That concludes our set of articles on using Generative fill to remove and insert objects into your images using the Adobe Firefly AI. We have a number of additional articles on Firefly procedures on the way… including Generative recolor for vector artwork!Author BioJoseph Labrecque is a Teaching Assistant Professor, Instructor of Technology, University of Colorado Boulder / Adobe Education Leader / Partner by DesignJoseph is a creative developer, designer, and educator with nearly two decades of experience creating expressive web, desktop, and mobile solutions. He joined the University of Colorado Boulder College of Media, Communication, and Information as faculty with the Department of Advertising, Public Relations, and Media Design in Autumn 2019. His teaching focuses on creative software, digital workflows, user interaction, and design principles and concepts. Before joining the faculty at CU Boulder, he was associated with the University of Denver as adjunct faculty and as a senior interactive software engineer, user interface developer, and digital media designer.Labrecque has authored a number of books and video course publications on design and development technologies, tools, and concepts through publishers which include LinkedIn Learning (Lynda.com), Peachpit Press, and Adobe. He has spoken at large design and technology conferences such as Adobe MAX and for a variety of smaller creative communities. He is also the founder of Fractured Vision Media, LLC, a digital media production studio and distribution vehicle for a variety of creative works.Joseph is an Adobe Education Leader and member of Adobe Partners by Design. He holds a bachelor’s degree in communication from Worcester State University and a master’s degree in digital media studies from the University of Denver.Author of the book: Mastering Adobe Animate 2023

Generative Fill with Adobe Firefly (Part I)

Joseph Labrecque
24 Aug 2023
8 min read
Adobe Firefly AI Overview Adobe Firefly is a new set of generative AI tools that can be accessed via https://firefly.adobe.com/ by anyone with an Adobe ID. To learn more about Firefly… have a look at their FAQ.    Image 1: Adobe Firefly For more information about the usage of Firefly to generate images, text effects, and more… have a look at the previous articles in this series:  Animating Adobe Firefly Content with Adobe Animate  Exploring Text to Image with Adobe Firefly  Generating Text Effects with Adobe Firefly  Adobe Firefly Feature Deep Dive In the next two articles, we’ll continue our exploration of Firefly with the Generative fill module. We’ll begin with an overview of accessing Generative fill from a generated image and then explore how to use the module on our own personal images.  Recall from a previous article Exploring Text to Image with Adobe Firefly that when you hover your mouse cursor over a generated image – overlay controls will appear.  Image 2: Generative fill overlay control from Text to image  One of the controls in the upper right of the image frame will invoke the Generative fill module and pass the generated image into that view.   Image 3: The generated image is sent to the Generative fill module Within the Generative fill module, you can use any of the tools and workflows that are available when invoking Generative fill from the Firefly website. The only difference is that you are passing in a generated image rather than uploading an image from your local hard drive.  Keep this in mind as we continue to explore the basics of Generative fill in Firefly – as we’ll begin the process from scratch. Generative Fill When you first enter the Firefly web experience, you will be presented with the various workflows available.  These appear as UI cards and present a sample image, the name of the procedure, a procedure description, and either a button to begin the process or a label stating that it is “in exploration”. Those which are in exploration are not yet available to general users. We want to locate the Generative fill module and click the Generate button to enter the experience.   Image 4: The Generative fill module card From there, you’ll be taken to a view that prompts you to upload an image into the module. Firefly also presents a set of sample images you can load into the experience.    Image 5: Generative fill getting started promptly Clicking the Upload image button summons a file browser for you to locate the file you want to use Generative fill on. In my example, I’ll be using a photograph of my cat, Poe. You can download the photograph of Poe [[ NOTE – LINK TO FILE Poe.jpg ]] to work with as well.   Image 6: The photograph of Poe, a cat Once the image file has been uploaded into Firefly, you will be taken to the Generative fill user experience and the photograph will be visible. Note that this is exactly the same experience as when entering Generative fill from a prompt-generated image as we saw above. The only real difference is how we get to this point.   Image 7: The photograph is loaded into Generative fill You will note that there are two sets of tools available within the experience. One set is along the left side of the screen and includes Insert, Remove, and Pan tools.   Image 8: Insert, Remove, and Pan Switching between the Insert and Remove tools changes the function of the current process. The Pan tool allows you to pan the image around the view.  Along the bottom of the screen is the second set of tools – which are focused on selections. 
This set contains the Add and Subtract tools, access to Brush Settings, a Background removal process, and a selection Invert toggle.   Image 9: Add, Subtract, Brush Settings, Background removal, and selection Invert Let’s perform some Generative fill work on the photograph of Poe.  In the larger overlay along the bottom of the view, locate and click the Background option. This is an automated process that will detect and remove the background from the image loaded into Firefly.   Image 10: The background is removed from the selected photograph 2. A prompt input appears directly beneath the photograph. Type in the following prompt: “a quiet jungle at night with lots of mist and moonlight”  Image 11: Entering a prompt into the prompt input control 3. If desired, you can view and adjust the settings for the generative AI by clicking the Settings icon in the prompt input control. This summons the Settings overlay.  Image 12: The generative AI Settings overlay Within the Settings overlay, you will find there are three items that can be adjusted to influence the AI:  Match shape: You have two choices here – freeform or conform.  Preserve content: A slider that can be set to include more of the original content or produce new content. Guidance strength: A slider that can be set to provide more strength to the original image or the given prompt. I suggest leaving these at the default setting for now. 4. Click the Settings icon again to dismiss the overlay. 5. Click the Generate button to generate a background based upon the entered prompt. A new background is generated from our prompt, and it now appears as though Poe is visiting a lush jungle at night.   Image 13: Poe enjoying the jungle at night Note that the original photograph included a set of electric outlets exposed within the wall. When we removed the background, Firefly recognized that they were distinct from the general background and so retained them. The AI has taken them into account when generating the new background and has interestingly propped them up with a couple of sticks. It also has gone through and rendered a realistic shadow cast by Poe.  Before moving on, click the Cancel button to bring the transparent background back. Clicking the Keep button will commit the changes – and we do not want that as we wish to continue exploring other options. Clear out the prompt you previously wrote within the prompt input control so that there is no longer any prompt present.   Image 14: Click the Generate button with no prompt present 3. Click the Generate button without a text prompt in place. The photograph receives a different background from the one generated with a text prompt. When clicking the Generate button with no text prompt, you are basically allowing the Firefly AI to make all the decisions based solely on the visual properties of the image.   Image 15: A set of backgrounds is generated based on the remaining pixels present You can select any of the four variations that were generated from the set of preview thumbnails beneath the photograph. If you’d like Firefly to generate more variations – click the More button. Select the one you like best and click the Keep button. Okay! That’s pretty good but we are not done with Generative fill yet. We haven’t even touched the Insert and Remove functions… and there are Brush Settings to manipulate… and much more. In the next article, we’ll explore the remaining Generative fill tools and options to further manipulate the photograph of Poe.  
Author BioJoseph Labrecque is a Teaching Assistant Professor, Instructor of Technology, University of Colorado Boulder / Adobe Education Leader / Partner by DesignJoseph is a creative developer, designer, and educator with nearly two decades of experience creating expressive web, desktop, and mobile solutions. He joined the University of Colorado Boulder College of Media, Communication, and Information as faculty with the Department of Advertising, Public Relations, and Media Design in Autumn 2019. His teaching focuses on creative software, digital workflows, user interaction, and design principles and concepts. Before joining the faculty at CU Boulder, he was associated with the University of Denver as adjunct faculty and as a senior interactive software engineer, user interface developer, and digital media designer.Labrecque has authored a number of books and video course publications on design and development technologies, tools, and concepts through publishers which include LinkedIn Learning (Lynda.com), Peachpit Press, and Adobe. He has spoken at large design and technology conferences such as Adobe MAX and for a variety of smaller creative communities. He is also the founder of Fractured Vision Media, LLC; a digital media production studio and distribution vehicle for a variety of creative works.Joseph is an Adobe Education Leader and member of Adobe Partners by Design. He holds a bachelor’s degree in communication from Worcester State University and a master’s degree in digital media studies from the University of Denver.Author of the book: Mastering Adobe Animate 2023
Adobe Firefly Feature Deep Dive

Joseph Labrecque
23 Aug 2023
9 min read
Adobe FireflyAdobe Firefly is a new set of generative AI tools which can be accessed via https://firefly.adobe.com/ by anyone with an Adobe ID. To learn more about Firefly… have a look at their FAQ.  Image 1: Adobe FireflyFor more information about Firefly, have a look at the previous articles in this series:       Animating Adobe Firefly Content with Adobe Animate       Exploring Text to Image with Adobe Firefly       Generating Text Effects with Adobe FireflyIn this article, we’ll be exploring some of the more detailed features of Firefly in general. While we will be doing so from the perspective of the text-to-image module, much of what we cover will be applicable to other modules and procedures as well.Before moving on to the visual controls and options… let’s consider accessibility. Here is what Adobe has to say about accessibility within Firefly:Firefly is committed to providing accessible and inclusive features to all individuals, including users working with assistive devices such as speech recognition software and screen readers. Firefly is continuously enhanced to strive to meet the needs of all types of users, including individuals with visual, hearing, cognitive, motor, or other impairments, and is designed to conform to worldwide accessibility standards. -- AdobeYou can use the following keyboard shortcuts across the Firefly interface to navigate and control the software in a non-visual way:       Tab: navigates between user interface controls.       Space/Enter: activates buttons.       Enter: activates links.       Arrow Keys: navigates between options.       Space: selects options.As with most accessibility concerns and practices, these additional controls within Firefly can benefit those users who are not otherwise impaired as well – similar to sight-enabled users making use of captions when watching video-based content.For our exploration of the various additional controls and options within Firefly, we’ll start off with a generated set of images based on a prompt. To review how to achieve this, have a look at the article “Exploring Text to Image with Adobe Firefly”.Choose one of the generated images to work with and hover your mouse across the image to reveal a set of controls.Image 2: Image Overlay OptionsWe will explore each of these options one by one as we continue along with this article.Rating and Feedback OptionsAdobe is very open to feedback with Firefly. One reason is to get general user feedback to improve the experience of using the product… and the other is to influence the generative models so that users receive the output that is expected.Giving a simple thumbs-up or thumbs-down is the most basic level of feedback and is meant to rate the results of your prompt.Image 3: Rating the generated resultsOnce you provide a thumbs-up or thumbs-down… the overlay changes to request additional feedback. You don’t necessarily need to provide more feedback – but clicking on the Feedback button will allow you to go more in-depth in terms of why you provided the initial rating.Image 4: Additional feedback promptClicking the Feedback button will summon a much larger overlay where you can make choices via a checkbox as to why you rated the results the way you did. You also have the option to put a little note in here as well.Image 5: Additional feedback formClicking the Submit Feedback button or the Cancel button will close the overlay and bring you back to the experience.Additionally, there is an option to Report the image to Adobe. 
This is always a negative action – meaning that you find the results offensive or inappropriate in some way.Image 6: Report promptClicking on the Report option will summon a similar form to that of additional feedback, but the options will, of course, be different.Image 7: Report feedback formHere, you can report via a checkbox and add an optional note as part of the report. Adobe has committed to making sure that violence and things like copyrighted or trademarked characters and such are not generated by Firefly.For instance, if you use a prompt such as “Micky Mouse murdering a construction worker with a chainsaw”… you will receive a message like the following:Image 8: Firefly will not render trademarked characters or violenceWith Adobe is being massively careful in filtering certain words right now… I do hope in the future that users will be able to selectively choose exclusions in place of a general list of censored terms as exists now. While the prompt above is meant to be absurd – there are legitimate artistic reasons for many of the word categories which are currently banned.General Image ControlsThe controls in this section include some of the most used in Firefly at the moment – including the ability to download your generated image.Image 9: Image optionsWe have the following controls exposed, from left to right they are named:       Options       Download       FavoriteOptionsStarting at the left-hand side of this group of controls, we begin with an ellipse that represents Options which, when clicked, will summon a small overlay with additional choices.Image 10: Expanded optionsThe menu that appears includes the following items:1.     Submit to Firefly gallery2.     Use as a reference image3.     Copy to the clipboardLet’s examine each of these in detail.You may have noticed that the main navigation of the Firefly website includes a number of options: Home, Gallery, Favorites, About, and FAQ. The Gallery section contains generated images that users have submitted to be featured on this page.Clicking the Submit to Firefly gallery option will summon a submission overlay through which you can request that your image is included in the Gallery.Image 11: Firefly Gallery submissionSimply read over the details and click Continue or Cancel to return.The second item, Use as reference image, brings up a small overlay that includes the selected image to use as a reference along with a strength slider.Image 12: Reference image sliderMoving the slider to the left will favor the reference image and moving it to the right will favor the raw prompt instead. You must click the Generate button after adjusting the slider to see its effect.The final option is Copy to clipboard – which does exactly as you’d expect. Note that Content Credentials are applied in this case just the same as they are when downloading an image. You can read more about this feature in the Firefly FAQ.DownloadBack up to the set of three controls, the middle option allows you to initiate a Download of the selected image. As Firefly begins preparing the image for download, a small overlay dialog appears.Image 13: Download applies content credentials – similar to the Copy to clipboard optionFirefly applies metadata to any generated image in the form of content credentials and the image download process begins. We’ve covered exactly what this means in previous articles. 
The image is then downloaded to your local file system.FavoriteClicking the Favorite control will add the generated image to your Firefly Favorites so that you can return to the generated set of images for further manipulation or to download later on.Image 14: Adding a favoriteThe Favorite control works as a toggle. Once you declare a favorite, the heart icon will appear filled and the control will allow you to un-favorite the selected image instead.That covers the main set of controls which overlay the right of your image – but there is a smaller set of controls on the left that we must explore as well.Additional Manipulation OptionsThe alternative set of controls numbers only two – but they are both very powerful. To the left is the Show similar control and to the right is Generative fill.Image 15: Show similar and Generative fill controlsClicking upon the Show similar control will retain the particular, chosen image while regenerating the other three to be more in conformity with the image specified.Image 16: Show similar will refresh the other three imagesAs you can see when comparing the sets of images in the figures above and below… you can have great influence over your set of generated images through this control.Image 17: The original image stays the sameThe final control we will examine in this article is Generative fill. It is located right next to the Show similar control.The generative fill view presents us with a separate view and a number of all-new tools for making selections in order to add or remove content from our images.Image 18: Generative fill brings you to a different view altogetherGenerative fill is actually its own proper procedure in Adobe Firefly… and we’ll explore how to use this feature in full - in the next article! Author BioJoseph Labrecque is a Teaching Assistant Professor, Instructor of Technology, University of Colorado Boulder / Adobe Education Leader / Partner by DesignJoseph is a creative developer, designer, and educator with nearly two decades of experience creating expressive web, desktop, and mobile solutions. He joined the University of Colorado Boulder College of Media, Communication, and Information as faculty with the Department of Advertising, Public Relations, and Media Design in Autumn 2019. His teaching focuses on creative software, digital workflows, user interaction, and design principles and concepts. Before joining the faculty at CU Boulder, he was associated with the University of Denver as adjunct faculty and as a senior interactive software engineer, user interface developer, and digital media designer.Labrecque has authored a number of books and video course publications on design and development technologies, tools, and concepts through publishers which include LinkedIn Learning (Lynda.com), Peachpit Press, and Adobe. He has spoken at large design and technology conferences such as Adobe MAX and for a variety of smaller creative communities. He is also the founder of Fractured Vision Media, LLC; a digital media production studio and distribution vehicle for a variety of creative works.Joseph is an Adobe Education Leader and member of Adobe Partners by Design. He holds a bachelor’s degree in communication from Worcester State University and a master’s degree in digital media studies from the University of Denver.Author of the book: Mastering Adobe Animate 2023 
Generative Recolor with Adobe Firefly

Joseph Labrecque
23 Aug 2023
10 min read
Adobe Firefly Overview

Adobe Firefly is a new set of generative AI tools which can be accessed via https://firefly.adobe.com/ by anyone with an Adobe ID. To learn more about Firefly… have a look at their FAQ:

Image 1: Adobe Firefly

For more information about the usage of Firefly to generate images, text effects, and more… have a look at the previous articles in this series:

Animating Adobe Firefly Content with Adobe Animate
Exploring Text to Image with Adobe Firefly
Generating Text Effects with Adobe Firefly
Adobe Firefly Feature Deep Dive
Generative Fill with Adobe Firefly (Part I)
Generative Fill with Adobe Firefly (Part II)

This current Firefly article will focus on a unique use of AI prompts via the Generative recolor module.

Generative Recolor and SVG

While most procedures in Firefly are focused on generating imagery through text prompts, the service also includes modules that use prompt-driven AI a bit differently. The subject of this article, Generative recolor, is a perfect example of this.

Generative recolor works with vector artwork in the form of SVG files. If you are unfamiliar with SVG, it stands for Scalable Vector Graphics and is an XML-based format, so it uses text-based nodes similar to HTML:

Image 2: An SVG file is composed of vector information defining points, paths, and colors

As the name indicates, we are dealing with vector graphics here and not photographic, pixel-based bitmap images. Vectors are often used for artwork, logos, and such, as they can be infinitely scaled and easily recolored.

One of the best ways of generating SVG files is by designing them in a vector-based design tool like Adobe Illustrator. Once you have finished designing your artwork, you'll save it as SVG for use in Firefly:

Image 3: Cat artwork designed in Adobe Illustrator

To convert your Illustrator artwork to SVG, perform the following steps:

1. Choose File > Save As to open the Save As dialog.
2. Choose SVG (svg) for the file format:

Image 3: Selecting SVG (svg) as the file format

3. Browse to the location on your computer you would like to save the file to.
4. Click the Save button.

You now have an SVG file ready to recolor within Firefly. If you desire, you can download the provided cat.svg file that we will work on in this article.

Recolor Vector Artwork with Generative Recolor

Generative recolor, like all Firefly modules, can be found directly at https://firefly.adobe.com/ so long as you are logged in with your Adobe ID. From the main Firefly page, you will find a number of modules for different AI-driven tasks:

Image 4: Locate the Generative recolor Firefly module

Let's explore Generative recolor in Firefly:

1. You'll want to locate the module named Generative recolor.
2. Click the Generate button to get started.

You are taken to an intermediate view where you are able to upload your chosen SVG file for the purposes of vector recolor based upon a descriptive text prompt:

Image 5: The Upload SVG button prompt appears, along with sample files

3. Click the Upload SVG button and choose cat.svg from your file system. Of course, you can use any SVG file you want if you have another in mind.
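A brief aside before continuing: because an SVG file is plain XML, the recoloring idea that Firefly automates can also be sketched programmatically. The snippet below is a minimal, hypothetical Python example that parses a file such as cat.svg, collects the fill colors it finds, and swaps them for a new palette. The palette values and file names are assumptions for illustration, the script only handles fills set as element attributes (not styles defined via CSS), and it says nothing about how Firefly itself interprets prompts.

```python
import xml.etree.ElementTree as ET

# A hypothetical palette, loosely inspired by a "northern lights" prompt.
palette = ["#0b3d2e", "#1b998b", "#7b2cbf", "#c7f9cc"]

tree = ET.parse("cat.svg")  # assumes cat.svg sits next to this script
root = tree.getroot()

# Collect each distinct fill color used as an attribute in the artwork...
fills = []
for node in root.iter():
    fill = node.get("fill")
    if fill and fill != "none" and fill not in fills:
        fills.append(fill)

# ...and map every original color onto a color from the new palette.
mapping = {old: palette[i % len(palette)] for i, old in enumerate(fills)}

for node in root.iter():
    fill = node.get("fill")
    if fill in mapping:
        node.set("fill", mapping[fill])

tree.write("cat_recolored.svg")
print(f"Recolored {len(mapping)} fill colors.")
```

With that detour out of the way, back to the Firefly workflow.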
If you do not have an SVG file you’d like to use, you can click on any of the samples presented below the Upload SVG button to load one up into the module.The SVG is uploaded and a control appears which displays a preview of your file along with a text input where you can write a short text prompt describing the color palette you’d like to generate:Image 6: The Generative recolor input requests a text prompt4.     Think of some descriptive words for an interesting color palette and type them into the text input. I’ll input the following simple prompt for this demonstration: “northern lights”.5.     Click the Generate button when ready.You are taken into the primary Generative recolor user interface experience and a set of four color variants is immediately available for preview:Image 7: The Firefly Generative recolor user interfaceThe interface appears similar to what you might have seen in other Firefly modules – but there are some key differences here, since we are dealing with recoloring vector artwork.The larger, left-most area contains a set of four recolor variants to choose from. Below this is the prompt input area which displays the current text prompt and a Refresh button that allows the generation of additional variants when the prompt is updated. To the right of this area are presented various additional options within a clean user interface that scrolls vertically. Let’s explore these from top to bottom.The first thing you’ll see is a thumbnail of your original artwork with the ability to replace the present SVG with a new file:Image 8: You can replace your artwork by uploading a new SVG fileDirectly below this, you will find a set of sample prompts that can be applied to your artwork:Image 9: Sample prompts can provide immediate resultsClicking upon any of these thumbnails will immediately overwrite the existing prompt and cause a refresh – generating a new set of four recolor variants.Next, is a dropdown selection which allows the choice of color harmony:Image 10: A number of color harmonies are availableChoosing to align the recolor prompt with a color harmony will impact which colors are being used based off a combination of the raw prompt – guided by harmonization rules. An indicator will be added along with the text prompt.For more information about color and color harmonies, check out Understanding color: A visual guide – from Adobe.Below is a set of eighteen color swatches to choose from:Image 11: Color chips can add bias to your text promptClicking on any of these swatches will add that color to the options below your text prompt to help guide the recolor process. You can select one or many of these swatches to use.Finally, at the very bottom of this area is a toggle switch that allows you to either preserve black and white colors in your artwork or to recolor them just like any other color:Image 12: You can choose to preserve black and white during a recolor session or notThat is everything along the right-hand side of the interface. We’ll return to this area shortly – but for now… let’s see the options that appear when hovering the mouse cursor over any of the four recolor variants:Image 13: The Generative recolor overlayHovering over a recolor variant will reveal a number of options:       Prominent colors: Displays the colors used in this recolor variant.       Shuffle colors: Will use the same colors… but distribute them differently across the vector artwork.       Options: Copy to clipboard is the only option that is available via this menu.       
Download: Enables the download of this particular recolor variant.       Rate this result: Provide a positive or negative rating of this result.We’ll make use of the Download option in a bit – but first… let’s make use of some of the choices present in the right side panel to modify and guide our recolor.Modifying the PromptYou can always change the text prompt however you wish and click the Refresh button to generate a different set of variants. Let’s instead keep this same text prompt but see how various choices can impact how it affects the recolor results:Image 14: A modified prompt box with options addedFocus again on the right side of the user interface and make the following selections:1.     Select a color harmony: Complementary2.     Choose a couple of colors to weight the prompt: Green and Blue violet3.     Disable the Preserve black and white toggle4.     Click the Refresh button to see the results of these optionsA new set of four recolor variants is produced. This set of variants is guided by the extra choices we made and is vastly different from the original set which was recolored solely based upon the text prompt:Image 15: A new set of recolor variations is generatedPlay with the various options on your own to see what kind of variations you can achieve in the artwork.Downloading your Recolored ArtworkOnce you are happy with one of the generated recolored variants, you’ll want to download it for use elsewhere. Click the Download button in the upper right of the selected variant to begin the download process for your recolored SVG file.The recolored SVG file is immediately downloaded to your computer. Note that unlike other content generated with Firefly, files created with Generative recolor do not contain a Firefly watermark or badge:Image 17: The resulting recolored SVG fileThat’s all there is to it! You can continue creating more recolor variants and freely download any that you find particularly interesting.Before we conclude… note that another good use for Generative recolor – similar to most applications of AI – is for ideation. If you are stuck with a creative block when trying to decide on a color palette for something you are designing… Firefly can help kick-start that process for you.Author BioJoseph is a Teaching Assistant Professor, Instructor of Technology, University of Colorado Boulder / Adobe Education Leader / Partner by DesignJoseph Labrecque is a creative developer, designer, and educator with nearly two decades of experience creating expressive web, desktop, and mobile solutions. He joined the University of Colorado Boulder College of Media, Communication, and Information as faculty with the Department of Advertising, Public Relations, and Media Design in Autumn 2019. His teaching focuses on creative software, digital workflows, user interaction, and design principles and concepts. Before joining the faculty at CU Boulder, he was associated with the University of Denver as adjunct faculty and as a senior interactive software engineer, user interface developer, and digital media designer.Labrecque has authored a number of books and video course publications on design and development technologies, tools, and concepts through publishers which include LinkedIn Learning (Lynda.com), Peachpit Press, and Adobe. He has spoken at large design and technology conferences such as Adobe MAX and for a variety of smaller creative communities. 
He is also the founder of Fractured Vision Media, LLC; a digital media production studio and distribution vehicle for a variety of creative works.Joseph is an Adobe Education Leader and member of Adobe Partners by Design. He holds a bachelor’s degree in communication from Worcester State University and a master’s degree in digital media studies from the University of Denver.Author of the book: Mastering Adobe Animate 2023 
Adobe Firefly Integrations in Illustrator and Photoshop

Joseph Labrecque
23 Aug 2023
12 min read
Adobe Firefly OverviewAdobe Firefly is a new set of generative AI tools which can be accessed via https://firefly.adobe.com/ by anyone with an Adobe ID. To learn more about Firefly… have a look at their FAQ:Image 1: Adobe FireflyFor more information around the usage of Firefly to generate images, text effects, and more… have a look at the previous articles in this series:       Animating Adobe Firefly Content with Adobe Animate       Exploring Text to Image with Adobe Firefly       Generating Text Effects with Adobe Firefly       Adobe Firefly Feature Deep Dive      Generative Fill with Adobe Firefly (Part I)      Generative Fill with Adobe Firefly (Part II)       Generative Recolor with Adobe Firefly       Adobe Firefly and Express (beta) IntegrationThis current Firefly article will focus on Firefly integrations within the release version of Adobe Illustrator and the public beta version of Adobe Photoshop.Firefly in Adobe IllustratorVersion 27.7 is the most current release of Illustrator at the writing of this article and this version contains Firefly integrations in the form of Generative Recolor (Beta).To access this, design any vector artwork within Illustrator or open existing artwork to get started. I’m using the cat.ai file that was used to generate the cat.svg file used in the Generative Recolor with Adobe Firefly article:Image 2: The cat vector artwork with original colors1.     Select the artwork you would like to recolor. Artwork must be selected for this to work.2.     Look to the Properties panel and locate the Quick Actions at the bottom of the panel. Click the Recolor quick action:Image 3: Choosing the Recolor Quick action3.     By default, the Recolor overlay will open with the Recolor tab active. Switch to the Generative Recolor (Beta) tab to activate it instead:Image 4: The Generative Recolor (Beta) view4.     You are invited to enter a prompt. I’ve written “northern lights green and vivid neon” as my prompt that describes colors I’d like to see. There are also sample prompts you can click on below the prompt input box.5.     Click the Generate button once a prompt has been entered:Image 5: Selecting a Recolor variantA set of recolor variants is presented within the overlay. Clicking on any of these will recolor your existing artwork according to the variant look:Image 6: Adding a specific color swatchIf you would like to provide even more guidance, you can modify the prompt and even add specific color swatches you’d like to see included in the recolored artwork.That’s it for Illustrator – very straightforward and easy to use!Firefly in Adobe Photoshop (beta)Generative Fill through Firefly is also making its way into Photoshop. While within Illustrator – we have Firefly as part of the current version of the software, albeit with a beta label on the feature, with Photoshop things are a bit different:Image 7: Generative Fill is only available in the Photoshop public betaTo make use of Firefly within Photoshop, the current release version will not cut it. You will need to install the public beta from the Creative Cloud Desktop application in order to access these features.With that in mind, let’s use Generative Fill in the Photoshop public beta to expand a photograph beyond its bounds and add in additional objects.1.     First, open a photograph in the Photoshop public beta. I’m using the Poe.jpg photograph that we previously used in the articles Generative Fill with Adobe Firefly (Parts I & II):Image 8: The original photograph in Photoshop2.     
With the photograph open, we’ll add some extra space to the canvas to generate additional content and expand the image beyond its bounds. Summon the Canvas Size dialog by choosing Image > Canvas Size… from the application menu.3.     Change both the width and height values to 200 Percent:Image 9: Expanding the size of the canvas4.     Click the OK button to close the dialog and apply the change.The original canvas is expanded to 200 percent of its original size while the image itself remains exactly the same:Image 10: The photograph with an expanded canvasGenerative Fill, when used in this manner to expand an image, works best by selecting portions to expand bit by bit rather than all the expanded areas at once. It is also beneficial to select parts of the original image you want to expand from. This feeds and directs the Firefly AI.5.     Using the Rectangular Marquee tool, make such a selection across either the top, bottom, left, or right portions of the document:Image 11: Making a selection for Generative Fill6.     With a selection established, click Generative Fill within the contextual toolbar:Image 12: Leaving the prompt blank allows Photoshop to make all the decisions7.     The contextual toolbar will now display a text input where you can enter a prompt to guide the process. However, in this case, we want to simply expand the image based upon the original pixels selected – so we will leave this blank with no prompt whatsoever. Click Generate to continue.8.     The AI processes the image and displays a set of variants to choose from within the Properties panel. Click the one that conforms closest to the imagery you are looking to produce and that is what will be used upon the canvas:Image 13: Choosing a Generative Fill variantNote that if you look to the Layers panel, you will find a new layer type has been created and added to the document layer stack:Image 14: Generative Layers are a new layer type in PhotoshopThe Generative Layer retains both the given prompt and variants so that you can continue to make changes and adjustments as needed – even following this specific point in time.The resulting expansion of the original image as performed by Generative Fill can be very convincing! As mentioned before, this often works best by performing fills in a piece-by-piece patchwork manner:Image 15: The photograph with a variant applied across the selectionContinue selecting portions of the image using the Rectangular Marquee tool (or any selection tools, really) and generate new content the same way we have done so already – without supplying any text prompt to the AI:Image 16: The photograph with all expanded areas filled via generative AIEventually, you will complete the expansion of the original image and produce a very convincing deception.Of course, you can also guide the AI with actual text prompts. Let’s add in an object to the image as a demonstration.1.     Using the Lasso tool (or again… any selection tool), make a selection across the image in the form of what might hold a standing lamp of some sort:Image 17: Making an additional selection2.     With a selection established, click Generative Fill within the contextual toolbar.3.     Type in a prompt that describes the object you want to generate. I will use the prompt “tall rustic wooden and metal lamp”.4.     Click the Generate button to process the Generative Fill request:Image 18: A lamp is generated from our selection and text promptA set of generated lamp variants are established within the Properties panel. 
Choose the one you like the most and it will be applied within the image.You will want to be careful with how many Generative Layers are produced as you work on any single document. Keep an eye on the Layers panel as you work:Image 19: Each Generative Fill process produces a new layerEach time you use Generative Fill within Photoshop, a new Generative Layer is produced.Depending upon the resources and capabilities of your computer… this might become burdensome as everything becomes more and more complex. You can always flatten your layers to a single pixel layer if this occurs to free up additional resources.That concludes our overview of Generative Fill in the Photoshop public beta!Ethical Concerns with Generative AII want to make one additional note before concluding this series and that has to do with the ethics of generative AI. This concern goes beyond Adobe Firefly specifically – as it could be argued that Firefly is the least problematic and most ethical implementation of generative AI that is available today.See https://firefly.adobe.com/faq for additional details on steps Adobe has taken to ensure responsible AI through their use of Adobe Stock content to train their models, through the use of Content Credentials, and more...Like all our AI capabilities, Firefly is developed and deployed around our AI ethics principles of accountability, responsibility, and transparency.Data collection: We train our model by collecting diverse image datasets, which have been curated and preprocessed to mitigate against harmful or biased content. We also recognize and respect artists’ ownership and intellectual property rights. This helps us build datasets that are diverse, ethical, and respectful toward our customers and our community.Addressing bias and testing for safety and harm: It’s important to us to create a model that respects our customers and aligns with our company values. In addition to training on inclusive datasets, we continually test our model to mitigate against perpetuating harmful stereotypes. We use a range of techniques, including ongoing automated testing and human evaluation.Regular updates and improvements: This is an ongoing process. We will regularly update Firefly to improve its performance and mitigate harmful bias in its output. We also provide feedback mechanisms for our users to report potentially biased outputs or provide suggestions into our testing and development processes. We are committed to working together with our customers to continue to make our model better.-- AdobeI have had discussions with a number of fellow educators about the ethical use of generative AI and Firefly in general. 
Here are some paraphrased takeaways to consider as we conclude this article series:      “We must train the new generations in the respect and proper use of images or all kinds of creative work.”      “I don't think Ai can capture that sensitive world that we carry as human beings.”      “As dire as some aspects of all of this are, I see opportunities.”      “Thousands of working artists had their life's work unknowingly used to create these images.”       “Professionals will be challenged, truly, by all of this, but somewhere in that process I believe we will find our space.”      “AI data expropriations are a form of digital colonialism.”      “For many students, the notion of developing genuine skill seems pointless now.”     “Even for masters of the craft, it’s dispiriting to see someone type 10 words and get something akin to what took them 10 years.”I’ve been using generative AI for a few years now and can appreciate and understand the concerns expressed above - but also recognize that this technology is not going away. We must do what we can to address the ethical concerns brought up here and make sure to use our awareness of these problematic issues to further guide the direction of these technologies as we rapidly advance forward. These are very challenging times, right now. Author BioJoseph Labrecque is a Teaching Assistant Professor, Instructor of Technology, University of Colorado Boulder / Adobe Education Leader / Partner by DesignJoseph Labrecque is a creative developer, designer, and educator with nearly two decades of experience creating expressive web, desktop, and mobile solutions. He joined the University of Colorado Boulder College of Media, Communication, and Information as faculty with the Department of Advertising, Public Relations, and Media Design in Autumn 2019. His teaching focuses on creative software, digital workflows, user interaction, and design principles and concepts. Before joining the faculty at CU Boulder, he was associated with the University of Denver as adjunct faculty and as a senior interactive software engineer, user interface developer, and digital media designer.Labrecque has authored a number of books and video course publications on design and development technologies, tools, and concepts through publishers which include LinkedIn Learning (Lynda.com), Peachpit Press, and Adobe. He has spoken at large design and technology conferences such as Adobe MAX and for a variety of smaller creative communities. He is also the founder of Fractured Vision Media, LLC; a digital media production studio and distribution vehicle for a variety of creative works.Joseph is an Adobe Education Leader and member of Adobe Partners by Design. He holds a bachelor’s degree in communication from Worcester State University and a master’s degree in digital media studies from the University of Denver.Author of the book: Mastering Adobe Animate 2023 
Getting Started with AWS CodeWhisperer

Rohan Chikorde
23 Aug 2023
11 min read
IntroductionEfficiently writing secure, high-quality code within tight deadlines remains a constant challenge in today's fast-paced software development landscape. Developers often face repetitive tasks, code snippet searches, and the need to adhere to best practices across various programming languages and frameworks. However, AWS CodeWhisperer, an innovative AI-powered coding companion, aims to transform the way developers work. In this blog, we will explore the extensive features, benefits, and setup process of AWS CodeWhisperer, providing detailed insights and examples for technical professionals.At its core, CodeWhisperer leverages machine learning and natural language processing to deliver real-time code suggestions and streamline the development workflow. Seamlessly integrated with popular IDEs such as Visual Studio Code, IntelliJ IDEA, and AWS Cloud9, CodeWhisperer enables developers to remain focused and productive within their preferred coding environment. By eliminating the need to switch between tools and external resources, CodeWhisperer accelerates coding tasks and enhances overall productivity.A standout feature of CodeWhisperer is its ability to generate code from natural language comments. Developers can now write plain English comments describing a specific task, and CodeWhisperer automatically analyses the comment, identifies relevant cloud services and libraries, and generates code snippets directly within the IDE. This not only saves time but also allows developers to concentrate on solving business problems rather than getting entangled in mundane coding tasks.In addition to code generation, CodeWhisperer offers advanced features such as real-time code completion, intelligent refactoring suggestions, and error detection. By analyzing code patterns, industry best practices, and a vast code repository, CodeWhisperer provides contextually relevant and intelligent suggestions. Its versatility extends to multiple programming languages, including Python, Java, JavaScript, TypeScript, C#, Go, Rust, PHP, Ruby, Kotlin, C, C++, Shell scripting, SQL, and Scala, making it a valuable tool for developers across various language stacks.AWS CodeWhisperer addresses the need for developer productivity tools by streamlining the coding process and enhancing efficiency. With its AI-driven capabilities, CodeWhisperer empowers developers to write clean, efficient, and high-quality code. By supporting a wide range of programming languages and integrating with popular IDEs, CodeWhisperer caters to diverse development scenarios and enables developers to unlock their full potential. Embrace the power of AWS CodeWhisperer and experience a new level of productivity and coding efficiency in your development journey.Key Features and Benefits of CodeWhisperer A. Real-time code suggestions and completionCodeWhisperer provides developers with real-time code suggestions and completion, significantly enhancing their coding experience. As developers write code, CodeWhisperer's AI-powered engine analyzes the context and provides intelligent suggestions for function names, variable declarations, method invocations, and more. This feature helps developers write code faster, with fewer errors, and improves overall code quality. By eliminating the need to constantly refer to documentation or search for code examples, CodeWhisperer streamlines the coding process and boosts productivity.B. 
Intelligent code generation from natural language commentsOne of the standout features of CodeWhisperer is its ability to generate code snippets from natural language comments. Developers can simply write plain English comments describing a specific task, and CodeWhisperer automatically understands the intent and generates the corresponding code. This powerful capability saves developers time and effort, as they can focus on articulating their requirements in natural language rather than diving into the details of code implementation. With CodeWhisperer, developers can easily translate their high-level concepts into working code, making the development process more intuitive and efficient.C. Streamlining routine or time-consuming tasksCodeWhisperer excels at automating routine or time-consuming tasks that developers often encounter during the development process. From file manipulation and data processing to API integrations and unit test creation, CodeWhisperer provides ready-to-use code snippets that accelerate these tasks. By leveraging CodeWhisperer's automated code generation capabilities, developers can focus on higher-level problem-solving and innovation, rather than getting caught up in repetitive coding tasks. This streamlining of routine tasks allows developers to work more efficiently and deliver results faster.D. Leveraging AWS APIs and best practicesAs an AWS service, CodeWhisperer is specifically designed to assist developers in leveraging the power of AWS services and best practices. It provides code recommendations tailored to AWS application programming interfaces (APIs), allowing developers to efficiently interact with services such as Amazon EC2, Lambda, and Amazon S3. CodeWhisperer ensures that developers follow AWS best practices by providing code snippets that adhere to security measures, performance optimizations, and scalability considerations. By integrating AWS expertise directly into the coding process, CodeWhisperer empowers developers to build robust and reliable applications on the AWS platform.E. Enhanced security scanning and vulnerability detectionSecurity is a top priority in software development, and CodeWhisperer offers enhanced security scanning and vulnerability detection capabilities. It automatically scans both generated and developer-written code to identify potential security vulnerabilities. By leveraging industry-standard security guidelines and knowledge, CodeWhisperer helps developers identify and remediate security issues early in the development process. This proactive approach to security ensures that code is written with security in mind, reducing the risk of vulnerabilities and strengthening the overall security posture of applications.F. Responsible AI practices to address bias and open-source usageAWS CodeWhisperer is committed to responsible AI practices and addresses potential bias and open-source usage concerns. The AI models behind CodeWhisperer are trained on vast amounts of publicly available code, ensuring accuracy and relevance in code recommendations. However, CodeWhisperer goes beyond accuracy and actively filters out biased or unfair code recommendations, promoting inclusive coding practices. Additionally, it provides reference tracking to identify code recommendations that resemble specific open source training data, allowing developers to make informed decisions and attribute sources appropriately. 
By focusing on responsible AI practices, CodeWhisperer ensures that developers can trust the code suggestions and recommendations it provides.Setting up CodeWhisperer for individual developersIf you are an individual developer who has acquired CodeWhisperer independently and will be using AWS Builder ID for login, follow these steps to access CodeWhisperer from your JetBrains IDE:1.      Ensure that the AWS Toolkit for JetBrains is installed. If it is not already installed, you can install it from the JetBrains plugin marketplace.2.      In your JetBrains IDE, navigate to the edge of the window and click on the AWS Toolkit icon. This will open the AWS Toolkit for the JetBrains panel:3. Within the AWS Toolkit for JetBrains panel, click on the Developer Tools tab. This will open the Developer Tools Explorer.4. In the Developer Tools Explorer, locate the CodeWhisperer section and expand it. Then, select the "Start" option:5. A pop-up window titled "CodeWhisperer: Add a Connection to AWS" will appear. In this window, choose the "Use a personal email to sign up" option to sign in with your AWS Builder ID.6. Once you have entered your personal email associated with your AWS Builder ID, click on the "Connect" button to establish the connection and access CodeWhisperer within your JetBrains IDE:7.      A pop-up titled "Sign in with AWS Builder ID" will appear. Select the "Open and Copy Code" option.8.      A new browser tab will open, displaying the "Authorize request" window. The copied code should already be in your clipboard. Paste the code into the appropriate field and click "Next."9.      Another browser tab will open, directing you to the "Create AWS Builder ID" page. Enter your email address and click "Next." A field for your name will appear. Enter your name and click "Next." AWS will send a confirmation code to the email address you provided.10.   On the email verification screen, enter the code and click "Verify." On the "Choose your password" screen, enter a password, confirm it, and click "Create AWS Builder ID." A new browser tab will open, asking for your permission to allow JetBrains to access your data. Click "Allow."11.   Another browser tab will open, asking if you want to grant access to the AWS Toolkit for JetBrains to access your data. If you agree, click "Allow."12.   Return to your JetBrains IDE to continue the setup process. CodeWhisperer in ActionExample Use Case: Automating Unit Test Generation with CodeWhisperer in Python (Credits: aws-solutions-library-samples):One of the powerful use cases of CodeWhisperer is its ability to automate the generation of unit test code. By leveraging natural language comments, CodeWhisperer can recommend unit test code that aligns with your implementation code. This feature significantly simplifies the process of writing repetitive unit test code and improves overall code coverage.To demonstrate this capability, let's walk through an example using Python in Visual Studio Code:        Begin by opening an empty directory in your Visual Studio Code IDE.        (Optional) In the terminal, create a new Python virtual environment:python3 -m venv .venvsource .venv/bin/activate        Set up your Python environment and ensure that the necessary dependencies are installed.pip install pytest pytest-cov               Create a new file in your preferred Python editor or IDE and name it "calculator.py".       
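Before moving on, it may help to see roughly where this walkthrough is headed. The sketch below is a hypothetical example of the kind of calculator class, and a matching pytest test, that this comment-driven flow tends to produce. CodeWhisperer's suggestions are generated on the fly, so the code you accept will almost certainly differ in its details; the divide-by-zero check and the test names here are illustrative assumptions, not guaranteed output.

```python
# calculator.py -- a hypothetical end state for the walkthrough below.
class Calculator:
    """Simple calculator class of the sort CodeWhisperer tends to suggest."""

    def add(self, a, b):
        return a + b

    def subtract(self, a, b):
        return a - b

    def multiply(self, a, b):
        return a * b

    def divide(self, a, b):
        if b == 0:
            raise ValueError("Cannot divide by zero")
        return a / b

    def square(self, a):
        return a ** 2

    def cube(self, a):
        return a ** 3

    def square_root(self, a):
        return a ** 0.5


# test_calculator.py -- save this part as a separate file and run `pytest --cov`.
import pytest
from calculator import Calculator

def test_add():
    assert Calculator().add(2, 3) == 5

def test_divide_by_zero():
    with pytest.raises(ValueError):
        Calculator().divide(1, 0)
```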
Add the following comment at the beginning of the file to indicate your intention to create a simple calculator class:

# example Python class for a simple calculator

Once you've added the comment, press the "Enter" key to proceed. CodeWhisperer will analyze your comment and start generating code suggestions based on the desired functionality. To accept the suggested code, simply press the "Tab" key in your editor or IDE.

Picture Credit: aws-solutions-library-samples

In case CodeWhisperer does not provide automatic suggestions, you can manually trigger it to generate recommendations using the following keyboard shortcuts:

For Windows/Linux users, press "Alt + C".
For macOS users, press "Option + C".

If you want to view additional suggestions, you can navigate through them by pressing the Right arrow key. To access previous suggestions, press the Left arrow key. If you wish to reject a recommendation, you can either press the ESC key or use the backspace/delete key.

To continue building the calculator class, proceed by pressing the Enter key and accepting CodeWhisperer's suggestions, whether they are provided automatically or triggered manually. CodeWhisperer will propose basic functions for the calculator class, including add(), subtract(), multiply(), and divide(). In addition to these fundamental operations, it can also suggest more advanced functions like square(), cube(), and square_root().

By following these steps, you can leverage CodeWhisperer to enhance your coding workflow and efficiently develop the calculator class, benefiting from a range of pre-generated functions tailored to your specific needs.

Conclusion

AWS CodeWhisperer is a groundbreaking tool that has the potential to revolutionize the way developers work. By harnessing the power of AI, CodeWhisperer provides real-time code suggestions and automates repetitive tasks, enabling developers to focus on solving core business problems. With seamless integration into popular IDEs and support for multiple programming languages, CodeWhisperer offers a comprehensive solution for developers across different domains. By leveraging CodeWhisperer's advanced features, developers can enhance their productivity, reduce errors, and ensure the delivery of high-quality code. As CodeWhisperer continues to evolve, it holds the promise of driving accelerated software development and fostering innovation in the developer community.

Author Bio

Rohan Chikorde is an accomplished AI Architect with a postgraduate degree in Machine Learning and Artificial Intelligence. With almost a decade of experience, he has successfully developed deep learning and machine learning models for various business applications. Rohan's expertise spans multiple domains, and he excels in programming languages such as R and Python, as well as analytics techniques like regression analysis and data mining. In addition to his technical prowess, he is an effective communicator, mentor, and team leader. Rohan's passion lies in machine learning, deep learning, and computer vision.

LinkedIn
Co-Pilot & Microsoft Fabric for Power BI

Sagar Lad
23 Aug 2023
8 min read
IntroductionMicrosoft's data platform solution for the modern era is called Fabric. Microsoft's three primary data analytics tools:  Power BI, Azure Data Factory, and Azure Synapse all covered under Fabric. Advanced artificial intelligence capabilities built on machine learning and natural language processing (NLP) are made available to Power BI customers through Copilot. In this article, we will deep dive into how co-pilot and Microsoft Fabric will transform the way we develop and work with Power BI.Co-Pilot and Fabric with Power BIThe urgent requirement for businesses to turn their data into value is something that both Microsoft Fabric and Copilot aspire to address. Big Data continues to fall short of its initial promises even after years have passed. Every year, businesses generate more data, yet a recent IBM study found that 90% of this data is never successfully exploited for any kind of strategic purpose. So, more data does not mean more value or business insight. Data fragmentation and poor data quality are the key obstacles to releasing the value of data. These problems are what Microsoft hopes to address with Microsoft Fabric, a human-centric, end-to-end analytics product that brings together all of an organization's data and analytics in one place. Copilot has now been integrated into Power BI. Large multi-modal artificial intelligence models based on natural language processing have gained attention since the publication of ChatGPT. Beyond casuistry, Microsoft Fabric and Copilot share a trait in that they each want to transform the Power BI user interface.●       Microsoft Fabric and Power BIMicrosoft Fabric is just Synapse and Power BI together. By combining the benefits of the Power BI SaaS platform with the various Synapse workload types, Microsoft Fabric creates an environment that is more cohesive, integrated, and easier to use for all of the associated profiles. However, Power BI Premium users will get access to new opportunities for data science, data engineering, etc. Power BI will continue to function as it does right now. Data analysts and Power BI developers are not required to begin using Synapse Data Warehouse if they do not want to. Microsoft wants to combine all of its data offerings into one, called Fabric, just like it did with Office 365:Image 1: Microsoft Fabric (Source: Microsoft)Let’s understand in detail how Microsoft Fabric will make life easier for Power BI developers.1.     Data IngestionThere are various methods by which we can connect to data sources in Fabric in order to consume data. For example, utilising Spark notebooks or pipelines, for instance. This may be unknown to the Power BI realm, though.                                                       Image 2: Data Transformation in Power BI Instead, we can ingest the data using dataflows gen2, which will save it on OneLake in the proper format.2.     Ad Hoc Query One or more dataflows successfully published and refreshed will show in the workspace along with a number of other artifacts. The SQL Endpoint artifact is one of them. We can begin creating on-demand SQL queries and saving them as views after you open them. As an alternative, we can also create visual queries which will enable us to familiarise ourselves with the data flow diagram view. Above all, however, is the fact that this interface shares many characteristics with Power BI Data Marts, making it a familiar environment for those familiar with Power BI:   Image 3: Power BI - One Data Lake Hub3.     
Data ModellingWith the introduction of web modelling for Power BI, we can introduce new metrics and start establishing linkages between different tables right away in this interface. The default workspace where the default dataset is kept will automatically contain the data model. The new storage option Direct Lake is advantageous for the datasets created in this manner via the cloud interface. By having just one copy of data in OneLake, this storage style prevents data duplication and unnecessary data refreshes.●       Co-Pilot and Power BI Copilot, a new artificial intelligence framework for Power BI is an offering from Microsoft. CoPilot is Power BI's expensive multimodal artificial intelligence model that is built on natural language processing. It might be compared to the ChatGPT of Power BI. Users will be able to ask inquiries about data, generate graphics, and DAX measures by providing a brief description of what they need thanks to the addition of Copilot to Power BI. For instance, it demonstrates how a brief statement of the user's preferences for the report:"Add a table of the top 500 MNC IT Companies by total sales to my model”.The DAX code required to generate measures and tables is generated automatically by the algorithm.Copilot enables:●       Power BI reports can be created and customized to provide insights.●       Create and improve DAX computations.●       Inquire about your data.●       Publish narrative summaries.●       Ease of Use●       Faster Time to Market Key Features of the Power BI Copilot are as follows: ●       Automated report generationCopilot can create well-designed dashboards, data narratives, and interactive components automatically, saving time and effort compared to manually creating reports.●       Conversational language interfaceWe can use everyday language to express data requests and inquiries, making it simpler to connect with your data and gain insights. ●        Real-time analyticsCopilot's real-time analytics capabilities can be used by Power BI customers to view data and react swiftly to shifts and trends. Let’s look at the step-by-step process on how to use Copilot for Power BI:Step 1: Open Power BI and go to the Copilot tab screen,Step 2:  Type a query pertaining to the data for example to produce a financial report or pick from a list of suggestions that Copilot has automatically prepared for you.Step 3: Copilot sorts through and analyses data to provide the information.Step 4: Copilot compiles a visually stunning report, successfully converting complex data into easily comprehended, practical information.Step 5: Investigate data even more by posing queries, writing summaries to present to stakeholders, and more. There are also a few limitations to using the Copilot features with Power BI: ●       Reliability for the recommendationsAll programming languages that are available in public sources have been taught to Copilot, ensuring the quality of its proposals. The quantity of the training dataset that is accessible for that language, however, may have an impact on the quality of the suggestions. 
APL, Erlang, and other specialized programming languages' suggestions won't be as useful as those for more widely used ones like Python, Java, etc.●       Privacy and security issuesThere are worries that the model, which was trained on publicly accessible code, can unintentionally recommend code fragments that have security flaws or were intended to be private.●       Dependence on comments and namingThe user is responsible for accuracy because the AI can provide suggestions that are more accurate when given specific comments and descriptive variable names.●       Lack of original solutions and inability to creatively solve problems. Unlike a human developer, the tool is unable to do either. It can only make code suggestions based on the training data.●       Inefficient codebaseThe tool is not designed for going through and comprehending big codebases. It works best when recommending code for straightforward tasks.ConclusionThe combination of Microsoft Copilot and Fabric with Power BI has the ability to completely alter the data modelling field. It blends sophisticated generative AI with data to speed up the discovery and sharing of insights by everyone. By enabling both data engineers and non-technical people to examine data using AI models, it is transforming Power BI into a human-centered analytics platform. Author Bio: Sagar Lad is a Cloud Data Solution Architect with a leading organization and has deep expertise in designing and building Enterprise-grade Intelligent Azure Data and Analytics Solutions. He is a published author, content writer, Microsoft Certified Trainer, and C# Corner MVP.Medium , Amazon , LinkedIn   