LLMs – reasoning engines for intelligent apps
LLMs are the key technology of intelligent applications, unlocking whole new classes of AI-powered systems. These models are trained on vast amounts of text data to understand language, generate human-like text, answer questions, and engage in dialogue.
LLMs improve continuously as new models are released, featuring billions or even trillions of parameters and enhanced reasoning, memory, and multi-modal capabilities.
Use cases for LLM reasoning engines
LLMs have emerged as a powerful general-purpose technology for AI systems, analogous to the central processing unit (CPU) in traditional computing. Much as a CPU is a general-purpose computational engine that can be programmed for many tasks, an LLM plays a similar role for language-based reasoning and generation. This general-purpose nature lets developers apply LLMs to a wide range of reasoning tasks.
A crop of techniques to leverage the diverse abilities of LLMs has emerged, such as the following:
- Prompt engineering: Using carefully crafted prompts, developers can steer LLMs to perform a wide range of language tasks. A key advantage of prompt engineering is its iterative nature: since prompts are fundamentally just text, it’s easy to experiment rapidly with different prompts and observe the results. Advanced techniques, such as chain-of-thought prompting (which encourages the model to break its reasoning down into a series of steps) and multi-shot prompting (which provides the model with example input/output pairs), can further improve the quality and reliability of LLM-generated text. A minimal prompting sketch follows this list.
- Fine-tuning: Fine-tuning involves starting with a pre-trained general-purpose model and further training it on a smaller dataset relevant to the target task. This can yield better results than prompt engineering alone, but it comes with certain caveats, such as being more expensive and time-consuming. You should only fine-tune after exhausting what you can achieve through prompt engineering.
- Retrieval augmentation: Retrieval augmentation connects LLMs to external knowledge, allowing them to draw on up-to-date, domain-specific information. In this approach, relevant information is retrieved from a knowledge base and injected into the prompt, enabling the LLM to generate contextually grounded outputs. Retrieval augmentation mitigates the limitations of the static pre-training of LLMs, keeping their knowledge current and reducing the likelihood of the model hallucinating incorrect information. A retrieval sketch also follows this list.
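To make prompt engineering concrete, here is a minimal chain-of-thought sketch in Python. It assumes the OpenAI Python SDK (v1.x) and an OPENAI_API_KEY environment variable; the model name is a placeholder, and any chat-completion API could be substituted.

```python
# Minimal chain-of-thought prompting sketch (assumes the OpenAI Python SDK v1.x).
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

question = (
    "A warehouse ships 240 orders per day. If 15% of orders are returned, "
    "how many orders are kept by customers over a 5-day week?"
)

# Asking the model to reason step by step before answering tends to improve
# reliability on multi-step problems.
prompt = f"{question}\n\nLet's think step by step, then give the final answer on its own line."

response = client.chat.completions.create(
    model="gpt-4",  # placeholder; substitute whichever chat model you have access to
    messages=[{"role": "user", "content": prompt}],
    temperature=0,  # low temperature for more deterministic output
)
print(response.choices[0].message.content)
```

Prepending a few example input/output pairs to the prompt turns this into a multi-shot prompt; the surrounding code stays the same.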
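Similarly, the following sketch illustrates the shape of retrieval augmentation. The in-memory knowledge base and keyword-overlap retrieval are deliberately naive stand-ins for the embeddings and vector store a production system would use, and the model name is again a placeholder.

```python
# Minimal retrieval-augmentation sketch: retrieve the most relevant passage,
# then inject it into the prompt as context for the model.
import re

from openai import OpenAI

client = OpenAI()

knowledge_base = [
    "Our premium plan includes 24/7 phone support and a 99.9% uptime SLA.",
    "Refunds are available within 30 days of purchase for annual subscriptions.",
    "The free tier is limited to 3 projects and community support only.",
]

def words(text: str) -> set[str]:
    """Lowercased word set, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by word overlap with the query (a stand-in for vector search)."""
    query_words = words(query)
    ranked = sorted(docs, key=lambda d: len(query_words & words(d)), reverse=True)
    return ranked[:k]

question = "Are refunds available for annual subscriptions?"
context = "\n".join(retrieve(question, knowledge_base))

prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\n"
    f"Question: {question}"
)

response = client.chat.completions.create(
    model="gpt-4",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
)
print(response.choices[0].message.content)
```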
With these techniques, you can use LLMs for a diverse array of tasks. The next section explores current use cases for LLMs.
Diverse capabilities of LLMs
While fundamentally just language models, LLMs have shown surprising emergent capabilities (https://arxiv.org/pdf/2307.06435). As of writing in spring 2024, state-of-the-art language models are capable of performing tasks in the following categories:
- Text generation and completion: Given a prompt, LLMs can generate coherent continuations, making them useful for tasks such as content creation, text summarization, and code completion.
- Open-ended dialogue and chat: LLMs can engage in back-and-forth conversations, maintaining context and handling open-ended user queries and follow-up questions. This capability is foundational for chatbots, virtual assistants, tutoring systems, and similar applications.
- Question answering: LLMs can provide direct answers to user questions, perform research, and synthesize information to address queries.
- Classification and sentiment analysis: LLMs can classify text into predefined categories and assess sentiment, emotion, and opinion. This enables applications such as content moderation and customer feedback analysis.
- Data transformation and extraction: LLMs can map unstructured text into structured formats and extract key information, such as named entities, relationships, and events. This makes LLMs valuable for tasks such as data mining, knowledge graph construction, and robotic process automation (RPA); see the extraction sketch after this list.
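As an illustration of the last category, the sketch below prompts a model to return named entities as JSON and parses the result. It again assumes the OpenAI Python SDK (v1.x); the example sentence and the field names are purely illustrative.

```python
# Minimal structured-extraction sketch: ask for JSON, then parse it.
import json

from openai import OpenAI

client = OpenAI()

text = (
    "On March 3rd, Acme Corp announced that CEO Dana Lee will open a new "
    "distribution center in Austin, Texas, creating roughly 200 jobs."
)

prompt = (
    "Extract the people, organizations, locations, and dates mentioned in the text. "
    "Return only a JSON object with the keys 'people', 'organizations', 'locations', "
    f"and 'dates', each an array of strings.\n\nText: {text}"
)

response = client.chat.completions.create(
    model="gpt-4",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
)

# In practice, validate the output (or strip stray markdown fences) before parsing.
entities = json.loads(response.choices[0].message.content)
print(entities["people"], entities["organizations"])
```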
As LLMs continue to grow in scale and sophistication, new capabilities are constantly emerging, often in surprising ways that were not directly intended by the original training objective.
For example, the ability of GPT-3 to generate functioning code was an unexpected discovery. With advancements in the field of LLMs, we can expect to see more impressive and versatile capabilities emerge, further expanding the potential of intelligent applications.
Multi-modal language models
Multi-modal language models hold particular promise for expanding the capabilities of language models. Multi-modal models can process and generate images, speech, and video in addition to text, and have become an important component of intelligent applications.
Examples of new application categories made possible with multi-modal models include the following:
- Creating content based on multiple input types, such as a chatbot where users can provide both images and text as inputs (a minimal sketch follows this list).
- Advanced data analysis, such as a medical diagnosis tool that analyzes X-rays along with medical records.
- Real-time translation, taking audio or images in one language and translating them into another.
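As a concrete example of the first category, the following sketch sends an image URL alongside a text question to a vision-capable chat model, using the OpenAI Python SDK's multi-part message format. The model name and image URL are placeholders.

```python
# Minimal multi-modal sketch: combine an image and a text question in one request.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; any vision-capable chat model could be used
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What product defects, if any, are visible in this photo?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/returned-item.jpg"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```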
Such examples highlight how multi-modal models expand the range of use cases for language models.
A paradigm shift in AI development
The rise of LLMs represents a paradigm shift in the development of AI-powered applications. Previously, many reasoning tasks required specially trained models, which were time-intensive and computationally expensive to create. Developing these models often necessitated dedicated machine learning (ML) engineering teams with specialized expertise.
In contrast, the general-purpose nature of LLMs allows most software engineers to leverage their capabilities through simple API calls and prompt engineering. While there is still an art and science to optimizing LLM-based workflows for production deployment, the process is significantly faster and more accessible than with traditional ML approaches.
This shift has dramatically reduced the total cost of ownership and development timelines for AI-powered applications. NLP tasks that previously could take months of work by a sophisticated ML engineering team can now be achieved by a single software engineer with access to an LLM API and some prompt engineering skills.
Moreover, LLMs have unlocked entirely new classes of applications that were previously not possible or practical to develop. The ability of LLMs to understand and generate human-like text, engage in open-ended dialogue, and perform complex reasoning tasks has opened up a wide range of possibilities for intelligent applications across industries.
You’ll learn more about LLMs in Chapter 3, Large Language Models, which discusses their history and how they operate.