Defining intelligent applications
Traditional applications typically consist of a client-side user interface, a server-side backend, and a database for data storage and retrieval. They perform tasks following a strict set of instructions. Intelligent applications require a client, server, and database as well, but they augment the traditional stack with AI components.
Intelligent applications stand out by understanding complex, unstructured data, which enables natural, adaptive interactions and decision-making. They can engage in open-ended interactions, generate novel content, and make autonomous decisions.
Examples of intelligent applications include the following:
- Chatbots that provide natural language responses based on external data using retrieval-augmented generation (RAG). For example, Perplexity.ai (https://www.perplexity.ai/) is an AI-powered search engine and chatbot that provides users with AI-generated answers to their queries based on sources retrieved from the web.
- Content generators that let you use natural language prompts to create media such as images, video, and audio. There are a variety of intelligent content generators focusing on different media types, such as Suno (https://suno.com/) for text-to-song, Midjourney (https://www.midjourney.com/home) for text-to-image, and Runway (https://runwayml.com/) for text-to-video.
- Recommendation systems that use customer data to provide personalized suggestions based on each customer’s preferences and history. These suggestions can be augmented with natural language to further personalize the customer experience. An example of this is Spotify’s AI DJ (https://support.spotify.com/us/article/dj/), which creates a personalized radio station, including LLM-generated DJ interludes, based on your listening history.
These examples offer an early glimpse of the new categories of intelligent applications that developers have only started to build. In the next section, you will learn more about the core components of intelligent applications.
The building blocks of intelligent applications
At the heart of intelligent applications are two key building blocks:
- The reasoning engine: The reasoning engine is the brain of an intelligent application, responsible for understanding user input, generating appropriate responses, and making decisions based on available information. The reasoning engine is typically powered by large language models (LLMs)—AI models that perform text completion. LLMs can understand user intent, generate human-like responses, and perform complex cognitive tasks.
- Semantic memory: Semantic memory refers to the application’s ability to store and retrieve information in a way that preserves its meaning and relationships, enabling the reasoning engine to access relevant context as needed.
Semantic memory consists of two core components:
- AI vector embedding model: AI vector embedding models represent the semantic meaning of unstructured data, such as text or images, as large arrays of numbers called vectors.
- Vector database: Vector databases efficiently store and retrieve vectors to support semantic search and context retrieval.
The reasoning engine can store information in the semantic memory and retrieve relevant information from it, using unstructured data to inform its outputs.
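To make the interplay between the two building blocks more concrete, here is a minimal sketch in Python. It is an illustration under loud assumptions, not how a production system works: the `embed` function is a toy bag-of-words stand-in for a real embedding model (it captures word overlap, not meaning), `InMemoryVectorStore` is a hypothetical stand-in for a vector database, and `build_prompt` only assembles the text that a reasoning engine (an LLM) would receive. All names are invented for this example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector.

    A real embedding model returns a dense array of floats that captures
    meaning; this version only counts lowercase word occurrences.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self) -> None:
        self._items: list[tuple[Counter, str]] = []

    def add(self, text: str) -> None:
        # Store the vector alongside the original text.
        self._items.append((embed(text), text))

    def search(self, query: str, k: int = 1) -> list[str]:
        # Rank stored texts by similarity to the query vector.
        query_vector = embed(query)
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(query_vector, item[0]),
            reverse=True,
        )
        return [text for _, text in ranked[:k]]


def build_prompt(question: str, store: InMemoryVectorStore, k: int = 1) -> str:
    """Assemble the retrieved context and the question into an LLM prompt."""
    context = "\n".join(store.search(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


memory = InMemoryVectorStore()
memory.add("The return policy allows refunds within 30 days.")
memory.add("Our office is closed on public holidays.")
memory.add("Shipping takes five business days.")

print(build_prompt("What is the return policy for refunds?", memory))
```

In a real intelligent application, the prompt produced by `build_prompt` would be sent to an LLM, and `embed` and `InMemoryVectorStore` would be replaced by a trained embedding model and a vector database. The shape of the flow stays the same: embed, store, retrieve by similarity, then pass the retrieved context to the reasoning engine.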
The LLMs and embedding models that power intelligent applications have different hardware requirements than traditional applications, especially at scale. Intelligent applications require specialized model hosting infrastructure that can handle the unique hardware and scalability requirements of AI workloads. Intelligent applications also incorporate continuous learning, safety monitoring, and human feedback to ensure quality and integrity.
LLMs are a vital part of intelligent applications. The next section provides a deeper understanding of their role.