What is LangChain?
Created in 2022 by Harrison Chase, LangChain is an open-source Python framework for building LLM-powered applications. It provides developers with modular, easy-to-use components for connecting language models with external data sources and services. The project has attracted millions in venture capital funding from the likes of Sequoia Capital and Benchmark, firms that backed Apple, Cisco, Google, WeWork, Dropbox, and many other successful companies.
Building impactful LLM apps involves challenges like prompt engineering, bias mitigation, productionizing, and integrating external data. LangChain reduces this learning curve through its abstractions and composable structure, simplifying the development of sophisticated LLM applications with reusable components and pre-assembled chains. Its modular architecture abstracts access to LLMs and external services into a unified interface. Beyond basic LLM API usage, LangChain supports advanced interactions like conversational context and persistence through agents and memory, which makes it straightforward to build chatbots, gather external data, and more.
In particular, LangChain’s support for chains, agents, tools, and memory allows developers to build applications that can interact with their environment in a more sophisticated way and store and reuse information over time. Its modular design makes it easy to build complex applications that can be adapted to a variety of domains. Support for action plans and strategies improves the performance and robustness of applications. The support for memory and access to external information reduces hallucinations, thus enhancing reliability. As a result, developers can combine these building blocks to carry out complex workflows.
The key benefits LangChain offers developers are:
- Modular architecture for flexible and adaptable LLM integrations.
- Chaining together multiple services beyond just LLMs.
- Goal-driven agent interactions instead of isolated calls.
- Memory and persistence for statefulness across executions.
- Open-source access and community support.
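To make this modularity concrete, here is a minimal sketch of a chain expressed with the LangChain Expression Language (LCEL), which pipes a prompt template into a model and an output parser. It assumes the langchain-openai package is installed and an OpenAI API key is set in the environment; the prompt and model choice are illustrative, and any supported chat model could be swapped in.

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI  # assumes OPENAI_API_KEY is set

# Compose a prompt template, a chat model, and an output parser into one chain
prompt = ChatPromptTemplate.from_template(
    "Summarize the following topic in one sentence: {topic}"
)
llm = ChatOpenAI(model="gpt-3.5-turbo")
chain = prompt | llm | StrOutputParser()

print(chain.invoke({"topic": "the LangChain framework"}))
```

Each piece can be replaced independently, which is exactly the kind of flexibility the list above describes.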
As mentioned, LangChain is open source and written in Python, although companion projects exist in other languages: JavaScript, or, more precisely, TypeScript (LangChain.js), as well as Go, Rust, and Ruby (the fledgling Langchain.rb project, which comes with a Ruby interpreter for code execution). In this book, we focus on the Python flavor of the framework, since it is the language most data scientists and developers are familiar with, and it is the most mature of the implementations.
While resources like documentation, courses, and communities help accelerate the learning process, developing expertise in applying LLMs takes dedicated time and effort, and for many developers, the learning curve can be a blocker to leveraging LLMs impactfully. Fortunately, the community around LangChain is active: there are ongoing discussions on Discord chat servers, multiple blogs, and regular meetups in various cities, including San Francisco and London. There's even a chatbot, ChatLangChain, that can answer questions about the LangChain documentation. Built using LangChain and FastAPI, it's hosted at https://chat.langchain.com/.
LangChain comes with many extensions and a larger ecosystem that is developing around it. A few extensions are being developed in tandem with the LangChain framework:
- LangSmith is a platform that complements LangChain by providing robust debugging, testing, and monitoring capabilities for LLM applications. For example, developers can quickly debug new chains by viewing detailed execution traces. Alternative prompts and LLMs can be evaluated against datasets to ensure quality and consistency. Usage analytics empower data-driven decisions around optimizations.
- LangChain templates provide a repository of reference applications for builders aiming to harness the capabilities of LLMs. Alongside the templates themselves, a central hub offers extensive documentation and a community platform for discussion, supporting the development of production-ready applications. Introductions, prompt templates, and insights into conversational memory also make the templates a useful learning tool for those new to LangChain.
- LangServe helps developers deploy LangChain runnables and chains as a REST API (see the sketch after this list). As of April 2024, a hosted version has been announced. LangServe streamlines the deployment and management of LLM applications, featuring automatic schema inference (through Pydantic) and a set of efficient API endpoints for scalability, along with an interactive playground for developers.
- LangGraph facilitates the development of complex LLM applications with cyclic data flows and stateful multi-actor scenarios. Its design, emphasizing simplicity and scalability, allows for the creation of flexible, enhanced runtime environments that support intricate data and control flows, ensuring compatibility and integration with the LangChain ecosystem.
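As a taste of LangServe, here is a minimal sketch that exposes a chain like the one shown earlier as a REST API. It assumes the langserve, fastapi, and uvicorn packages are installed; the /summarize path is an arbitrary name chosen for illustration.

```python
from fastapi import FastAPI
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langserve import add_routes

app = FastAPI(title="Example LangServe app")

chain = (
    ChatPromptTemplate.from_template("Summarize in one sentence: {topic}")
    | ChatOpenAI()
    | StrOutputParser()
)

# Registers invoke, batch, and stream endpoints under /summarize,
# with request/response schemas inferred via Pydantic
add_routes(app, chain, path="/summarize")

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=8000)
```

Once running, the server also exposes the interactive playground mentioned above, where you can try the chain from a browser.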
LangChain already has an immense number of third-party integrations, with many new ones added every week. There is a wide array of tools providing integration with TruLens, Twitter, Typesense, Unstructured, Upstash Redis, Apify, Wolfram Alpha, Google Search, OpenWeatherMap, and Wikipedia, to name just a few. Together, these integrations form a comprehensive ecosystem for developing, managing, and visualizing LLM applications, each contributing capabilities that extend the functionality of LangChain.
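To show what such an integration looks like in practice, here is a minimal sketch using the Wikipedia tool from the langchain_community package; it assumes the langchain-community and wikipedia packages are installed. Other tools, such as Wolfram Alpha or OpenWeatherMap, follow the same wrapper-plus-tool pattern but typically require an API key.

```python
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

# Wrap the Wikipedia API as a LangChain tool that chains and agents can call
wikipedia = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())
print(wikipedia.run("LangChain"))
```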
Many third-party applications have been built on top of LangChain or around it. For example, LangFlow and Flowise introduce a more interactive dimension to LLM development, with UIs that allow for the visual assembly of LangChain components into executable workflows. This drag-and-drop simplicity enables quick prototyping and experimentation, lowering the barrier to entry for complex pipeline creation, as illustrated in the following screenshot of Flowise (source: https://github.com/FlowiseAI/Flowise):
Figure 2.6: Flowise UI with an agent that uses an LLM, a calculator, and a search tool
You can see an agent (discussed later in this chapter) that is connected to a search interface (SerpAPI), an LLM, and a calculator. LangChain and LangFlow can be deployed locally, for example, using the Chainlit library, or on different platforms, including Google Cloud. The langchain-serve library helps deploy both LangChain and LangFlow on the Jina AI cloud as LLM-apps-as-a-service with a single command.
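The same wiring shown in the Flowise screenshot can be expressed directly in code. Below is a minimal sketch of an agent with a search tool and a calculator, using LangChain's classic initialize_agent interface; it assumes the langchain, langchain-openai, and google-search-results packages are installed, and that OPENAI_API_KEY and SERPAPI_API_KEY are set in the environment.

```python
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(temperature=0)

# "serpapi" provides web search; "llm-math" provides an LLM-backed calculator
tools = load_tools(["serpapi", "llm-math"], llm=llm)

agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)
agent.run("Who is the current UK prime minister, and what is 2 to the power of 12?")
```

At each step, the agent decides whether to search, calculate, or answer, which is the goal-driven behavior listed among LangChain's key benefits earlier.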
While still relatively new, LangChain unlocks more advanced LLM applications through its combination of components like memory, chaining, and agents, and it aims to simplify what can otherwise be complex LLM application development. With that overview in place, let's now shift focus to the workings of LangChain and its components.