LangChain: Pros and Cons
Over the past 90 days, I've developed and run a personal (private) application powered by large language models using LangChain. LangChain has emerged as the de facto standard for such apps. I'm writing this post to outline my impressions of the library's design and direction.
It's currently unpleasant to use.
If the major version bump from v0.x.x to v1.0.0 involves massive pruning, it will remain the standard. If the dev team seeks to please all existing contributors, it'll be quickly supplanted by something more concise and directed.
LangChain has a big and active community. In the same way that React.js is a standard in spite of it not being the best choice, LangChain will likely be the standard we land on. It's the rational choice if you select based on the principle of survival in numbers.
Retrieval Augmented Generation
LangChain excels at Retrieval Augmented Generation¹. It comes with impressive out-of-the-box support for document splitters, which are necessary because the documents we retrieve almost never fit within our limited context windows. The MarkdownHeaderTextSplitter stands out in particular.
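To make the idea concrete, here is a toy sketch of what a header-based splitter does. This is not LangChain's MarkdownHeaderTextSplitter (the function name and output shape are my own invention); it just illustrates the core move: cut a document at heading boundaries and carry the heading along as metadata so each chunk stays self-describing.

```python
# Toy illustration of header-based markdown splitting.
# NOT LangChain's implementation; split_on_headers is a hypothetical name.

def split_on_headers(markdown: str) -> list[dict]:
    chunks = []
    current = {"header": None, "lines": []}
    for line in markdown.splitlines():
        if line.startswith("#"):
            # A new heading closes the previous chunk, if it had content.
            if current["lines"]:
                chunks.append(current)
            current = {"header": line.lstrip("#").strip(), "lines": []}
        elif line.strip():
            current["lines"].append(line)
    if current["lines"]:
        chunks.append(current)
    # Each chunk keeps its heading as metadata alongside the body text.
    return [
        {"header": c["header"], "content": "\n".join(c["lines"])}
        for c in chunks
    ]

doc = "# Intro\nSome text.\n## Details\nMore text."
for chunk in split_on_headers(doc):
    print(chunk["header"], "->", chunk["content"])
```

Retrieval then operates over these chunks rather than whole documents, which is what keeps each retrieved piece inside the context window.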
Agents

LangChain excels at Agent² execution. I had hand-implemented my own LLM agent, and I enthusiastically abandoned all of that work for LangChain's ReAct agent. The implementation is brittle to extend, but it works well. My moments of greatest amazement at the capacity of LLMs have come during long sequences of interactive chain-of-thought reasoning. I anticipate Agents being the fixed wing on which we become more comfortable referring to these systems as AGI.
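The ReAct pattern itself is simple to sketch, even if production implementations are not. In the loop below (all names hypothetical; this is not LangChain's agent executor), the model emits a Thought and an Action, we run the named tool, append an Observation, and repeat until it produces a Final Answer. The "LLM" is a scripted stand-in so the control flow is visible.

```python
import re

# Toy tool registry. eval() is fine for a scripted demo, never for real input.
TOOLS = {"calculator": lambda expr: str(eval(expr))}

def react_loop(question, llm, tools, max_steps=5):
    """Minimal ReAct sketch: alternate model turns with tool Observations."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += "\n" + reply
        if "Final Answer:" in reply:
            return reply.split("Final Answer:")[-1].strip()
        # Look for an action of the form: Action: tool_name[argument]
        match = re.search(r"Action: (\w+)\[(.+)\]", reply)
        if match:
            tool, arg = match.groups()
            transcript += f"\nObservation: {tools[tool](arg)}"
    raise RuntimeError("agent did not finish")

# A scripted "LLM" standing in for a real model.
replies = iter([
    "Thought: I need to multiply.\nAction: calculator[6 * 7]",
    "Thought: I have the product.\nFinal Answer: 42",
])
print(react_loop("What is 6 times 7?", lambda _: next(replies), TOOLS))  # -> 42
```

The long interactive chains mentioned above are this same loop running for many steps, with a real model deciding which tool to reach for next.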
Chat vs. Completion
It's not LangChain's fault, but they're at the mercy of the industry's switch from Completion APIs to ChatCompletion APIs. LangChain's first release was January 26, 2023. On March 1, 2023, OpenAI introduced the ChatGPT API, which abstracts away mere token completion under a Human:, AI:, Human:, AI: conversation chain, much like a screenplay. If you want to simulate, say, a conversation among more than just a Human and an AI character, you'll need to overlay your conversation within the Human-AI screenplay.
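One workaround for that overlay, sketched under the assumption of OpenAI-style `{"role", "content"}` messages (the function name is mine, not a library API), is to tag each speaker inside the message content and fold every non-AI character into the "user" role:

```python
# Sketch: squeeze a multi-character scene into the two-role chat schema.
# to_chat_messages is a hypothetical helper, not a LangChain/OpenAI API.

def to_chat_messages(scene, ai_name="AI"):
    messages = []
    for speaker, line in scene:
        # Only the AI character maps to "assistant"; everyone else
        # shares the "user" role, with the speaker named in the content.
        role = "assistant" if speaker == ai_name else "user"
        messages.append({"role": role, "content": f"{speaker}: {line}"})
    return messages

scene = [
    ("Alice", "Shall we ask the model?"),
    ("Bob", "Go ahead."),
    ("AI", "How can I help you two?"),
]
print(to_chat_messages(scene))
```

The model never sees Alice and Bob as distinct roles, only as labels inside the user turns, which is exactly the awkwardness the screenplay abstraction imposes.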
There are multiple ways to do just about everything, which leads to inconsistent abstractions even within small codebases. The inconsistency is even more glaring across LangChain's documentation. For me, it's most painful in the several competing approaches to Retrievers and in the shoehorning of Agents, tools, and chains into the same duck type.
The documentation, more often than not, falls victim to its own competing abstractions. If you pursue the ReAct approach to agency (which you absolutely should), you'll likely run across Agent Types: ReAct. Little will you know that you've stumbled into a territory dispute with the Conversational ReAct Description agent: the latter supports memory, the former does not.
If you're like me, you assumed PromptTemplates would be some ORM-esque defense against injection, or at least a mitigation of complexity. After all this time, I still don't understand their advantage. They're a class hierarchy permeating chat messages, system messages, and simple completions. Here's its epicenter on GitHub, PromptTemplate, as of this writing: 150+ lines of wrapper around... Python's str.format().
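To make the complaint concrete, the core behavior reduces to roughly the following. This is a caricature I wrote for illustration, not the actual LangChain class: declared input variables plus str.format(), with a check that the caller supplied everything.

```python
# Caricature of what a PromptTemplate boils down to.
# TinyPromptTemplate is a hypothetical name, not LangChain's class.

class TinyPromptTemplate:
    def __init__(self, template: str, input_variables: list[str]):
        self.template = template
        self.input_variables = input_variables

    def format(self, **kwargs) -> str:
        # The only value added over str.format(): a missing-variable check.
        missing = set(self.input_variables) - set(kwargs)
        if missing:
            raise KeyError(f"missing variables: {missing}")
        return self.template.format(**kwargs)

prompt = TinyPromptTemplate(
    "Answer as {persona}: {question}",
    input_variables=["persona", "question"],
)
print(prompt.format(persona="a pirate", question="Why is the sky blue?"))
```

Whether that check and the surrounding class hierarchy earn their 150+ lines is the open question.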
LangChain, with its litany of competing abstractions, is still a healthy open-source codebase. I intend to continue using it for my own projects, but I'll strive to use it modularly. The power of RAG and Agents outweighs the annoyance of inconsistency in a beta codebase. My advice to anyone taking it on: use it in pieces and maintain your own application architecture. That will better equip you to migrate whatever changes when the software goes stable.
¹ Retrieval Augmented Generation (RAG) is all but mandatory in LLM applications. Because Large Language Models optimize for perceived correctness, any use of them should be reinforced with cited retrieval wherever possible.
² An agent is a simulated entity in a chat or text-completion chain. That simulated entity has access to tools and may "invoke" those tools by expressing so in text. When we observe that expression, we invoke the tool on the agent's behalf.
Copyright © Kevin Katz 2023