Notes on the Current State of LLM Frameworks

This post tries to shed some light on the rapidly changing landscape of LLM frameworks, in particular LangChain (LC) and Llama-index (LI).

Library vs. Framework

It’s tricky to draw a clear boundary between a package/library and a framework, but for the sake of discussion, let’s look at some well-known examples:

  • Packages: NumPy falls into this category. It provides functionality that can be adapted to various linear algebra problems without dictating a particular structure or methodology.
  • Mature Frameworks: Scikit-learn, PyTorch, and HuggingFace Transformers. They provide high-level abstractions but take customization and non-opinionated design seriously. As a result, they have become the de facto ways of doing ML/AI.
  • Immature Frameworks: LangChain (LC) and Llama-index (LI) are examples in this category. Unlike packages, they offer higher-level abstractions and impose a specific way of building software. (Within this category, I’ve found that LI provides better abstractions than LC.)

I’ve been using LC and LI for seven months now, and their evolution exemplifies a broader trend in the ML/AI world. While they’ve been great for quickly creating a proof of concept, their higher-level abstractions miss many details and nuances of the underlying components, details that are hard to capture in a clean abstraction.

Take VectorDBs/VectorStores: their number has grown quickly within a few months, and the quality of each is up to the individual to evaluate (I have my own suggestions, perhaps for later). For a long time, neither framework offered a simple, clear CRUD API. Moreover, each VectorDB has its own nuances in handling embeddings, from storage to search. Some support async, and some don’t. Some support gRPC, and some don’t. Some support hybrid search, and some don’t. To use them at scale in production, we need to know these nuances to squeeze out every bit of performance. What LC and LI did instead was provide a base VectorStore class, implement methods like add_documents (and async counterparts like aadd_documents), and wrap each VectorDB without exposing its individual nuances.
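To make the pattern concrete, here is a minimal sketch of this kind of base-class wrapping. The class and method names mirror the ones mentioned above, but this is an illustrative toy, not LC's or LI's actual implementation; note how the default async method simply wraps the sync one, hiding whether the backend is natively async:

```python
import asyncio
from abc import ABC, abstractmethod


class VectorStore(ABC):
    """Sketch of a base class that hides per-VectorDB nuances."""

    @abstractmethod
    def add_documents(self, documents: list[str]) -> list[str]:
        """Embed and store documents; return their IDs."""

    async def aadd_documents(self, documents: list[str]) -> list[str]:
        # Default async wrapper: run the sync method in a thread.
        # A natively-async backend could override this, but callers
        # never see the difference through the abstraction.
        return await asyncio.to_thread(self.add_documents, documents)


class InMemoryStore(VectorStore):
    """Trivial backend used only to exercise the interface."""

    def __init__(self) -> None:
        self._docs: dict[str, str] = {}

    def add_documents(self, documents: list[str]) -> list[str]:
        ids = [str(len(self._docs) + i) for i in range(len(documents))]
        for doc_id, doc in zip(ids, documents):
            self._docs[doc_id] = doc
        return ids
```

The convenience is real, but so is the cost: backend-specific capabilities (hybrid search, gRPC options) have no place in this uniform interface.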

In LC especially, almost every abstracted-away component is opinionated and hard to customize. I can add async, streaming, callback hell, agents, and memory to the list. Not everything needs to be a subclass of pydantic’s BaseModel! I think “true chaining” is not there yet in LC (though I should admit that overloading __or__ is an interesting new approach, similar to a Unix pipe), and the many ways of calling a chain/agent (run, __call__, apply, now with added async variants) are unhealthy. These criticisms of LC are mostly to the point, but I also think that, given the rapidly changing landscape, both LC and LI provide value, especially as generic off-the-shelf solutions.
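The __or__ trick mentioned above can be sketched in a few lines. The `Runnable` class and `invoke` method here are my own illustrative names, not a faithful reproduction of LC's internals; the point is only how operator overloading yields left-to-right, Unix-pipe-style composition:

```python
from typing import Callable


class Runnable:
    """Toy wrapper that composes callables with the | operator."""

    def __init__(self, fn: Callable):
        self.fn = fn

    def invoke(self, x):
        return self.fn(x)

    def __or__(self, other: "Runnable") -> "Runnable":
        # self | other runs self first, then feeds its output to other,
        # like a shell pipeline.
        return Runnable(lambda x: other.invoke(self.invoke(x)))


chain = Runnable(str.strip) | Runnable(str.upper) | Runnable(lambda s: s + "!")
```

Calling `chain.invoke("  hi ")` strips, uppercases, and appends in order. One appeal of this style is that it leaves a single entry point (`invoke`) instead of the run/__call__/apply proliferation criticized above.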


I believe that both frameworks still have a lot of room to improve. They should take a non-opinionated approach more seriously and expose a better lower-level API in order to become the de facto frameworks for either building LLM agents or connecting data sources to LLMs.



