
Field notes • Production AI

The High Cost of Convenience: Why I Choose Postgres over MongoDB for Production LangChain Checkpointing

February 28, 2026


Key takeaways:

- For production LangChain/LangGraph checkpointing, Postgres (PostgresSaver) beats MongoDB Atlas on cost, control, and long-term scalability.
- Write-heavy checkpointing makes the database a major cost driver.
- pgvector lets you colocate vector search with checkpointing: one stack, one backup.
- "Schema-less" NoSQL doesn't meaningfully help when the framework defines the state schema.

Everyone can stand up a LangChain "hello world" with a local SQLite checkpointer. The real work starts when you move to production: multi-turn agents, persistent state, and the need for a database that can handle write-heavy checkpointing at scale. At that point you're faced with two primary paths in the LangChain ecosystem—MongoDBSaver or PostgresSaver. While MongoDB (and Atlas) offers a white-glove experience and faster setup for teams with bigger budgets, I consistently choose Postgres for teams that care about cost-efficiency, architectural control, and long-term scalability. Here's why.

The Abstraction Illusion

LangChain—and LangGraph in particular—abstracts the database. Whether you're backing your checkpointer with ACID (Postgres) or BASE (Mongo), the library handles the heavy lifting: state serialization, checkpoint tables or collections, and the plumbing your agent needs to resume and branch. From the application's perspective, you swap one saver for another and move on.
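That swap can be sketched as a small factory. This is illustrative only: it assumes the langgraph-checkpoint-postgres and langgraph-checkpoint-mongodb packages, and the exact import paths and signatures vary by version, so check your installed release.

```python
# Illustrative sketch: application code never cares which saver it gets.
# Assumes langgraph-checkpoint-postgres / langgraph-checkpoint-mongodb;
# in recent versions, from_conn_string returns a context manager.

def checkpointer_for(backend: str, uri: str):
    """Return a (context-managed) LangGraph checkpointer for `backend`."""
    if backend == "postgres":
        from langgraph.checkpoint.postgres import PostgresSaver
        return PostgresSaver.from_conn_string(uri)
    if backend == "mongodb":
        from langgraph.checkpoint.mongodb import MongoDBSaver
        return MongoDBSaver.from_conn_string(uri)
    raise ValueError(f"unsupported backend: {backend!r}")

# The compiled graph is identical either way:
# with checkpointer_for("postgres", DSN) as saver:
#     graph = builder.compile(checkpointer=saver)
#     graph.invoke(inputs, config={"configurable": {"thread_id": "t-1"}})
```

The only line that changes when you migrate is the backend string and the connection URI; everything downstream of `compile()` stays put.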

That abstraction can obscure an important point: the "schema-less" advantage of NoSQL is largely irrelevant here. Your AI state has an implicit schema. LangChain defines it; the framework manages the tables or collections either way. So the flexibility that makes Mongo appealing for ad-hoc document storage doesn't translate into a meaningful benefit for this use case. Once you're in the LangGraph world, you're not choosing "flexibility vs. rigidity"—you're choosing an operational and cost profile.
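To make the "implicit schema" point concrete, here is a simplified, hypothetical state definition of the kind LangGraph graphs are built around. Real states are richer (message objects, channel metadata), but whichever saver you plug in persists snapshots of exactly this shape:

```python
from typing import TypedDict

# Hypothetical, simplified agent state. The framework, not the
# database, defines this schema—"schema-less" storage buys nothing.
class AgentState(TypedDict):
    messages: list[str]  # conversation history (simplified to strings)
    step: int            # position in the graph's execution

snapshot = AgentState(messages=["hi"], step=1)
```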

The "Hand-Holding" Trap

The MongoDB story is seductive. Atlas gives you sales engineers, polished docs, and a clear path to add Atlas Vector Search later. It's an easy sell to leadership: "It just works." For teams that want to minimize operational ownership and have budget to spare, that's a valid trade.

The Postgres path is different. It's open source and battle-tested. Nobody from Postgres Inc. is calling you. It's DIY—but in the best sense. As a software engineer working on production AI at scale, "DIY" means you own your destiny. You're already running Postgres on AWS RDS (or equivalent); you already have backups, monitoring, and IAM roles. Adding PostgresSaver for LangGraph checkpointing doesn't introduce a new vendor, a new billing relationship, or a new set of failure modes. That comfort with your own infra isn't nostalgia—it's a superpower in the AI era, when so much else (models, latency, output quality) is already non-deterministic.

The Real Difference: The Hidden Math of Cost

Where the choice gets decisive is cost—especially at production scale.

Checkpointing is write-heavy. In LangGraph, every step in an agent's graph can persist state. For a 10-step run you're not storing one record; you're storing up to 10 snapshots. At high traffic, that makes the database a primary cost driver. Managed MongoDB Atlas tiers are priced for convenience and feature set; you pay for the platform, not just raw compute and I/O. A standard RDS Postgres instance, by contrast, gives you predictable capacity. You're paying for the box. The more you lean on it—checkpointing plus application data, and soon vector search—the better the unit economics.
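The write amplification is easy to put numbers on. Here's a minimal back-of-envelope calculator; every traffic and payload figure is an assumption for illustration, not a benchmark:

```python
# Back-of-envelope checkpoint volume. All inputs are assumptions.

def monthly_checkpoint_load(runs_per_day: int, steps_per_run: int,
                            snapshot_kb: float, days: int = 30):
    """Return (total writes, GB written) for one month of traffic."""
    writes = runs_per_day * steps_per_run * days
    gb = writes * snapshot_kb / (1024 * 1024)
    return writes, gb

# 10k agent runs/day, 10 graph steps each, ~8 KB per state snapshot:
writes, gb = monthly_checkpoint_load(10_000, 10, 8.0)
# -> 3,000,000 writes and roughly 23 GB of checkpoint data per month
```

Three million writes a month is where per-operation platform pricing starts to hurt and a flat-rate instance starts to shine.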

That leads to the strongest argument for Postgres: the vector surcharge. Once you move from "just" checkpointing to RAG, you need vector search. With MongoDB Atlas, Vector Search is often an add-on with query-based pricing and proprietary hooks. With Postgres and pgvector, your vector search lives where your data already lives. If you're already paying for the compute and IOPS of that RDS instance, the incremental cost of adding a vector column and an index is negligible. One stack, one backup, one security model.
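As a sketch of what "colocated" means in practice, assuming psycopg 3 and the pgvector extension are available on the same RDS instance (the table and column names here are hypothetical, and `vector(1536)` assumes an OpenAI-style embedding dimension):

```python
# Hypothetical DDL: add vector search next to the data you already have.
PGVECTOR_SETUP = [
    "CREATE EXTENSION IF NOT EXISTS vector;",
    "CREATE TABLE IF NOT EXISTS documents ("
    "  id bigserial PRIMARY KEY,"
    "  content text NOT NULL,"
    "  embedding vector(1536));",  # dimension depends on your model
    "CREATE INDEX IF NOT EXISTS documents_embedding_idx "
    "ON documents USING hnsw (embedding vector_cosine_ops);",
]

def enable_vector_search(dsn: str) -> None:
    """Run the DDL on the same instance that holds your checkpoints."""
    import psycopg  # lazy import: psycopg 3, assumed installed
    with psycopg.connect(dsn) as conn:
        for stmt in PGVECTOR_SETUP:
            conn.execute(stmt)
```

A nearest-neighbor query is then plain SQL (`ORDER BY embedding <=> %(query)s LIMIT 10` for cosine distance), served by the same connection pool, backups, and security model you already operate.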

RBAC and security: Both Postgres and MongoDB offer encryption at rest; LangGraph can layer its own AES encryption for checkpoint payloads. Choosing Postgres doesn't mean sacrificing security for price—you're choosing a different operational and cost profile, not a weaker one.

Architecting for the Long Game

If you have a large budget and need to ship yesterday, Mongo is fine. But for the systems-minded engineer—and for eng leaders and VPs who care about total cost of ownership and architectural autonomy—Postgres wins. You get a single place for chat history (checkpointing) and knowledge base (vector store): simpler VPCs, fewer IAM roles, one backup and restore story. That's not just cheaper; it's easier to reason about and to evolve.

AI is non-deterministic enough. Your infrastructure shouldn't be. Choose the tool that gives you predictable bills and predictable performance—and the control to scale it the way your product actually needs.