LangGraph Redis Checkpoint 0.1.0: From "Make it work" to "Make it fast"
A performance-driven redesign for production AI agents
When we first announced the LangGraph Redis integration earlier this year, we were focused on bringing Redis' reliability and simplicity to the world of stateful AI agents. The initial releases (pre-0.1.0) were firmly in the "make it work" phase: we adapted the PostgreSQL reference implementation's patterns to Redis, ensuring functional parity across the LangGraph ecosystem.
After months of real-world usage and valuable feedback from early adopters, we recognized an opportunity to leverage Redis' performance story. Version 0.1.0 represents a fundamental redesign—not just optimizations, but a complete rethinking of how to structure checkpoint data for a high-performance in-memory data store.
From SQL patterns to NoSQL performance
The original implementation followed the normalized, relational patterns of the PostgreSQL reference checkpointer. While this ensured compatibility and correctness, it didn't take advantage of Redis' unique strengths. We embarked on a comprehensive refactoring that embraces denormalization and Redis-native data structures.
The key architectural changes
1. Denormalized storage: Inline channel values
The most impactful change was moving from a normalized storage model to an inline, document-oriented approach:
Before (Normalized):
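A simplified sketch of the old shape (key names and fields here are illustrative, not the exact 0.0.x schema):

```python
import redis

r = redis.Redis(decode_responses=True)

# Checkpoint metadata lived in one JSON document...
r.json().set("checkpoint:thread-1:ns:chk-42", "$", {
    "checkpoint_id": "chk-42",
    "channel_versions": {"messages": "3", "summary": "1"},
})
# ...while each channel value was stored as a separate blob document.
r.json().set("checkpoint_blob:thread-1:ns:messages:3", "$", {"type": "msgpack", "blob": "..."})
r.json().set("checkpoint_blob:thread-1:ns:summary:1", "$", {"type": "msgpack", "blob": "..."})

# Reading a checkpoint meant locating every blob first (e.g. one FT.SEARCH per
# channel) before the state could be reassembled.
```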
After (Denormalized):
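The same checkpoint, again with illustrative key names: the channel values now live inside the checkpoint document itself.

```python
# Channel values are stored inline in the checkpoint document
# (reusing the client from the previous snippet).
r.json().set("checkpoint:thread-1:ns:chk-42", "$", {
    "checkpoint_id": "chk-42",
    "channel_versions": {"messages": "3", "summary": "1"},
    "channel_values": {"messages": ["..."], "summary": "..."},
})

# Retrieval is now a single round trip.
doc = r.json().get("checkpoint:thread-1:ns:chk-42")
```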
This single change eliminated O(m) FT.SEARCH queries per checkpoint retrieval (where m = number of channels), replacing them with a single JSON.GET operation.
2. Sorted Sets for write tracking
Instead of relying on FT.SEARCH queries to find pending writes, we introduced a Redis sorted set-based registry system:
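In sketch form (key names are illustrative, not the library's exact schema), each pending write is a small JSON document whose key is registered in a per-checkpoint sorted set:

```python
import redis

r = redis.Redis(decode_responses=True)

# Hypothetical key names for illustration.
write_key = "checkpoint_write:thread-1:ns:chk-42:task-7:0"
registry = "write_keys_zset:thread-1:ns:chk-42"

pipe = r.pipeline(transaction=False)
pipe.json().set(write_key, "$", {"channel": "messages", "type": "msgpack", "blob": "..."})
pipe.zadd(registry, {write_key: 0})  # score = write index, preserves insertion order
pipe.execute()

# Reads no longer need FT.SEARCH to discover pending writes:
if r.zcard(registry):                 # O(1) existence check
    keys = r.zrange(registry, 0, -1)  # O(log(n) + m) key retrieval
    pipe = r.pipeline(transaction=False)
    for key in keys:
        pipe.json().get(key)          # batched fetch, one round trip
    writes = pipe.execute()
```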
This allows us to:
- Check write existence with ZCARD (O(1))
- Retrieve all write keys with ZRANGE (O(log(n) + m))
- Batch fetch writes with pipelined JSON.GET operations
3. Aggressive pipelining
We transformed sequential operations into batched pipeline executions. For example, the list_checkpoints operation now uses a three-phase pipeline approach:
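A minimal sketch of that shape (the key patterns and the key-scan in phase 1 are stand-ins; the real implementation resolves checkpoint keys through its search index):

```python
import redis

def list_checkpoints(r: redis.Redis, thread_id: str, ns: str):
    # Phase 1: one query resolves the matching checkpoint keys.
    prefix = f"checkpoint:{thread_id}:{ns}:"
    checkpoint_keys = sorted(r.scan_iter(match=prefix + "*"))

    # Phase 2: one pipeline fetches every checkpoint document in a single round trip.
    pipe = r.pipeline(transaction=False)
    for key in checkpoint_keys:
        pipe.json().get(key)
    docs = pipe.execute()

    # Phase 3: one pipeline pulls the pending-write keys for all checkpoints at once,
    # using the sorted-set registries described above.
    pipe = r.pipeline(transaction=False)
    for doc in docs:
        pipe.zrange(f"write_keys_zset:{thread_id}:{ns}:{doc['checkpoint_id']}", 0, -1)
    pending_write_keys = pipe.execute()

    return list(zip(docs, pending_write_keys))
```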
This reduces network round trips from O(n) to a constant three, regardless of the number of checkpoints.
The numbers: Benchmarking methodology
To measure the impact of these changes, we developed a comprehensive benchmarking suite that tests all LangGraph checkpointer implementations (that we could get to work) using TestContainers for consistent, reproducible environments. The benchmarks:
- Run 5 iterations of each operation, taking the median
- Use Redis 8.0 with default configuration (no connection pooling optimization, 8 Query Engine worker threads)
- Test realistic agent workflows, including the LangGraph fanout pattern
- Measure operations per second for direct comparisons
All database containers run on the same machine with identical resource constraints, ensuring fair comparisons.
Performance gains by operation
Get Checkpoint
- Before: 238 ops/sec (4.21ms)
- After: 2,950 ops/sec (0.34ms)
- Improvement: 12.4x faster
The inline storage model transforms checkpoint retrieval from multiple search operations to a single JSON.GET.
List Checkpoints
- Before: 22 ops/sec (45.03ms)
- After: 696 ops/sec (1.44ms)
- Improvement: 31.6x faster
Batch loading of pending writes via sorted sets eliminates the O(n) FT.SEARCH queries.
Put Checkpoint
- Before: 2,596 ops/sec (0.39ms)
- After: 1,647 ops/sec (0.61ms)
- Trade-off: 37% slower but includes inline storage setup
The slight regression in put operations is intentional—we're doing more work upfront (storing channel values inline, maintaining the key registry) to dramatically accelerate read operations.
Fanout pattern: The scalability test
The fanout benchmark tests a critical LangGraph pattern where work is distributed to multiple parallel subgraphs—a common pattern for agent swarms and parallel task execution.
Fanout-100 (100 parallel branches):
- Redis: 846ms (ranked 5th overall)
- Faster than PostgreSQL (1,959ms) and MySQL (7,183ms)
Fanout-500 (500 parallel branches):
- Redis: 4,578ms (ranked 5th overall)
- Scales better than PostgreSQL (9,997ms) and MySQL (34,637ms)
The fanout pattern stresses checkpoint creation, write tracking, and state merging—areas where our optimizations shine at scale.
Comparative performance
With these optimizations, Redis checkpoint operations now consistently outperform several alternatives:
Get checkpoint performance (operations/second):
- Memory: 8,392 ops/sec
- SQLite: 7,083 ops/sec
- Redis: 2,950 ops/sec ✨
- MySQL: 1,152 ops/sec
- PostgreSQL: 1,038 ops/sec
- MongoDB: 659 ops/sec
List checkpoints performance (operations/second):
- Memory: 21,642 ops/sec
- SQLite: 5,766 ops/sec
- Redis: 696 ops/sec ✨
- PostgreSQL: 695 ops/sec
- MySQL: 664 ops/sec
- MongoDB: 126 ops/sec
Additional Optimizations
Beyond the Redis-specific changes, we implemented several Python-level optimizations (two of them are sketched after the list):
- orjson: Replaced standard json library with orjson for 2-3x faster serialization
- Cached key generation: Reduced string formatting overhead
- asyncio.gather(): Parallelized independent async operations
- Eliminated redundant fetches: Fixed a bug in shallow async that caused duplicate checkpoint retrievals
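The first and third items are easy to illustrate; the snippet below is a sketch of the idea, not the library's internal code.

```python
import asyncio
import orjson

# orjson is a drop-in replacement for json.dumps/loads that returns bytes and is
# typically 2-3x faster on checkpoint-sized payloads.
payload = orjson.dumps({"channel_values": {"messages": ["hi"]}})
state = orjson.loads(payload)

# Independent async fetches run concurrently instead of back-to-back.
async def load(get_checkpoint, get_pending_writes):
    checkpoint, writes = await asyncio.gather(get_checkpoint(), get_pending_writes())
    return checkpoint, writes
```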
Room for further optimization
These benchmarks use a default Redis configuration, so production deployments can push performance further with tuning. To be fair, we did not fine-tune the other systems benchmarked either; this was never an effort to beat them, only to make our implementation the best possible for our customers. Areas worth tuning include (a small pooling sketch follows the list):
- FT.SEARCH worker threads: Increase from default for parallel search operations
- Connection pooling: Configure pool size based on concurrent load
- Redis cluster mode: Distribute load across multiple nodes
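As one example, sizing a connection pool for concurrent checkpoint traffic is a few lines with redis-py; the numbers below are placeholders, not recommendations.

```python
import redis

# Cap the pool to your expected concurrency. The default pool is effectively
# unbounded and creates connections lazily, so an explicit cap makes behavior
# under load more predictable.
pool = redis.ConnectionPool.from_url("redis://localhost:6379", max_connections=64)
client = redis.Redis(connection_pool=pool)
```

See the project README for how to hand an existing client or connection options to the checkpointer.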
Breaking changes and migration
Version 0.1.0 introduces breaking changes in the storage format: new checkpoints store channel values inline, and the 0.1.0 checkpointer cannot read pre-0.1.0 checkpoints without migration. This was a deliberate decision. At this early stage in the library's life, we prioritized performance for new deployments over a backward-compatible format that would have compromised the optimizations.
Looking forward
The journey from 0.0.x to 0.1.0 demonstrates our commitment to making Redis not just a functional choice for LangGraph checkpointing, but a performance-optimized one. By embracing Redis' strengths (in-memory operations, efficient data structures, and pipelining), we've fundamentally changed the performance profile of the project.
Whether you're building agent swarms that spawn hundreds of parallel tasks or need sub-millisecond checkpoint retrieval for real-time applications, LangGraph Redis 0.1.0 delivers the performance that production AI systems demand.
Getting Started
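Install the package and wire the checkpointer into your graph. The snippet below is a minimal sketch; check the project README for the current, complete API.

```python
# pip install langgraph-checkpoint-redis
from langgraph.checkpoint.redis import RedisSaver

with RedisSaver.from_conn_string("redis://localhost:6379") as checkpointer:
    checkpointer.setup()  # create the Redis indices on first use
    # then pass the checkpointer to your compiled graph, e.g.:
    # graph = builder.compile(checkpointer=checkpointer)
```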
For async applications, use AsyncRedisSaver for even better performance in high-concurrency scenarios.
Acknowledgments
This performance journey wouldn't have been possible without the feedback from early adopters who pushed the library by building some impressive stateful AI agent systems. Your real-world usage patterns and performance requirements shaped every optimization in this release.
Ready to build?
Check out the docs and join our community to share your experiences building high-performance AI agents with LangGraph Redis 0.1.0.