Redis Real Time Context Engine: Build a memory-aware AI agent with Redis Agent Memory and Redis Context Retriever

May 18, 2026
45 minute read
Prasan Rajpurohit
TL;DR: AI agents forget everything when a conversation ends, and can't query structured data without custom code. Redis solves both. Redis Context Retriever turns your entity data into auto-generated MCP tools any agent can discover and call. Redis Agent Memory gives agents persistent session memory and cross-session long-term memory backed by vector search. Together they make Redis a full context engine for AI applications.
Redis Agent Memory Explorer — wealth advisor demo showing session memory, long-term memory, and AI copilot panels
Note: This tutorial uses the code from the following git repository:
https://github.com/redis-developer/redis-agent-memory-explorer
When a wealth advisor meets with a client every month, they're expected to remember what was discussed last quarter, the client's risk tolerance, their family situation, and every commitment made across a dozen previous meetings. Without a memory system, every LLM-powered assistant starts blank. The context window fills up fast, the session ends, and everything is forgotten.
This is the memory problem — and it has two distinct parts:
  • Conversational memory: what was said, what was decided, what the client revealed across many sessions over time
  • Structured knowledge: portfolio holdings, financial goals, pending action items — data that belongs in a queryable store, not a chat log
In this tutorial, you'll build the Wealth Advisor Agent Memory Explorer — a demo that solves both problems using two new Redis Cloud capabilities: Redis Context Retriever for structured data as MCP tools, and Redis Agent Memory for persistent conversational memory. By the end, you'll have a running LangGraph-powered chatbot agent that can answer questions by querying both data layers and showing exactly where each answer came from.

#How the two data layers compare

Before diving in, here's what each layer does and when to reach for it:
| | Redis Context Retriever | Redis Agent Memory |
|---|---|---|
| What it stores | Structured records (entities, facts, records) | Conversational events and extracted memories |
| Data source | Hand-authored JSON, loaded once | Generated from conversation during playback |
| Query style | TAG filters, full-text search, range queries | Semantic vector search, session scoping |
| How agents access it | Auto-generated MCP tools | SDK methods (searchLongTermMemory, buildMemoryPrompt) |
| Grows over time | Static per dataset load | Yes: every session adds more memories |
| Best for | "What are James's holdings?" | "What did James say about bonds last month?" |

#Prerequisites


#Setup

Clone the repo and copy the environment template:
Fill in your .env:
Note: Leave CTX_SURFACE_ID and MCP_AGENT_KEY blank — the app creates and writes them on first run.
Start the app:
Open http://localhost:3001. On first run, the backend creates a Redis Context Retriever, loads entity records from data/wealth-advisor/client-data.json, and writes CTX_SURFACE_ID and MCP_AGENT_KEY back to .env. Subsequent runs skip this step and reuse the existing retriever (surface).

#What you'll build

The Wealth Advisor Agent Memory Explorer is a demo where Sarah Chen, a relationship manager at Acme Bank, plays back recorded meeting transcripts with her client James Morrison. The app stores those conversations in Redis Agent Memory, extracts long-term facts automatically, and exposes structured client data via Redis Context Retriever. A LangGraph ReAct chatbot can then query both layers to answer questions with source attribution.

#Architecture

Note: The app runs as two processes: the API server (port 3001) and the LangGraph server (port 2024). In Docker, these are two separate containers from the same image. Both initialize their own Redis Agent Memory and Redis Context Retriever clients on startup.

#Tech stack

| Layer | Technology |
|---|---|
| Frontend | Next.js 14 (App Router), React, CopilotKit |
| Backend | Node.js, Express |
| Chatbot agent | LangGraph, LangChain, createReactAgent |
| Memory | Redis Agent Memory (Redis Cloud) |
| Structured data | Redis Context Retriever (Redis Cloud) |
| LLM | OpenAI gpt-4o-mini |
| Local auxiliary stores | Redis (JSON store for suggestions, topics, transcript chunks) |

#Redis Context Retriever

#What is Redis Context Retriever?

Redis Context Retriever is a Redis Cloud service that takes your structured entity data and turns it into auto-generated MCP (Model Context Protocol) tools that any agent can discover and call. You define an entity schema, load records, and Redis handles the rest — generating a full set of query tools without any custom API code.
For the wealth advisor demo, the entity schema defines four entities: Client, Holding, FinancialGoal, and ActionItem. Redis Context Retriever generates tools like filter_holding_by_asset_class, search_financialgoal_by_text, and find_holding_by_current_value_range — each self-describing, each callable by the LangGraph agent at runtime.

#Why this matters

| Traditional approach | With Redis Context Retriever |
|---|---|
| Build and maintain a custom API for every data source | Auto-generated MCP tools, no custom API code |
| Hardcode entity knowledge in the agent's system prompt | Tools are self-describing; the agent discovers them at runtime |
| Adding a new queryable field requires code changes | Update the schema and reload records |
| One integration per data source | One retriever (surface) per dataset; the same agent queries them all |

#Key concepts

  • Retriever (Surface): A named collection tied to a data source (Redis) and an entity schema. Think of it as a queryable namespace. You can view and manage all your surfaces in the Redis Cloud Console.
  • Entity schema: Defines your data model — field names, types, and descriptions. This drives tool generation.
  • Admin key: Used to create and manage surfaces and issue agent keys. Generate one from the Context Retriever admin keys page in Redis Cloud and set it as CTX_ADMIN_KEY in your .env.
  • Agent key: A scoped key the agent uses to call MCP tools, separate from the admin key used for retriever (surface) management.
  • MCP tools: Auto-generated and self-describing. Tool names encode the query pattern — filter_<entity>_by_<field>, search_<entity>_by_text, get_<entity>_by_id, find_<entity>_by_<field>_range.
Once the wealth-advisor surface is created, the Redis Cloud Console shows the full data model with all four entities — Client, Holding, FinancialGoal, and ActionItem — and confirms that MCP tools have been generated and are available for AI agents:
Redis Cloud Console showing the wealth-advisor context surface with 4 entities and 25 auto-generated MCP tools

#What the entity schema looks like

Before creating a surface, you define the entity schema in client-data.json. Each entity declares field names, types, descriptions, and the Redis index type for each field — this is what Context Retriever reads to auto-generate the MCP tools.
Here is the Client entity (the primary entity) and the Holding entity (a related entity linked via client_id):
Note: The redisIndices field controls which query tools are generated. A tag field produces a filter_<entity>_by_<field> tool. A numeric field produces a find_<entity>_by_<field>_range tool. A text field produces a search_<entity>_by_text tool. isKeyComponent: true produces a get_<entity>_by_id tool.
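The mapping in the note above can be sketched in a few lines. This is a minimal illustration, not Context Retriever's actual generator: only redisIndices and isKeyComponent are named in the tutorial, so the FieldDef and EntityDef shapes, the assumption that redisIndices is an array, and the holding_id field are all hypothetical. It also predicts only the four naming patterns listed above; the real service may generate more.

```typescript
// Hypothetical schema shapes. Only `redisIndices` and `isKeyComponent`
// come from the tutorial; the rest is assumed for illustration.
type SchemaIndexType = "tag" | "numeric" | "text";

interface FieldDef {
  fieldName: string;
  redisIndices?: SchemaIndexType[]; // assumed to be an array
  isKeyComponent?: boolean;
}

interface EntityDef {
  entityName: string;
  fields: FieldDef[];
}

// Predict tool names for one entity using the four patterns in the note.
function expectedToolNames(entity: EntityDef): string[] {
  const e = entity.entityName.toLowerCase();
  const tools = new Set<string>(); // a Set, because text search is one tool per entity
  for (const f of entity.fields) {
    if (f.isKeyComponent) tools.add(`get_${e}_by_id`);
    for (const idx of f.redisIndices ?? []) {
      if (idx === "tag") tools.add(`filter_${e}_by_${f.fieldName}`);
      if (idx === "numeric") tools.add(`find_${e}_by_${f.fieldName}_range`);
      if (idx === "text") tools.add(`search_${e}_by_text`);
    }
  }
  return Array.from(tools);
}

// Usage: a cut-down Holding entity (holding_id is a hypothetical key field).
const holdingEntity: EntityDef = {
  entityName: "Holding",
  fields: [
    { fieldName: "holding_id", isKeyComponent: true },
    { fieldName: "client_id", redisIndices: ["tag"] },
    { fieldName: "asset_class", redisIndices: ["tag"] },
    { fieldName: "current_value", redisIndices: ["numeric"] },
  ],
};
const holdingTools = expectedToolNames(holdingEntity);
// holdingTools: get_holding_by_id, filter_holding_by_client_id,
// filter_holding_by_asset_class, find_holding_by_current_value_range
```

A helper like this can be handy as a sanity check: if a tool you expect is missing from the console, the index type on the corresponding field is the first thing to inspect.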

#What the actual records look like

The records section of client-data.json holds the data that will be loaded into the surface. For the demo there is one client, James Morrison, with six holdings:
Each Holding record carries a client_id that matches the Client record, which is how filter_holding_by_client_id knows which holdings to return when the agent queries for a specific client.

#Creating a context retriever (context surface) and loading records

With the schema and records defined, the backend creates the surface on first run. It reads CTX_SURFACE_ID and MCP_AGENT_KEY from .env — if they're set, it skips creation and reuses the existing retriever. If either is missing, it creates everything from scratch:
After the first run, the surface ID and agent key are also printed to the console. If they were not written to your .env automatically, copy the values in manually:
On subsequent runs the backend reads these values from .env and skips surface creation entirely, reusing the existing retriever.
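The reuse-or-create decision described above can be sketched as follows. This is a simplified illustration under stated assumptions: resolveSurface and the createSurface callback are hypothetical stand-ins for the repository's startup code, and the real creation call would be asynchronous and talk to Redis Cloud.

```typescript
// The two variables the tutorial says are checked on startup.
interface SurfaceEnv {
  CTX_SURFACE_ID?: string;
  MCP_AGENT_KEY?: string;
}

interface SurfaceCredentials {
  surfaceId: string;
  agentKey: string;
}

// Hypothetical stand-in for "create surface, load records, issue agent key".
type CreateSurface = () => SurfaceCredentials;

function resolveSurface(
  env: SurfaceEnv,
  createSurface: CreateSurface
): { credentials: SurfaceCredentials; created: boolean } {
  const surfaceId = env.CTX_SURFACE_ID?.trim();
  const agentKey = env.MCP_AGENT_KEY?.trim();
  // Reuse only when BOTH values are present and non-empty.
  if (surfaceId && agentKey) {
    return { credentials: { surfaceId, agentKey }, created: false };
  }
  // If either one is missing, create everything from scratch.
  return { credentials: createSurface(), created: true };
}
```

Treating a blank value the same as a missing one matters here, because the setup step asks you to leave both variables blank in .env before the first run.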

#Discovering and calling MCP tools

At agent startup, the LangGraph server fetches all available tools for the surface and wraps each one as a LangGraph DynamicStructuredTool:
Here's what's happening step by step:
  1. listTools() fetches the current tool list from the MCP server. The tool count and names depend entirely on the entity schema — adding an entity to the schema automatically adds new tools.
  2. buildJsonSchemaToZod() converts each tool's JSON Schema parameters into a Zod schema so LangGraph can validate inputs and generate the tool signature for the LLM.
  3. cs.callTool() sends a JSON-RPC request to the MCP server and returns the structured result, which extractMcpText() unwraps to a plain string for the agent.
The agent now has a set of tools it discovered at runtime — no hardcoded queries, no custom API routes.
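To make step 3 concrete, here is a minimal sketch of the wire format involved. The tools/call method name and the content array of typed blocks follow the Model Context Protocol specification; buildToolCall and extractTextBlocks are hypothetical helpers written for this illustration (the repository's extractMcpText is not shown here and may behave differently), and the client_id argument name and value are assumptions.

```typescript
// JSON-RPC 2.0 request shape for an MCP tool invocation.
interface McpToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// An MCP tool result carries a list of typed content blocks.
interface McpContentBlock {
  type: string;
  text?: string;
}

interface McpToolResult {
  content: McpContentBlock[];
  isError?: boolean;
}

function buildToolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown>
): McpToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Keep only the text blocks and join them into one string for the agent.
function extractTextBlocks(result: McpToolResult): string {
  return result.content
    .filter((b) => b.type === "text" && typeof b.text === "string")
    .map((b) => b.text as string)
    .join("\n");
}

// Usage: the argument name and client ID below are illustrative only.
const holdingsRequest = buildToolCall(1, "filter_holding_by_client_id", {
  client_id: "client-001",
});
```

Flattening the result to a plain string keeps the tool wrapper simple, at the cost of discarding any non-text content blocks the server might return.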

#Context Retriever tools in the demo

For the wealth advisor entity schema, Redis Context Retriever generates multiple tools. Here are some examples:
| Tool | What it queries |
|---|---|
| filter_holding_by_client_id | All portfolio holdings for a client |
| filter_holding_by_asset_class | Holdings by equity, bond, or real estate class |
| find_holding_by_current_value_range | Holdings above or below a value threshold |
| filter_financialgoal_by_client_id | All financial goals for a client |
| filter_financialgoal_by_type | Goals by type (retirement, education, etc.) |
| search_financialgoal_by_text | Full-text search across goals |
| filter_actionitem_by_status | Pending or completed action items |
| get_client_by_id | Client profile by primary key |
Redis Cloud Console showing the full list of 25 auto-generated MCP tools for the wealth-advisor-demo surface, with tool name, operation type, and entity columns
These tools are available to the agent at runtime — no hardcoded queries, no custom API routes. When a user asks a question that requires structured client data, the agent picks the right tool, calls it, and cites its source in the response.
For example, asking "What is James Morrison's portfolio allocation?" causes the agent to invoke filter_holding_by_client_id, retrieve the full holdings breakdown, and return the answer with a SOURCE: CONTEXT RETRIEVER label so the user can always see where the data came from:
Chatbot answering "What is James Morrison's portfolio allocation?" using the filter_holding_by_client_id MCP tool with source attribution to Context Retriever
Note: This is where Context Retriever has a meaningful accuracy advantage over a standard RAG approach. If the portfolio data were stored purely as embeddings and retrieved with a single vector search, the agent would get back a semantically close chunk of text — but it would have no guarantee of completeness or precision. It might miss holdings, return stale text, or conflate records from different clients.
With Context Retriever, the agent operates against structured records through typed MCP tools. It can call filter_holding_by_client_id to get every holding for a client, follow up with find_holding_by_current_value_range to narrow by value, and chain further tool calls as the question demands — each returning exact, queryable data rather than an approximation. The agent isn't doing a one-shot retrieval; it's navigating the data the same way a developer would query a database, just driven by the LLM's reasoning at runtime.

#Redis Agent Memory

#What is Redis Agent Memory?

Redis Agent Memory is a Redis Cloud service that gives AI agents two tiers of memory:
  • Session memory: The live conversation — an ordered log of events for the current session, scoped by sessionId. Each event has a role (user, assistant, system), content, and optional metadata. Session memory is ephemeral by design; it exists as long as the session is active.
  • Long-term memory (LTM): Cross-session, persistent facts and events extracted from conversations. Backed by vector search so agents can retrieve semantically relevant memories regardless of which session produced them.

#Memory types

| Type | What it stores | Example from the demo |
|---|---|---|
| episodic | Events with context and time | "Client expressed concerns about REIT exposure in the Feb meeting" |
| semantic | Facts, preferences, profile data | "Client has a moderate risk tolerance" |
| message | Stored conversation records | Raw dialogue segments |
Note: In this demo, all auto-extracted long-term memories are of type episodic because they come from meeting conversations. Redis Agent Memory extracts them automatically in the background; no code is required beyond calling addSessionEvent. You can also create LTMs manually via createLongTermMemories() at any time. This is the right approach when you run your own extraction pipeline (processing documents, emails, call transcripts, or any content outside of a live session) and want to persist the resulting memories directly:
Automatic extraction and manual creation coexist — both end up in the same searchable LTM store.

#Session memory — storing a live conversation

When the user presses Play on a transcript, the frontend calls the backend once per chunk. Each chunk is a timestamped dialogue turn from the meeting. The backend formats it and stores it as a session event in Redis Agent Memory:
The session ID is generated when playback starts (playback-<transcriptId>-<timestamp>) and is unique per playback run. Redis Agent Memory creates the session implicitly on the first addSessionEvent call — there's no separate "create session" API call required.
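As a rough sketch of the two formatting steps just described: the session ID pattern and the role, content, and metadata fields come from the text above, while the TranscriptChunk shape, the epoch-millisecond timestamp, and the rule for mapping speakers to roles are assumptions made for illustration. The call that actually stores the event (addSessionEvent) is left out because its exact signature is not shown here.

```typescript
type EventRole = "user" | "assistant" | "system";

// Fields named in the tutorial: role, content, optional metadata.
interface MemorySessionEvent {
  role: EventRole;
  content: string;
  metadata?: Record<string, unknown>;
}

// Hypothetical shape of one timestamped dialogue turn.
interface TranscriptChunk {
  speaker: string;
  text: string;
  offsetSeconds: number;
}

// playback-<transcriptId>-<timestamp>; timestamp assumed to be epoch millis.
function buildSessionId(transcriptId: string, now: number = Date.now()): string {
  return `playback-${transcriptId}-${now}`;
}

// Assumed mapping: the client speaks as "user", everyone else as "assistant".
function chunkToEvent(chunk: TranscriptChunk, clientName: string): MemorySessionEvent {
  return {
    role: chunk.speaker === clientName ? "user" : "assistant",
    content: `${chunk.speaker}: ${chunk.text}`,
    metadata: { offsetSeconds: chunk.offsetSeconds },
  };
}
```

Because the timestamp is part of the ID, replaying the same transcript twice yields two separate sessions rather than appending to the first one.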
Reading the live session back is equally simple:
The Session memory tab polls this every three seconds during playback, displaying events as they arrive.
Session memory tab showing live events during transcript playback

#Long-term memory — what Redis remembers across sessions

After a transcript plays, Redis Agent Memory analyzes session events in the background and extracts durable facts as long-term memories. These are available for semantic search across all future sessions.
The Long-term memory tab searches LTMs by user, with optional filters for memory type and topics:
To see only the memories extracted from a specific meeting, filter by sessionId instead:
searchAllLongTermMemory automatically paginates through all results, so you always get the full set regardless of volume.
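The general pattern behind an auto-paginating search looks roughly like this. It is a generic sketch, not the SDK's implementation: ResultPage, FetchPage, fetchAll, and the offset-based cursor are all hypothetical, and the real service may page by token or cursor instead.

```typescript
// Hypothetical page shape: a batch of items plus where to resume, if anywhere.
interface ResultPage<T> {
  items: T[];
  nextOffset: number | null;
}

type FetchPage<T> = (offset: number, limit: number) => Promise<ResultPage<T>>;

// Keep requesting pages until the server reports there is nothing left.
async function fetchAll<T>(fetchPage: FetchPage<T>, pageSize: number = 50): Promise<T[]> {
  const all: T[] = [];
  let offset: number | null = 0;
  while (offset !== null) {
    const page: ResultPage<T> = await fetchPage(offset, pageSize);
    all.push(...page.items);
    offset = page.nextOffset;
  }
  return all;
}
```

Collecting everything into one array is convenient for a demo UI; for very large memory stores, streaming pages to the caller would keep memory use bounded.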
Long-term memory tab with episodic memory cards grouped by session

#The memory prompt — injecting context into any LLM call

The most important Redis Agent Memory method is buildMemoryPrompt. It assembles a token-budgeted context string from session events plus relevant long-term memories — ready to inject directly into any LLM system prompt.
The output format is structured markdown the LLM can immediately use:
How token budgeting works:
  1. Determine total token budget — from contextWindowMax if provided, otherwise a lookup table by model name, otherwise a 128k default
  2. Reserve tokens for the LTM results and formatting overhead
  3. Spend the remaining budget on session events:
    • If all events fit → include verbatim as "Recent Conversation"
    • If events exceed the budget and an LLM is configured → summarize via LLM and include as "Session Summary"
    • If no LLM is configured → trim oldest events and keep the most recent ones
Because the session portion is always fitted to whatever budget remains, the context string stays within the token limit regardless of session length.
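The three budgeting steps can be sketched as a small planning function. This is an illustration of the decision logic only, not the library's code: planSessionBudget, the BudgetInput shape, the single-entry lookup table, and the way token counts are supplied are all assumptions. Real token counts would come from a tokenizer.

```typescript
interface BudgetInput {
  contextWindowMax?: number; // explicit budget, if provided
  model?: string;            // otherwise looked up by model name
  ltmTokens: number;         // tokens reserved for LTM results and formatting
  eventTokens: number[];     // token count per session event, oldest first
  hasSummarizer: boolean;    // whether an LLM is configured for summaries
}

type SessionStrategy =
  | { kind: "verbatim"; keptEvents: number }
  | { kind: "summarize" }
  | { kind: "trim"; keptEvents: number };

// Illustrative lookup table; the real table is not shown in the tutorial.
const MODEL_WINDOWS: Record<string, number | undefined> = { "gpt-4o-mini": 128000 };
const DEFAULT_WINDOW = 128000;

function planSessionBudget(input: BudgetInput): SessionStrategy {
  // Step 1: total budget from explicit value, then model lookup, then default.
  const fromModel = input.model !== undefined ? MODEL_WINDOWS[input.model] : undefined;
  const total = input.contextWindowMax ?? fromModel ?? DEFAULT_WINDOW;
  // Step 2: reserve room for long-term memories and formatting.
  const sessionBudget = Math.max(0, total - input.ltmTokens);
  // Step 3: decide how to fit the session events into what is left.
  const needed = input.eventTokens.reduce((sum, t) => sum + t, 0);
  if (needed <= sessionBudget) {
    return { kind: "verbatim", keptEvents: input.eventTokens.length };
  }
  if (input.hasSummarizer) return { kind: "summarize" };
  // No summarizer: walk back from the newest event and keep what fits.
  let used = 0;
  let kept = 0;
  for (let i = input.eventTokens.length - 1; i >= 0; i--) {
    if (used + input.eventTokens[i] > sessionBudget) break;
    used += input.eventTokens[i];
    kept++;
  }
  return { kind: "trim", keptEvents: kept };
}
```

For example, with a 1,000-token budget, 200 tokens reserved for long-term memory, and three 300-token events, the session needs 900 tokens against 800 available, so the plan is either a summary or the two most recent events.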

#The AI Copilot — real-time suggestions during playback

The AI Copilot tab generates context-aware suggestions as the transcript plays. Every five chunks, the suggestion pipeline fires automatically:
Here's the core of the pipeline — how it hydrates memory context before invoking the suggestion LLM:
The LLM returns a structured JSON response with a suggestion and topic updates:
Topics have a lifecycle: pre-seeded topics from the transcript metadata start as pending, move to discussed as the conversation covers them, and become question when the client asks about them directly.
The suggestion LLM is also passed all previous suggestions to prevent duplicates — if a theme was already used, it returns null for the suggestion field.
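The trigger cadence and the topic lifecycle can be expressed as two small pure functions. This is a sketch under assumptions: the status names come from the text above, but the TopicSignal values, the function names, and the rule that a topic never moves backward are illustrative choices, not the repository's logic.

```typescript
type TopicStatus = "pending" | "discussed" | "question";

// Hypothetical signals derived from the LLM's topic updates.
type TopicSignal = "covered" | "asked";

// Ordering used to prevent a topic from moving backward (an assumption).
const STATUS_RANK: Record<TopicStatus, number> = {
  pending: 0,
  discussed: 1,
  question: 2,
};

function nextTopicStatus(current: TopicStatus, signal: TopicSignal): TopicStatus {
  const target: TopicStatus = signal === "asked" ? "question" : "discussed";
  return STATUS_RANK[target] > STATUS_RANK[current] ? target : current;
}

// Fire the suggestion pipeline after every `interval` stored chunks.
function shouldGenerateSuggestion(chunksStored: number, interval: number = 5): boolean {
  return chunksStored > 0 && chunksStored % interval === 0;
}
```

Keeping both rules free of side effects makes them easy to unit test separately from the LLM call and the Redis stores.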
AI Copilot tab showing a suggestion banner and detected topics panel with status badges

#The chatbot — querying across both data layers

The CopilotKit chatbot sidebar connects to a LangGraph ReAct agent that has tools from both data layers. This is the centerpiece of the demo: the agent receives a natural language question, reasons about which tools to call, calls them, and synthesizes an answer, citing which data layer it used.
Chatbot sidebar showing source badge "RAM + Context Retriever" on a combined answer

#How the agent is wired

The LangGraph graph is a single-node ReAct agent. At startup it initializes both data clients and fetches all available tools:
The tools array contains five hand-written Redis Agent Memory (RAM) tools plus however many MCP tools Redis Context Retriever generated from the entity schema. The agent treats them all equally; it doesn't know or care which layer a tool queries.

#RAM tools

| Tool | When the agent uses it |
|---|---|
| getMemoryContext | Primary tool: returns a full buildMemoryPrompt result (session events + LTM combined) for any question about an active session |
| searchMemories | Semantic search across all long-term memories, cross-session |
| searchMemoriesBySession | Search long-term memories scoped to a specific meeting |
| listSessions | When the user references a meeting by date or name |
| getSessionState | Session metadata: event count, owner ID |

#The dynamic system prompt

The agent's system prompt is built at startup from the MCP tool definitions, not hardcoded. It parses entity names from tool names (e.g., filter_holding_by_* → "Holding") and generates routing guidance dynamically:
This means adding a new entity to the schema automatically updates the system prompt — no code changes needed.
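One way to derive entity names and routing hints from tool names is sketched below. The four name prefixes come from the naming patterns described earlier; entityLabel, buildRoutingGuidance, and the output format are hypothetical, and the repository's prompt builder may differ. Note that casing cannot be recovered from a lowercased tool name, so financialgoal becomes "Financialgoal" here rather than "FinancialGoal".

```typescript
// Matches filter_<entity>_by_..., search_<entity>_by_..., get_..., find_...
const TOOL_NAME_PATTERN = /^(?:filter|search|get|find)_(.+?)_by_/;

// Returns a display label for the entity, or null for non-MCP tool names.
function entityLabel(toolName: string): string | null {
  const match = TOOL_NAME_PATTERN.exec(toolName);
  if (!match) return null;
  const raw = match[1];
  return raw.charAt(0).toUpperCase() + raw.slice(1);
}

// Group tools by entity and emit one routing line per entity.
function buildRoutingGuidance(toolNames: string[]): string {
  const byEntity: Record<string, string[]> = {};
  for (const tool of toolNames) {
    const label = entityLabel(tool);
    if (label === null) continue; // e.g. hand-written memory tools
    if (!byEntity[label]) byEntity[label] = [];
    byEntity[label].push(tool);
  }
  return Object.keys(byEntity)
    .sort()
    .map((entity) => `- ${entity}: ${byEntity[entity].join(", ")}`)
    .join("\n");
}
```

Since the guidance is computed from whatever listTools() returns, hand-written tools such as getMemoryContext are simply skipped and need their own fixed section in the prompt.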

#Source attribution

The LLM outputs a structured header the frontend parses into a styled badge and collapsible tools disclosure:
The frontend's custom AssistantMessage component splits this into a rendered source badge, a collapsible <details> block listing the tools called, and the answer body rendered as markdown.
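The parsing step might look something like the sketch below. The exact header format is not reproduced in this text, so everything about it here is an assumption: a leading SOURCE: line (matching the SOURCE: label mentioned earlier), an optional TOOLS: line with comma-separated names, and the answer body after that. parseAssistantMessage and ParsedMessage are hypothetical names.

```typescript
interface ParsedMessage {
  source: string | null; // text for the source badge
  tools: string[];       // tool names for the collapsible disclosure
  body: string;          // remaining markdown answer
}

// Assumed header format (not confirmed by the tutorial):
//   SOURCE: <label>
//   TOOLS: <tool>, <tool>
//   <blank line>
//   <answer body>
function parseAssistantMessage(raw: string): ParsedMessage {
  const lines = raw.split("\n");
  let source: string | null = null;
  let tools: string[] = [];
  let i = 0;
  for (; i < lines.length; i++) {
    const line = lines[i].trim();
    const sourceMatch = /^SOURCE:\s*(.+)$/i.exec(line);
    const toolsMatch = /^TOOLS:\s*(.*)$/i.exec(line);
    if (sourceMatch) {
      source = sourceMatch[1].trim();
    } else if (toolsMatch) {
      tools = toolsMatch[1]
        .split(",")
        .map((t) => t.trim())
        .filter((t) => t.length > 0);
    } else {
      break; // first non-header line ends the header
    }
  }
  return { source, tools, body: lines.slice(i).join("\n").trim() };
}
```

Falling back to a null source and an empty tools list means a message without a header still renders as plain markdown instead of failing.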

#Sample questions and routing

| Question | Expected tools |
|---|---|
| "What happened in this meeting?" | getMemoryContext |
| "What did James say about bonds last month?" | searchMemories |
| "Summarize the February call" | listSessions, then getMemoryContext |
| "What is James's portfolio allocation?" | filter_holding_by_client_id |
| "Are there any pending action items?" | filter_actionitem_by_status |
| "List all holdings worth more than $500K" | find_holding_by_current_value_range |
| "What are his retirement goals and what was discussed about them?" | filter_financialgoal_by_type + searchMemories |
| "Tell me everything you know about James" | Multiple RAM + Context Retriever tools |
Chatbot answering a combined question — source badge shows "RAM + Context Retriever", tools disclosure lists both tool types

#How it all works — the Redis data patterns

All the capabilities in this tutorial are powered by a small set of Redis primitives:
| Feature | Redis mechanism | Used for |
|---|---|---|
| Session memory events | Redis JSON (append-only event log) | Storing transcript chunks per playback session |
| Long-term memory | Redis vector index (RediSearch) | Semantic search across extracted facts and events |
| Redis Context Retriever | Redis + RediSearch + MCP server | Structured entity queries via auto-generated tools |
| Suggestion store | Redis JSON | Persisting generated AI Copilot suggestions |
| Topic store | Redis JSON | Tracking topic lifecycle (pending → discussed → question) |
| Transcript chunk store | Redis JSON | Recent chunk buffer for the suggestion pipeline |
The key insight: Redis is doing a different job at each layer. Session memory uses Redis as an ordered event log. Long-term memory uses Redis's vector search to find semantically relevant facts. Context Retriever uses Redis as the backing store for a structured, MCP-queryable entity index. All three coexist in the same Redis Cloud instance.

#Running the demo

Open http://localhost:3001. The app loads the wealth advisor configuration — branding, participant roles, suggestion types — from the backend.
  1. Select a transcript from the dropdown in the left panel and click Play
  2. Watch session memory fill in the Session memory tab (right panel) as each chunk is stored
  3. Watch long-term memories appear in the Long-term memory tab after extraction runs in the background
  4. Check the AI Copilot tab for context-aware suggestions generated every five chunks
  5. Open the chatbot sidebar (top-right button) and ask questions across both data layers
Try a combined question like: "What is James's current equity allocation and what did he say about rebalancing in the last meeting?" — the agent should call filter_holding_by_asset_class for the portfolio data and getMemoryContext or searchMemories for the conversation context, then synthesize the answer.
Click Reset to clear all sessions, long-term memories, and copilot stores for a clean run.

#Next steps

Now that you have a running memory-aware AI agent, here are ways to extend it:
  • Add a new entity type. Edit data/wealth-advisor/dataset.config.json to add an entity to contextSurfaces.entities, add matching records to client-data.json, and restart. Redis Context Retriever generates new tools automatically — the agent picks them up with no code changes.
  • Seed known facts as long-term memories. Use ram.createLongTermMemories() at setup time to pre-load facts about clients. Combine MemoryType.SEMANTIC facts with MemoryType.EPISODIC events for a richer memory base.
  • Adjust the memory prompt budget. Change MEETING_MEMORY_CONTEXT_WINDOW_MAX in .env to control how many tokens are reserved for memory context in LLM calls.
  • Explore with Redis Insight. Connect Redis Insight to your local Redis instance to browse session events, topic stores, and suggestion stores as JSON documents in real time.
  • Build your own persona. The app is dataset-driven — all labels, roles, branding, suggestion types, and entity schemas come from dataset.config.json. Clone the wealth-advisor folder, edit the config, and swap in your own transcript data and entity records.
