Build a smarter real-time AI agent with Redis Iris

May 18, 2026 · 44 minute read
Prasan Rajpurohit
William Johnston
TL;DR: AI agents forget everything when a conversation ends, and can't query structured data without custom code. Redis Iris solves both. Redis Context Retriever turns your entity data into auto-generated MCP tools any agent can discover and call. Redis Agent Memory gives agents persistent session memory and cross-session long-term memory backed by vector search. Together they make Redis a full context engine for AI applications.
Redis Agent Memory Explorer wealth advisor demo showing session memory, long-term memory, and Suggestions panels
Note: This tutorial uses the code from the following git repository:
https://github.com/redis-developer/redis-agent-memory-explorer
When a wealth advisor meets with a client every month, they're expected to remember what was discussed last quarter, the client's risk tolerance, their family situation, and every commitment made across a dozen previous meetings. Without a memory system, every LLM-powered assistant starts blank. The context window fills up fast, the session ends, and everything is forgotten.
This is the memory problem and it has three distinct parts:
  • Conversational memory: what was said, what was decided, what the client revealed across many sessions over time
  • Structured knowledge: portfolio holdings, financial goals, pending action items, and other data that belongs in a queryable store, not a chat log
  • Live meeting assistance: during an active conversation, the advisor needs real-time nudges (which agenda topics haven't been covered yet, what the client said about this same issue last quarter, what action item just emerged) without manually tracking anything
In this tutorial, you'll build the Wealth Advisor Agent Memory Explorer that solves all three using two Redis Cloud capabilities: Redis Context Retriever for structured data as auto-generated MCP tools, and Redis Agent Memory for persistent conversational memory. A LangGraph ReAct agent queries both layers to answer questions with source attribution, and a real-time suggestions pipeline surfaces live insights during meeting playback that are grounded in stored memory.

#How the two data layers compare

Before diving in, here's what each layer does and when to reach for it:
| | Redis Context Retriever | Redis Agent Memory |
|---|---|---|
| What it stores | Structured business records: entities, fields, relationships (e.g. clients, holdings, goals) | Conversational history: session events, extracted facts, decisions, and sentiments |
| When to use | Questions about current facts and business records: "What are James's holdings?" | Questions about what was said, decided, or felt: "What did James say about bonds last month?" |
| How agents access it | Auto-generated MCP tools discovered at runtime: no hardcoded queries | SDK methods (buildMemoryPrompt, searchLongTermMemory) called from agent tools |
| Unique strength | Exact, complete answers: every matching record returned, no approximation | Cross-session context: connects what was said months ago to what is happening now |
| Real-time assistance | Provides ground-truth structured business data to back agent answers | Supplies conversation history and long-term memories so suggestions are grounded in what the client actually said |
| Combined | "What is James's current allocation?" → exact records | "…and what did he say about rebalancing?" → memory search: agent synthesises both in one reply |

#Prerequisites


#Setup

Clone the repo and copy the environment template:
Fill in your .env:
Start the app:
Open http://localhost:3001. On first run, the backend creates a Redis Context Retriever surface, loads entity records from data/wealth-advisor/client-data.json, and prints CTX_SURFACE_ID and MCP_AGENT_KEY to the console. Copy those values into your .env. Subsequent runs find them set, skip surface creation, and connect directly to the existing surface.

#What you'll build

The Wealth Advisor Agent Memory Explorer is a demo where Sarah Chen, a relationship manager at Acme Bank, conducts a live meeting with her client James Morrison. Rather than integrating a real meeting API, the app simulates the live call by streaming pre-recorded transcripts chunk by chunk. The focus is entirely on the Redis side: what happens to memory and context as the conversation unfolds in real time. As the meeting plays, session events are stored in Redis Agent Memory, long-term facts are extracted automatically in the background, and structured client data is accessible via Redis Context Retriever. A LangGraph ReAct chatbot can then query both layers to answer questions with source attribution.

#Architecture

Note: The app runs as two separate processes: the API server (port 3001) handles REST routes and the suggestions pipeline; the LangGraph server (port 2024) hosts the ReAct chatbot agent. In Docker these are two containers from the same image (demo-app and demo-langgraph). Both initialize their own Redis Agent Memory and Context Retriever clients independently on startup.

#Tech stack

| Layer | Technology |
|---|---|
| Frontend | Next.js 14 (App Router), React, CopilotKit |
| Backend | Node.js, Express |
| Chatbot agent | LangGraph, LangChain, createReactAgent |
| Memory | Redis Agent Memory (Redis Cloud) |
| Structured data | Redis Context Retriever (Redis Cloud) |
| LLM | OpenAI |

#Redis Context Retriever

#What is Redis Context Retriever?

Redis Context Retriever is a Redis Cloud service that takes your structured entity data and turns it into auto-generated MCP (Model Context Protocol) tools that any agent can discover and call. You define an entity schema, load records, and Redis handles the rest, generating a full set of query tools without any custom API code.
For the wealth advisor demo, the entity schema defines four entities: Client, Holding, FinancialGoal, and ActionItem. Redis Context Retriever generates tools like filter_holding_by_asset_class, search_financialgoal_by_text, and find_holding_by_current_value_range. Each tool is self-describing and callable by the LangGraph agent at runtime.

#Why this matters

| Aspect | Traditional approach | With Redis Context Retriever |
|---|---|---|
| Accuracy | Vector search returns a semantically close chunk | Typed MCP tools query exact structured records |
| Agentic reasoning | One-shot retrieval; agent gets one answer and stops | Agent chains multiple tool calls agentic-style, narrowing and enriching the answer at each step |
| Runtime discovery | Hardcode entity knowledge in the agent's system prompt | Tools are self-describing; the agent discovers them at runtime |
| No custom API code | Build and maintain a custom API for every data source | Auto-generated MCP tools, no custom API code |
| Schema flexibility | Adding a new queryable field requires code changes | Update the schema and reload records |
| Multi-surface scale | One integration per data source | One retriever (surface) per dataset; the same agent queries them all |

#Key concepts

  • Retriever (Surface): A named collection tied to a data source (Redis) and an entity schema. Think of it as a queryable namespace. You can view and manage all your surfaces in the Redis Cloud Console.
  • Entity schema: Defines field names, types, and descriptions. This drives tool generation.
  • Admin key: Used to create and manage surfaces and issue agent keys. Generate one from the Context Retriever admin keys page in Redis Cloud and set it as CTX_ADMIN_KEY in your .env.
  • Agent key: A scoped key the agent uses to call MCP tools, separate from the admin key used for retriever (surface) management.
  • MCP tools: Auto-generated and self-describing. Tool names encode the query pattern: filter_<entity>_by_<field>, search_<entity>_by_text, get_<entity>_by_id, find_<entity>_by_<field>_range.
Once the wealth-advisor surface is created, the Redis Cloud Console shows the full data model with all four entities: Client, Holding, FinancialGoal, and ActionItem. This confirms that MCP tools have been generated and are available for AI agents:
Redis Cloud Console showing the wealth-advisor context retriever surface with 4 entities and 25 auto-generated MCP tools

#What the entity schema looks like

Before creating a surface, you define an entity schema and pass it to the SDK. In this demo the schema lives in client-data.json for convenience, but it can come from any source (e.g. a separate config file, a database, or inline code). Each entity declares field names, types, descriptions, and the Redis index type for each field. This is what Redis Context Retriever reads to auto-generate the MCP tools.
Here is the Client entity (the primary entity) and the Holding entity (a related entity linked via client_id):
Note: The redisIndices field controls which query tools are generated. A tag field produces a filter_<entity>_by_<field> tool. A numeric field produces a find_<entity>_by_<field>_range tool. A text field produces a search_<entity>_by_text tool. isKeyComponent: true produces a get_<entity>_by_id tool.
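To make the naming rules in the note concrete, here is a minimal TypeScript sketch that derives the tool names you would expect from a schema. The `EntitySchema` and `FieldSchema` shapes and the sample fields are assumptions made for illustration; the actual format in client-data.json and the service's own generation logic may differ.

```typescript
// Hypothetical schema shapes; the real client-data.json format may differ.
type IndexType = "tag" | "numeric" | "text";

interface FieldSchema {
  name: string;
  redisIndices?: IndexType[];
  isKeyComponent?: boolean;
}

interface EntitySchema {
  name: string;
  fields: FieldSchema[];
}

// Applies the naming rules from the note: tag -> filter, numeric -> range,
// text -> full-text search, key component -> lookup by id.
function expectedToolNames(entity: EntitySchema): string[] {
  const e = entity.name.toLowerCase();
  const names = new Set<string>();
  for (const field of entity.fields) {
    if (field.isKeyComponent) names.add(`get_${e}_by_id`);
    for (const index of field.redisIndices ?? []) {
      if (index === "tag") names.add(`filter_${e}_by_${field.name}`);
      if (index === "numeric") names.add(`find_${e}_by_${field.name}_range`);
      // Assumption: one search tool per entity, however many text fields it has.
      if (index === "text") names.add(`search_${e}_by_text`);
    }
  }
  return Array.from(names);
}

// Sample entity with made-up field names.
const holdingSchema: EntitySchema = {
  name: "Holding",
  fields: [
    { name: "holding_id", isKeyComponent: true },
    { name: "client_id", redisIndices: ["tag"] },
    { name: "asset_class", redisIndices: ["tag"] },
    { name: "current_value", redisIndices: ["numeric"] },
    { name: "notes", redisIndices: ["text"] },
  ],
};

console.log(expectedToolNames(holdingSchema));
```

A helper like this can be handy as a sanity check: compare its output with what the surface actually exposes to spot a field that was indexed differently than intended.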

#What the actual records look like

The records section of client-data.json holds the sample data that will be loaded into the surface. For the demo there is one client, James Morrison, with a few holdings:
Each Holding record carries a client_id that matches the Client record, which is how filter_holding_by_client_id knows which holdings to return when the agent queries for a specific client.

#Creating a context retriever tool and loading records

Creating a surface and loading records is a one-time activity. On the first run, the backend creates the surface, generates the agent key, and prints both to the console. You copy these values into your .env as CTX_SURFACE_ID and MCP_AGENT_KEY. Every subsequent run finds them already set, skips creation entirely, and connects directly to the existing surface:
After the first run, the surface ID and agent key are printed to the console. Copy these values and set them manually in your .env:
On subsequent runs the backend reads these values from .env and skips surface creation entirely, reusing the existing retriever.
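The create-once, reuse-afterwards flow can be sketched as follows. This is not the repository's code: `resolveSurface` and the injected `createSurface` callback are hypothetical names standing in for whatever SDK calls create the surface, load the records, and issue the agent key. Only the two environment variable names come from this tutorial.

```typescript
interface SurfaceCredentials {
  surfaceId: string;
  agentKey: string;
}

// `createSurface` is a hypothetical stand-in for the SDK calls that create
// the surface, load records, and issue an agent key.
async function resolveSurface(
  env: Record<string, string | undefined>,
  createSurface: () => Promise<SurfaceCredentials>
): Promise<SurfaceCredentials & { created: boolean }> {
  const surfaceId = env["CTX_SURFACE_ID"];
  const agentKey = env["MCP_AGENT_KEY"];

  // Both values present: skip creation and reuse the existing surface.
  if (surfaceId && agentKey) {
    return { surfaceId, agentKey, created: false };
  }

  // First run: create the surface and print the values to copy into .env.
  const creds = await createSurface();
  console.log(`CTX_SURFACE_ID=${creds.surfaceId}`);
  console.log(`MCP_AGENT_KEY=${creds.agentKey}`);
  return { ...creds, created: true };
}
```

In a Node.js backend you would typically call this with `process.env` as the first argument. Passing the environment in explicitly keeps the function easy to test.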

#Discovering and calling MCP tools

At agent startup, the LangGraph server fetches all available tools for the surface and wraps each one as a LangGraph DynamicStructuredTool:
Here's what's happening step by step:
  1. listTools() fetches the current tool list from the MCP server. The tool count and names depend entirely on the entity schema; adding an entity to the schema automatically adds new tools.
  2. buildJsonSchemaToZod() converts each tool's JSON Schema parameters into a Zod schema so LangGraph can validate inputs and generate the tool signature for the LLM.
  3. cs.callTool() sends a JSON-RPC request to the MCP server and returns the structured result, which extractMcpText() unwraps to a plain string for the agent.
The agent now has a set of tools it discovered at runtime.
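To illustrate step 3, here is a small sketch of what a call looks like on the wire. MCP tool calls are JSON-RPC 2.0 requests using the `tools/call` method, and results carry a `content` array of typed parts. The `extractMcpText` body below is one plausible way to unwrap such a result, not the repository's implementation; `buildToolCallRequest`, the `client_id` argument, and the sample values are hypothetical.

```typescript
// JSON-RPC 2.0 request for MCP's `tools/call` method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Shape of an MCP tool result: a list of content parts, some of them text.
interface McpToolResult {
  content: Array<{ type: string; text?: string }>;
  isError?: boolean;
}

// Hypothetical helper that builds the request body for a tool call.
function buildToolCallRequest(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Illustrative unwrapping: join all text parts into one plain string.
function extractMcpText(result: McpToolResult): string {
  const text = result.content
    .filter((part) => part.type === "text" && typeof part.text === "string")
    .map((part) => part.text as string)
    .join("\n");
  return result.isError ? `Tool error: ${text}` : text;
}

// Made-up argument name and value, for illustration only.
const request = buildToolCallRequest(1, "filter_holding_by_client_id", { client_id: "C001" });
console.log(JSON.stringify(request));
```

Returning a plain string keeps the tool output easy for the LLM to read, at the cost of discarding any non-text content parts.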

#Context Retriever tools in the demo

For the wealth advisor entity schema, Redis Context Retriever generates multiple tools. Here are some examples:
| Tool | What it queries |
|---|---|
| filter_holding_by_client_id | All portfolio holdings for a client |
| filter_holding_by_asset_class | Holdings by equity, bond, or real estate class |
| find_holding_by_current_value_range | Holdings above or below a value threshold |
| filter_financialgoal_by_client_id | All financial goals for a client |
| filter_financialgoal_by_type | Goals by type (retirement, education, etc.) |
| search_financialgoal_by_text | Full-text search across goals |
| filter_actionitem_by_status | Pending or completed action items |
| get_client_by_id | Client profile by primary key |
These tools are available to the agent at runtime. When a user asks a question that requires structured client data, the agent picks the right tool, calls it, and cites its source in the response.
For example, asking "What is James Morrison's portfolio allocation?" causes the agent to invoke filter_holding_by_client_id, retrieve the full holdings breakdown, and return the answer with a SOURCE: CONTEXT RETRIEVER label so the user can always see where the data came from:
Chatbot answering What is James Morrison's portfolio allocation?
Note: This is where Context Retriever has a meaningful accuracy advantage over a standard RAG approach. If the portfolio data were stored purely as embeddings and retrieved with a single vector search, the agent would get back a semantically close chunk of text, but it would have no guarantee of completeness or precision. It might miss holdings, return stale text, or conflate records from different clients.
With Context Retriever, the agent operates against structured records through typed MCP tools. It can call filter_holding_by_client_id to get every holding for a client, follow up with find_holding_by_current_value_range to narrow by value, and chain further tool calls as the question demands. The agent isn't doing a one-shot retrieval; it's navigating the data the same way a developer would query a database, just driven by the LLM's reasoning at runtime.

#Redis Agent Memory

#What is Redis Agent Memory?

Redis Agent Memory is a Redis Cloud service that gives AI agents two tiers of persistent memory:
  • Session memory: An ordered log of events for the current session, scoped by sessionId. Each event has a role (user, assistant, system), content, and optional metadata. Session memory has a configurable TTL, typically set in hours, since it only needs to live as long as the active session.
  • Long-term memory (LTM): Cross-session, persistent facts and events extracted from conversations. Backed by vector search so agents can retrieve semantically relevant memories regardless of which session produced them. LTM also supports a configurable TTL so stale memories can expire automatically without manual cleanup.

#Memory types

| Type | What it stores | Example from the demo |
|---|---|---|
| episodic | Events with context and time | "Client expressed concerns about REIT exposure in the Feb meeting" |
| semantic | Facts, preferences, profile data | "Client has a moderate risk tolerance" |
| message | Stored conversation records | Raw dialogue segments |
Note: In this demo, all auto-extracted long-term memories are episodic type because they come from meeting conversations. Redis Agent Memory extracts them automatically in the background. You can also create LTMs manually via createLongTermMemories() at any time. This is the right approach when you run your own extraction pipeline (processing documents, emails, call transcripts, or any content outside of a live session) and want to persist the resulting memories directly:
Note: Automatic extraction and manual creation end up in the same searchable LTM store.

#How to use session memory to store a live conversation

When the user presses Play on a transcript, the frontend calls the backend once per chunk. Each chunk is a timestamped dialogue turn from the meeting. The backend formats it and stores it as a session event in Redis Agent Memory:
The session ID is generated when playback starts (playback-<transcriptId>-<timestamp>) and is unique per playback run.
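As a rough sketch of the formatting step, the helpers below build a session ID in the playback-<transcriptId>-<timestamp> form and turn a transcript chunk into an event with role, content, and metadata. The `TranscriptChunk` shape, the helper names, and the rule that maps the advisor to the `assistant` role are assumptions for illustration; the SDK call that actually stores the event is omitted.

```typescript
// Hypothetical shape of one transcript chunk sent by the frontend.
interface TranscriptChunk {
  speaker: string;
  text: string;
  timestamp: string;
}

// Event fields as described for session memory: role, content, metadata.
interface SessionEvent {
  role: "user" | "assistant" | "system";
  content: string;
  metadata?: Record<string, string>;
}

// Unique per playback run because the timestamp changes on every start.
function buildSessionId(transcriptId: string, now: number = Date.now()): string {
  return `playback-${transcriptId}-${now}`;
}

// Assumption: the advisor maps to "assistant" and everyone else to "user".
function toSessionEvent(chunk: TranscriptChunk, advisorName: string): SessionEvent {
  return {
    role: chunk.speaker === advisorName ? "assistant" : "user",
    content: `[${chunk.timestamp}] ${chunk.speaker}: ${chunk.text}`,
    metadata: { speaker: chunk.speaker, timestamp: chunk.timestamp },
  };
}

const sampleEvent = toSessionEvent(
  { speaker: "James Morrison", text: "I'm worried about my REIT exposure.", timestamp: "00:04:12" },
  "Sarah Chen"
);
console.log(buildSessionId("demo-transcript"), sampleEvent);
```

Keeping the speaker and timestamp in metadata as well as in the content string means the UI can render them without parsing the text.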
Reading the live session back is equally simple:
In the demo, the Session memory tab polls this every few seconds during playback to display events as they arrive.
Session memory tab showing live events during transcript playback

#How Long-term memory allows the agent to remember across sessions

After a transcript plays, Redis Agent Memory analyzes session events in the background and extracts durable facts as long-term memories. These are available for semantic search across all future sessions.
The Long-term memory tab searches LTMs by user, with optional filters for memory type and topics:
To see only the memories extracted from a specific meeting, filter by sessionId instead:
Long-term memory tab with episodic memory cards

#How to build a prompt using Redis Agent Memory

The utility method buildMemoryPrompt assembles a token-budgeted context string from session events plus relevant long-term memories, ready to inject directly into any LLM system prompt.
The output format is structured markdown the LLM can immediately use:
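The SDK's exact output is not reproduced here. To show the idea of token budgeting, the sketch below assembles a markdown context string locally. Everything in it is an assumption: the `buildPromptSketch` name, the section headings, the rough four-characters-per-token estimate, and the policy of admitting memories first and then the most recent session events. It is not how buildMemoryPrompt is implemented.

```typescript
interface PromptInput {
  sessionEvents: string[];    // oldest first
  longTermMemories: string[]; // most relevant first
  maxTokens: number;
}

// Rough heuristic: about four characters per token. A real tokenizer is more accurate.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function buildPromptSketch(input: PromptInput): string {
  let remaining = input.maxTokens;

  // Admit memories in relevance order until the budget runs out.
  const keptMemories: string[] = [];
  for (const memory of input.longTermMemories) {
    const cost = estimateTokens(memory);
    if (cost > remaining) break;
    keptMemories.push(memory);
    remaining -= cost;
  }

  // Walk events newest-first so the most recent turns survive the budget,
  // then restore chronological order with unshift.
  const keptEvents: string[] = [];
  for (let i = input.sessionEvents.length - 1; i >= 0; i--) {
    const event = input.sessionEvents[i] as string;
    const cost = estimateTokens(event);
    if (cost > remaining) break;
    keptEvents.unshift(event);
    remaining -= cost;
  }

  // Headings are not counted against the budget in this sketch.
  return [
    "## Long-term memories",
    ...keptMemories.map((m) => `- ${m}`),
    "",
    "## Current session",
    ...keptEvents.map((e) => `- ${e}`),
  ].join("\n");
}
```

A production version would likely reserve a share of the budget for each section so that a long list of memories cannot crowd out the live conversation.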

#How to query Redis Iris

The CopilotKit chatbot sidebar connects to a LangGraph ReAct agent that has tools from both data layers. This is the centrepiece of the demo: the agent receives a natural language question, reasons about which tools to call, calls them, and synthesizes an answer.
The same agent handles three distinct question types without any special-casing:
Redis Context Retriever-only question: "What is James Morrison's portfolio allocation?"
The agent calls a single MCP tool (filter_holding_by_client_id), gets back exact structured records, and returns a precise breakdown. Source badge: CONTEXT RETRIEVER.
Chatbot answering CONTEXT RETRIEVER question

Redis Agent Memory-only question: "What happened in this meeting?"
The agent calls getMemoryContext with the active session ID, which combines live session events with long-term memories into a single hydrated prompt. Source badge: RAM SESSION + LONG-TERM MEMORY.
Chatbot answering Redis Agent Memory question

Combined question: "What is James's current allocation and what did he say about rebalancing?"
The agent reasons that it needs both structured portfolio data (Context Retriever) and conversational context about what was discussed (RAM). It chains three tool calls: filter_holding_by_client_id, searchMemoriesBySession, and search_client_by_text. Then the agent synthesizes both results into a single answer. Source badge: CONTEXT RETRIEVER, RAM SESSION MEMORY.
Chatbot answering combined question
This is the power of having two data layers wired into one agent: structured precision from Redis Context Retriever, conversational depth from Redis Agent Memory, and the LLM reasoning over both to produce a single coherent answer. The same routing logic handles any question: listSessions → getMemoryContext for a named meeting, filter_actionitem_by_status for pending tasks, and find_holding_by_current_value_range for value-based portfolio queries.

#How the agent is wired

The LangGraph graph is a single-node ReAct agent. At startup it initializes both data clients and fetches all available tools:
With both clients initialized, createAllTools() assembles the full tool list:
The final tools array contains the 5 Redis Agent Memory tools plus however many MCP tools Context Retriever generated from the entity schema (25 for the wealth-advisor demo). The agent sees them all as equal. It doesn't know or care which layer a tool queries.

#Redis Agent Memory tools

| Tool | When the agent uses it |
|---|---|
| getMemoryContext | Primary tool: returns a full buildMemoryPrompt result for any question about an active session, with session events + LTM combined |
| searchMemories | Semantic search across all long-term memories, cross-session |
| searchMemoriesBySession | Search long-term memories scoped to a specific meeting |
| listSessions | When the user references a meeting by date or name |
| getSessionState | Session metadata: event count, owner, ID |

#The dynamic system prompt

The agent's system prompt is built at startup from the dataset config and the live MCP tool definitions. buildSystemPrompt parses entity names directly from tool names (e.g. filter_holding_by_* → Holding) and inlines every tool's name and description so the agent knows exactly what is available:
This means adding a new entity to the schema automatically updates the system prompt.
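Parsing entity names out of tool names might look like the sketch below. The regular expression and the `entitiesFromToolNames` helper are assumptions based on the naming patterns listed earlier, not the repository's buildSystemPrompt code. The sketch also assumes entity names contain no underscores, and it can only capitalize the first letter, so FinancialGoal comes back as "Financialgoal".

```typescript
// Matches the generated patterns: filter_<entity>_by_..., search_<entity>_by_text,
// get_<entity>_by_id, find_<entity>_by_..._range.
const TOOL_NAME_PATTERN = /^(?:filter|search|get|find)_([a-z0-9]+)_by_/;

function entitiesFromToolNames(toolNames: string[]): string[] {
  const entities: string[] = [];
  for (const name of toolNames) {
    const match = TOOL_NAME_PATTERN.exec(name);
    if (!match) continue; // not an auto-generated Context Retriever tool
    const raw = match[1] as string;
    const entity = raw.charAt(0).toUpperCase() + raw.slice(1);
    if (entities.indexOf(entity) === -1) entities.push(entity);
  }
  return entities;
}

console.log(
  entitiesFromToolNames([
    "filter_holding_by_client_id",
    "find_holding_by_current_value_range",
    "search_financialgoal_by_text",
    "get_client_by_id",
    "searchMemories", // memory tool: ignored by the pattern
  ])
);
```

Because the entity list is derived from the live tool list, nothing in the prompt has to be edited by hand when the schema changes.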
Note: Source attribution is not generated by the LLM. A postProcessMessages function runs after the ReAct agent completes. It inspects the graph's message history, maps each tool name to a human-readable source label (Long-term memory, Session memory, Context Retriever, etc.), and prepends the **Source:**/<tools> header to the final AI message. The frontend parses this header into a rendered badge and collapsible tools disclosure. The LLM is explicitly told in the system prompt not to add it.
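The mapping step described in the note can be sketched as follows. The label assignments, the exact header layout, and the `prependSourceHeader` name are assumptions for illustration; the repository's postProcessMessages works on the graph's message history and may format the header differently.

```typescript
// Assumed labels for the five Redis Agent Memory tools.
const MEMORY_TOOL_LABELS = new Map<string, string>([
  ["getMemoryContext", "Session memory + Long-term memory"],
  ["searchMemories", "Long-term memory"],
  ["searchMemoriesBySession", "Long-term memory"],
  ["listSessions", "Session memory"],
  ["getSessionState", "Session memory"],
]);

// Any tool that is not a memory tool is treated as an auto-generated MCP tool.
function sourceLabel(toolName: string): string {
  return MEMORY_TOOL_LABELS.get(toolName) ?? "Context Retriever";
}

// Removes duplicates while keeping first-seen order.
function unique(items: string[]): string[] {
  return Array.from(new Set(items));
}

// Prepends a source header (hypothetical format) to the final answer text.
function prependSourceHeader(answer: string, toolsCalled: string[]): string {
  if (toolsCalled.length === 0) return answer;
  const labels = unique(toolsCalled.map(sourceLabel)).join(", ");
  const tools = unique(toolsCalled).join(", ");
  return `**Source:** ${labels}\n<tools>${tools}</tools>\n\n${answer}`;
}

console.log(
  prependSourceHeader("James holds a mix of equities and bonds.", [
    "filter_holding_by_client_id",
    "searchMemoriesBySession",
  ])
);
```

Building the header from the recorded tool calls rather than asking the LLM to write it means the attribution cannot drift from what actually ran.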

#See real-time suggestions

#Why this exists

Before a client meeting, a relationship manager already has topics they want to cover, such as reviewing the retirement plan, discussing a contribution, or following up on a previous action item. In the demo, these are pre-seeded topics loaded at session start. During the live conversation, two things happen automatically:
  1. Topics get tracked without any manual effort. As the transcript plays, the system detects when a pre-seeded topic gets discussed and marks it discussed. If the client introduces something new that wasn't on the agenda, it gets added to the topic list as a new topic. If the client asks a direct question about something, it's flagged as question. By the end of the meeting, the RM has a full picture of what was covered or missed without taking a single note.
  2. The LLM acts as a live assistant. Every few transcript chunks, the suggestion pipeline fires and the LLM analyzes what was just said in context of the client's full history. It can:
    • Perform a topic recall: "James raised this same concern about REITs in December"
    • Flag a life event: "Client mentioned their spouse may retire early"
    • Detect a sentiment shift: "Client sounds anxious about market exposure"
    • Generate an agenda reminder: "Bond allocation hasn't been discussed yet"
    • Spot an action item: "Client asked for a rebalancing proposal"
    • Answer a client question using memory context from past meetings
The RM doesn't have to ask for any of this. It is generated automatically, grounded in Redis Agent Memory, so every insight is backed by actual stored conversation history and long-term memories rather than the LLM guessing.

#How the pipeline works

Every N chunks (configurable), the suggestion pipeline fires automatically:
Here's the core of the pipeline, showing how it hydrates memory context before invoking the suggestion LLM:
The LLM returns a structured JSON response with a suggestion and topic updates:
Topics follow the lifecycle described above: pre-seeded as pending → discussed when covered → question when the client asks directly → new if an unexpected topic emerges. The LLM reports these transitions in topicUpdates and the topic store merges them into the session state in real time.
The suggestion LLM is also passed all previous suggestions to prevent duplicates.
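Two pieces of this pipeline are easy to picture in code: the every-N-chunks trigger and the merge of topicUpdates into the topic list. The sketch below is an illustration under assumptions, not the repository's implementation: the `Topic` shape, the case-insensitive name matching, and both function names are hypothetical.

```typescript
type TopicStatus = "pending" | "discussed" | "question" | "new";

interface Topic {
  name: string;
  status: TopicStatus;
}

// Fires on every Nth chunk; chunkIndex is zero-based.
function shouldRunPipeline(chunkIndex: number, everyN: number): boolean {
  return (chunkIndex + 1) % everyN === 0;
}

// Merges LLM-reported updates into the topic list without mutating the input.
// Assumption: topics are matched by name, ignoring case.
function mergeTopicUpdates(topics: Topic[], updates: Topic[]): Topic[] {
  const merged = topics.map((t) => ({ ...t }));
  for (const update of updates) {
    const existing = merged.find(
      (t) => t.name.toLowerCase() === update.name.toLowerCase()
    );
    if (existing) {
      existing.status = update.status; // e.g. pending -> discussed
    } else {
      merged.push({ ...update }); // topic that was not on the agenda
    }
  }
  return merged;
}

const agenda: Topic[] = [
  { name: "Retirement plan", status: "pending" },
  { name: "Bond allocation", status: "pending" },
];
const afterChunk = mergeTopicUpdates(agenda, [
  { name: "retirement plan", status: "discussed" },
  { name: "Early retirement of spouse", status: "new" },
]);
console.log(afterChunk);
```

Returning a new array instead of mutating the existing one makes it straightforward to push each updated topic list to the UI as a fresh state snapshot.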
Suggestions tab

#Running the demo

Open http://localhost:3001. The app loads the wealth advisor configuration with participant roles and suggestion types from the backend.
  1. Select a transcript from the dropdown in the left panel and click Play
Wealth advisor agent initial state showing the meeting transcript panel, empty Suggestions tab, and the four navigation tabs
  2. Watch session memory fill in the Session memory tab as each chunk is stored; you can see the session ID, owner, and every transcript event in real time
Session memory tab showing 69 stored events for the active playback session, with timestamped messages from Sarah Chen and James Morrison
  3. Watch long-term memories appear in the Long-term memories tab after extraction runs in the background. Episodic memories (decisions, life events) and semantic memories (facts, preferences) accumulate as the transcript plays
Long-term memories tab showing 16 episodic memories extracted from the session, including portfolio plans and personal details about the client
  4. Check the Suggestions tab for live insights and detected topics. Pre-seeded topics move from pending to discussed, new topics get added, and the LLM generates suggestion cards (question answers, action items, topic recalls) every few chunks
Suggestions tab showing 7 detected topics with timestamps, a "Question answer" suggestion card about a stock buyback program, and an "Action item" suggestion at the bottom
  5. Open the chatbot sidebar (bottom-right button) and ask questions across both data layers. The agent routes each question to the right tool and shows the source and tools used
Memory assistant chatbot answering "What is James Morrison's portfolio allocation?" A source badge shows the CONTEXT RETRIEVER tool called: filter_holding_by_client_id
Here are some questions to try:
  • "What is James Morrison's portfolio allocation?"
  • "What equities does James hold?"
  • "What are James's financial goals?"
  • "Are there any pending action items?"
  • "List all holdings worth more than $500K"
  • "What happened in this meeting?"
  • "What did James say about REIT concerns?"
  • "What was discussed about Emily's education?"
  • "Summarize the Feb 26 call"
  • "What is James's current allocation and what did he say about rebalancing?"
  • "What is the retirement goal target and what has been discussed about it?"
  • "List everything you know about James"
Click reset to clear all sessions, long-term memories, and copilot stores for a clean run.

#Next steps