Tutorial
Build a smarter real-time AI agent with Redis Iris
May 18, 2026 · 44 minute read
TL;DR: AI agents forget everything when a conversation ends, and can't query structured data without custom code. Redis Iris solves both. Redis Context Retriever turns your entity data into auto-generated MCP tools any agent can discover and call. Redis Agent Memory gives agents persistent session memory and cross-session long-term memory backed by vector search. Together they make Redis a full context engine for AI applications.

Note: This tutorial uses the code from the following git repository: https://github.com/redis-developer/redis-agent-memory-explorer
When a wealth advisor meets with a client every month, they're expected to remember what was discussed last quarter, the client's risk tolerance, their family situation, and every commitment made across a dozen previous meetings. Without a memory system, every LLM-powered assistant starts blank. The context window fills up fast, the session ends, and everything is forgotten.
This is the memory problem and it has three distinct parts:
- Conversational memory: what was said, what was decided, what the client revealed across many sessions over time
- Structured knowledge: portfolio holdings, financial goals, pending action items, and other data that belongs in a queryable store, not a chat log
- Live meeting assistance: during an active conversation, the advisor needs real-time nudges (which agenda topics haven't been covered yet, what the client said about this same issue last quarter, what action item just emerged) without manually tracking anything
In this tutorial, you'll build the Wealth Advisor Agent Memory Explorer that solves all three using two Redis Cloud capabilities: Redis Context Retriever for structured data as auto-generated MCP tools, and Redis Agent Memory for persistent conversational memory. A LangGraph ReAct agent queries both layers to answer questions with source attribution, and a real-time suggestions pipeline surfaces live insights during meeting playback that are grounded in stored memory.
#How the two data layers compare
Before diving in, here's what each layer does and when to reach for it:
| | Redis Context Retriever | Redis Agent Memory |
|---|---|---|
| What it stores | Structured business records: entities, fields, relationships (e.g. clients, holdings, goals) | Conversational history — session events, extracted facts, decisions, and sentiments |
| When to use | Questions about current facts and business records: "What are James's holdings?" | Questions about what was said, decided, or felt: "What did James say about bonds last month?" |
| How agents access it | Auto-generated MCP tools discovered at runtime: no hardcoded queries | SDK methods (buildMemoryPrompt, searchLongTermMemory) called from agent tools |
| Unique strength | Exact, complete answers: every matching record returned, no approximation | Cross-session context — connects what was said months ago to what is happening now |
| Real-time assistance | Provides ground-truth structured business data to back agent answers | Supplies conversation history and long-term memories so suggestions are grounded in what the client actually said |
| Combined | "What is James's current allocation?" → exact records | "…and what did he say about rebalancing?" → memory search: agent synthesises both in one reply |
#Prerequisites
- A Redis Cloud account with Redis Agent Memory and Redis Context Retriever enabled
- An OpenAI API key
- Docker and Docker Compose
- Node.js 18+
#Setup
Clone the repo and copy the environment template:
Fill in your .env:
Start the app:
Open http://localhost:3001. On first run, the backend creates a Redis Context Retriever surface, loads entity records from data/wealth-advisor/client-data.json, and prints CTX_SURFACE_ID and MCP_AGENT_KEY to the console. Copy those values into your .env. Subsequent runs find them set, skip surface creation, and connect directly to the existing surface.
#What you'll build
The Wealth Advisor Agent Memory Explorer is a demo where Sarah Chen, a relationship manager at Acme Bank, conducts a live meeting with her client James Morrison. Rather than integrating a real meeting API, the app simulates the live call by streaming pre-recorded transcripts chunk by chunk. The focus is entirely on the Redis side: what happens to memory and context as the conversation unfolds in real time. As the meeting plays, session events are stored in Redis Agent Memory, long-term facts are extracted automatically in the background, and structured client data is accessible via Redis Context Retriever. A LangGraph ReAct chatbot can then query both layers to answer questions with source attribution.
#Architecture
Note: The app runs as two separate processes: the API server (port 3001) handles REST routes and the suggestions pipeline; the LangGraph server (port 2024) hosts the ReAct chatbot agent. In Docker these are two containers from the same image (demo-app and demo-langgraph). Both initialize their own Redis Agent Memory and Context Retriever clients independently on startup.
#Tech stack
| Layer | Technology |
|---|---|
| Frontend | Next.js 14 (App Router), React, CopilotKit |
| Backend | Node.js, Express |
| Chatbot agent | LangGraph, LangChain, createReactAgent |
| Memory | Redis Agent Memory (Redis Cloud) |
| Structured data | Redis Context Retriever (Redis Cloud) |
| LLM | OpenAI |
#Redis Context Retriever
#What is Redis Context Retriever?
Redis Context Retriever is a Redis Cloud service that takes your structured entity data and turns it into auto-generated MCP (Model Context Protocol) tools that any agent can discover and call. You define an entity schema, load records, and Redis handles the rest, generating a full set of query tools without any custom API code.
For the wealth advisor demo, the entity schema defines four entities: Client, Holding, FinancialGoal, and ActionItem. Redis Context Retriever generates tools like filter_holding_by_asset_class, search_financialgoal_by_text, and find_holding_by_current_value_range. Each tool is self-describing and callable by the LangGraph agent at runtime.
#Why this matters
| Aspect | Traditional approach | With Redis Context Retriever |
|---|---|---|
| Accuracy | Vector search returns a semantically close chunk | Typed MCP tools query exact structured records |
| Agentic reasoning | One-shot retrieval; agent gets one answer and stops | Agent chains multiple tool calls, narrowing and enriching the answer at each step |
| Runtime discovery | Hardcode entity knowledge in the agent's system prompt | Tools are self-describing; the agent discovers them at runtime |
| No custom API code | Build and maintain a custom API for every data source | Auto-generated MCP tools, no custom API code |
| Schema flexibility | Adding a new queryable field requires code changes | Update the schema and reload records |
| Multi-surface scale | One integration per data source | One retriever (surface) per dataset; the same agent queries them all |
#Key concepts
- Retriever (Surface): A named collection tied to a data source (Redis) and an entity schema. Think of it as a queryable namespace. You can view and manage all your surfaces in the Redis Cloud Console.
- Entity schema: Defines field names, types, and descriptions. This drives tool generation.
- Admin key: Used to create and manage surfaces and issue agent keys. Generate one from the Context Retriever admin keys page in Redis Cloud and set it as CTX_ADMIN_KEY in your .env.
- Agent key: A scoped key the agent uses to call MCP tools, separate from the admin key used for retriever (surface) management.
- MCP tools: Auto-generated and self-describing. Tool names encode the query pattern: filter_<entity>_by_<field>, search_<entity>_by_text, get_<entity>_by_id, find_<entity>_by_<field>_range.
Once the wealth-advisor surface is created, the Redis Cloud Console shows the full data model with all four entities: Client, Holding, FinancialGoal, and ActionItem. This confirms that MCP tools have been generated and are available for AI agents.
#What the entity schema looks like
Before creating a surface, you define an entity schema and pass it to the SDK. In this demo the schema lives in client-data.json for convenience, but it can come from any source (e.g. a separate config file, a database, or inline code). Each entity declares field names, types, descriptions, and the Redis index type for each field. This is what Redis Context Retriever reads to auto-generate the MCP tools.
Here are the Client entity (the primary entity) and the Holding entity (a related entity linked via client_id):
Note: The redisIndices field controls which query tools are generated. A tag field produces a filter_<entity>_by_<field> tool. A numeric field produces a find_<entity>_by_<field>_range tool. A text field produces a search_<entity>_by_text tool. isKeyComponent: true produces a get_<entity>_by_id tool.
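As a rough illustration of the mapping from index types to tool names, the sketch below derives names from a small schema. The FieldDef and EntityDef shapes, the toolNamesFor helper, and the holding_id field are hypothetical, invented for this example; they are not the Context Retriever schema format or SDK, and the real service may generate more tools than this.

```typescript
// Hypothetical schema shapes for illustration only; this is not the
// actual Context Retriever schema format.
type IndexType = "tag" | "numeric" | "text";

interface FieldDef {
  name: string;
  index?: IndexType;        // Redis index type declared for the field, if any
  isKeyComponent?: boolean; // true for the primary-key field
}

interface EntityDef {
  name: string;
  fields: FieldDef[];
}

// Derive tool names using the naming patterns described in the text.
function toolNamesFor(entity: EntityDef): string[] {
  const e = entity.name.toLowerCase();
  const names: string[] = [];
  let hasTextField = false;
  for (const field of entity.fields) {
    if (field.isKeyComponent) names.push(`get_${e}_by_id`);
    if (field.index === "tag") names.push(`filter_${e}_by_${field.name}`);
    if (field.index === "numeric") names.push(`find_${e}_by_${field.name}_range`);
    if (field.index === "text") hasTextField = true;
  }
  // Assumption: one text-search tool per entity, however many text fields exist.
  if (hasTextField) names.push(`search_${e}_by_text`);
  return names;
}

const holdingEntity: EntityDef = {
  name: "Holding",
  fields: [
    { name: "holding_id", isKeyComponent: true }, // hypothetical key field
    { name: "client_id", index: "tag" },
    { name: "asset_class", index: "tag" },
    { name: "current_value", index: "numeric" },
  ],
};

const holdingTools = toolNamesFor(holdingEntity);
```

With this input, holdingTools contains get_holding_by_id, filter_holding_by_client_id, filter_holding_by_asset_class, and find_holding_by_current_value_range, which line up with the tool names used elsewhere in this tutorial.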
#What the actual records look like
The records section of client-data.json holds the sample data that will be loaded into the surface. For the demo there is one client, James Morrison, with a few holdings:
Each Holding record carries a client_id that matches the Client record, which is how filter_holding_by_client_id knows which holdings to return when the agent queries for a specific client.
#Creating a context retriever tool and loading records
Creating a surface and loading records is a one-time activity. On the first run, the backend creates the surface, generates the agent key, and prints both to the console. You copy these values into your .env as CTX_SURFACE_ID and MCP_AGENT_KEY. Every subsequent run finds them already set, skips creation entirely, and connects directly to the existing surface:
After the first run, the surface ID and agent key are printed to the console. Copy these values and set them manually in your .env:
On subsequent runs the backend reads these values from .env and skips surface creation entirely, reusing the existing retriever.
#Discovering and calling MCP tools
At agent startup, the LangGraph server fetches all available tools for the surface and wraps each one as a LangChain DynamicStructuredTool:
Here's what's happening step by step:
- listTools() fetches the current tool list from the MCP server. The tool count and names depend entirely on the entity schema; adding an entity to the schema automatically adds new tools.
- buildJsonSchemaToZod() converts each tool's JSON Schema parameters into a Zod schema so LangGraph can validate inputs and generate the tool signature for the LLM.
- cs.callTool() sends a JSON-RPC request to the MCP server and returns the structured result, which extractMcpText() unwraps to a plain string for the agent.
The agent now has a set of tools it discovered at runtime.
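To make the call step more concrete, here is a minimal sketch of the message shapes involved, following the Model Context Protocol's tools/call convention. The buildToolCall and extractText helpers are hypothetical stand-ins written for this example (the repo's own cs.callTool() and extractMcpText() may work differently), and the client_id argument name and value are assumptions.

```typescript
// JSON-RPC shapes used by the Model Context Protocol for tool calls.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

interface ContentBlock {
  type: string;
  text?: string;
}

interface ToolCallResult {
  content: ContentBlock[];
  isError?: boolean;
}

let nextRequestId = 1;

// Build the request body for one tool invocation.
function buildToolCall(toolName: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Hypothetical stand-in for extractMcpText(): join all text blocks into one string.
function extractText(result: ToolCallResult): string {
  return result.content
    .filter((block) => block.type === "text" && typeof block.text === "string")
    .map((block) => block.text as string)
    .join("\n");
}

// The argument name and client ID below are made up for the example.
const callRequest = buildToolCall("filter_holding_by_client_id", { client_id: "C001" });

// A hand-written result in the shape an MCP server returns.
const sampleResult: ToolCallResult = {
  content: [{ type: "text", text: '[{"ticker":"VTI","asset_class":"equity"}]' }],
};
const toolOutput = extractText(sampleResult);
```

Flattening the result to a plain string keeps the agent side simple: the LLM only ever sees text, whatever structure the server returned.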
#Context Retriever tools in the demo
For the wealth advisor entity schema, Redis Context Retriever generates multiple tools. Here are some examples:
| Tool | What it queries |
|---|---|
| filter_holding_by_client_id | All portfolio holdings for a client |
| filter_holding_by_asset_class | Holdings by equity, bond, or real estate class |
| find_holding_by_current_value_range | Holdings above or below a value threshold |
| filter_financialgoal_by_client_id | All financial goals for a client |
| filter_financialgoal_by_type | Goals by type (retirement, education, etc.) |
| search_financialgoal_by_text | Full-text search across goals |
| filter_actionitem_by_status | Pending or completed action items |
| get_client_by_id | Client profile by primary key |
These tools are available to the agent at runtime. When a user asks a question that requires structured client data, the agent picks the right tool, calls it, and cites its source in the response.
For example, asking "What is James Morrison's portfolio allocation?" causes the agent to invoke filter_holding_by_client_id, retrieve the full holdings breakdown, and return the answer with a SOURCE: CONTEXT RETRIEVER label so the user can always see where the data came from.
Note: This is where Context Retriever has a meaningful accuracy advantage over a standard RAG approach. If the portfolio data were stored purely as embeddings and retrieved with a single vector search, the agent would get back a semantically close chunk of text, but it would have no guarantee of completeness or precision. It might miss holdings, return stale text, or conflate records from different clients.
With Context Retriever, the agent operates against structured records through typed MCP tools. It can call filter_holding_by_client_id to get every holding for a client, follow up with find_holding_by_current_value_range to narrow by value, and chain further tool calls as the question demands. The agent isn't doing a one-shot retrieval; it's navigating the data the same way a developer would query a database, just driven by the LLM's reasoning at runtime.
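The difference between exact, chained queries and approximate retrieval can be shown with plain in-memory data. The sketch below is only an analogy: the two functions mimic what the filter and range tools do conceptually, and the records are made-up samples, not the contents of client-data.json or real Context Retriever calls.

```typescript
interface Holding {
  client_id: string;
  ticker: string;
  asset_class: string;
  current_value: number;
}

// Made-up sample records for illustration only.
const holdings: Holding[] = [
  { client_id: "C001", ticker: "VTI", asset_class: "equity", current_value: 620000 },
  { client_id: "C001", ticker: "BND", asset_class: "bond", current_value: 310000 },
  { client_id: "C001", ticker: "VNQ", asset_class: "real_estate", current_value: 150000 },
  { client_id: "C002", ticker: "VTI", asset_class: "equity", current_value: 900000 },
];

// Exact-match filter: every record for the client, nothing approximate.
function filterHoldingByClientId(rows: Holding[], clientId: string): Holding[] {
  return rows.filter((h) => h.client_id === clientId);
}

// Inclusive numeric range over current_value.
function findHoldingByValueRange(rows: Holding[], min: number, max: number): Holding[] {
  return rows.filter((h) => h.current_value >= min && h.current_value <= max);
}

// Step 1: all holdings for one client. Step 2: narrow to large positions.
const allForClient = filterHoldingByClientId(holdings, "C001");
const largePositions = findHoldingByValueRange(allForClient, 500000, Infinity);
```

Here allForClient holds exactly the three C001 records and largePositions holds only the C001 VTI position; the larger C002 position never leaks in, which is the completeness and isolation that a single similarity search cannot promise.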
#Redis Agent Memory
#What is Redis Agent Memory?
Redis Agent Memory is a Redis Cloud service that gives AI agents two tiers of persistent memory:
- Session memory: An ordered log of events for the current session, scoped by sessionId. Each event has a role (user, assistant, system), content, and optional metadata. Session memory has a configurable TTL, typically set in hours since it only needs to live as long as the active session.
- Long-term memory (LTM): Cross-session, persistent facts and events extracted from conversations. Backed by vector search so agents can retrieve semantically relevant memories regardless of which session produced them. LTM also supports a configurable TTL so stale memories can expire automatically without manual cleanup.
#Memory types
| Type | What it stores | Example from the demo |
|---|---|---|
| episodic | Events with context and time | "Client expressed concerns about REIT exposure in the Feb meeting" |
| semantic | Facts, preferences, profile data | "Client has a moderate risk tolerance" |
| message | Stored conversation records | Raw dialogue segments |
Note: In this demo, all auto-extracted long-term memories are of the episodic type because they come from meeting conversations. Redis Agent Memory extracts them automatically in the background. You can also create LTMs manually via createLongTermMemories() at any time. This is the right approach when you run your own extraction pipeline (processing documents, emails, call transcripts, or any other content outside of a live session) and want to persist the resulting memories directly:
Note: Automatic extraction and manual creation end up in the same searchable LTM store.
#How to use session memory to store a live conversation
When the user presses Play on a transcript, the frontend calls the backend once per chunk. Each chunk is a timestamped dialogue turn from the meeting. The backend formats it and stores it as a session event in Redis Agent Memory:
The session ID is generated when playback starts (playback-<transcriptId>-<timestamp>) and is unique per playback run.
Reading the live session back is equally simple:
In the demo, the Session memory tab polls this every few seconds during playback to display events as they arrive.
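As a sketch of the formatting step that happens before a chunk is stored, the code below builds a session ID in the playback-<transcriptId>-<timestamp> form and turns a transcript chunk into an event with a role, content, and metadata. The TranscriptChunk shape, the toSessionEvent helper, the role mapping (advisor as assistant, everyone else as user), and the "feb-26" transcript ID are all assumptions made for illustration; the repo's actual formatting may differ, and no SDK call is shown.

```typescript
type Role = "user" | "assistant" | "system";

// Hypothetical shape of one timestamped dialogue turn.
interface TranscriptChunk {
  speaker: string;
  text: string;
  offsetSeconds: number; // position of the turn within the recording
}

interface SessionEvent {
  role: Role;
  content: string;
  metadata: Record<string, string>;
}

// Session ID in the playback-<transcriptId>-<timestamp> form.
function makeSessionId(transcriptId: string, startedAtMs: number): string {
  return `playback-${transcriptId}-${startedAtMs}`;
}

// Assumption: the advisor's turns map to "assistant", all others to "user".
function toSessionEvent(chunk: TranscriptChunk, advisorName: string): SessionEvent {
  return {
    role: chunk.speaker === advisorName ? "assistant" : "user",
    content: `${chunk.speaker}: ${chunk.text}`,
    metadata: {
      speaker: chunk.speaker,
      offsetSeconds: String(chunk.offsetSeconds),
    },
  };
}

// A fixed timestamp keeps the example deterministic; the app would use the current time.
const sessionId = makeSessionId("feb-26", 1700000000000);
const sessionEvent = toSessionEvent(
  { speaker: "James Morrison", text: "I'm worried about our REIT exposure.", offsetSeconds: 95 },
  "Sarah Chen"
);
```

Because the timestamp is part of the ID, replaying the same transcript later produces a new session rather than appending to the old one.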

#How long-term memory allows the agent to remember across sessions
After a transcript plays, Redis Agent Memory analyzes session events in the background and extracts durable facts as long-term memories. These are available for semantic search across all future sessions.
The Long-term memory tab searches LTMs by user, with optional filters for memory type and topics:
To see only the memories extracted from a specific meeting, filter by sessionId instead:
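The filter semantics (by user, optionally narrowed by memory type, topics, or session) can be sketched against an in-memory array. This is not the Redis Agent Memory SDK: the LongTermMemory and MemoryFilter shapes, the field names, and the sample IDs are hypothetical, and the real search also ranks results semantically, which this sketch leaves out.

```typescript
type MemoryType = "episodic" | "semantic" | "message";

// Hypothetical record and filter shapes for illustration only.
interface LongTermMemory {
  ownerId: string;
  sessionId: string;
  memoryType: MemoryType;
  topics: string[];
  text: string;
}

interface MemoryFilter {
  ownerId: string;
  sessionId?: string;
  memoryType?: MemoryType;
  topics?: string[]; // match if the memory has at least one of these
}

function matchesFilter(memory: LongTermMemory, filter: MemoryFilter): boolean {
  if (memory.ownerId !== filter.ownerId) return false;
  if (filter.sessionId !== undefined && memory.sessionId !== filter.sessionId) return false;
  if (filter.memoryType !== undefined && memory.memoryType !== filter.memoryType) return false;
  if (filter.topics !== undefined) {
    const wanted = filter.topics;
    if (!memory.topics.some((topic) => wanted.indexOf(topic) !== -1)) return false;
  }
  return true;
}

const memoryStore: LongTermMemory[] = [
  { ownerId: "james", sessionId: "playback-feb-26-1", memoryType: "episodic", topics: ["reit"],
    text: "Client expressed concerns about REIT exposure in the Feb meeting" },
  { ownerId: "james", sessionId: "playback-dec-12-1", memoryType: "semantic", topics: ["risk"],
    text: "Client has a moderate risk tolerance" },
];

// All memories for the user, then only those from one meeting.
const byUser = memoryStore.filter((m) => matchesFilter(m, { ownerId: "james" }));
const byMeeting = memoryStore.filter((m) =>
  matchesFilter(m, { ownerId: "james", sessionId: "playback-feb-26-1" })
);
```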
#How to build a prompt using Redis Agent Memory
A utility method, buildMemoryPrompt, assembles a token-budgeted context string from session events plus relevant long-term memories, ready to inject directly into any LLM system prompt.
The output format is structured markdown the LLM can immediately use:
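As a rough sketch of what token-budgeted assembly can involve (not the SDK's implementation or its actual output format), the code below packs the highest-scoring memories and the most recent session events into a fixed budget. The four-characters-per-token estimate, the half-budget cap for memories, the section headings, and the assemblePrompt name are all assumptions chosen for the example.

```typescript
interface SessionEvent {
  role: string;
  content: string;
}

interface RankedMemory {
  text: string;
  score: number; // higher means more relevant
}

// Crude heuristic (assumption): roughly four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function assemblePrompt(
  events: SessionEvent[],
  memories: RankedMemory[],
  tokenBudget: number
): string {
  let used = 0;

  // Highest-scoring memories first, capped at half the budget (arbitrary split).
  const memoryLines: string[] = [];
  const ranked = memories.slice().sort((a, b) => b.score - a.score);
  for (const memory of ranked) {
    const line = `- ${memory.text}`;
    const cost = estimateTokens(line);
    if (used + cost > tokenBudget / 2) break;
    memoryLines.push(line);
    used += cost;
  }

  // Walk events from newest to oldest so the most recent turns survive,
  // stopping at the first one that does not fit.
  const eventLines: string[] = [];
  for (let i = events.length - 1; i >= 0; i--) {
    const line = `${events[i].role}: ${events[i].content}`;
    const cost = estimateTokens(line);
    if (used + cost > tokenBudget) break;
    eventLines.unshift(line); // unshift restores chronological order
    used += cost;
  }

  // Section headings are not counted against the budget in this sketch.
  return ["## Long-term memories"]
    .concat(memoryLines, ["", "## Session events"], eventLines)
    .join("\n");
}
```

Stopping at the first event that does not fit keeps the retained events a contiguous recent window, which tends to read more coherently than a transcript with gaps.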
#How to query Redis Iris
The CopilotKit chatbot sidebar connects to a LangGraph ReAct agent that has tools from both data layers. This is the centrepiece of the demo: the agent receives a natural language question, reasons about which tools to call, calls them, and synthesizes an answer.
The same agent handles three distinct question types without any special-casing:
Redis Context Retriever-only question: "What is James Morrison's portfolio allocation?"
The agent calls a single MCP tool (filter_holding_by_client_id), gets back exact structured records, and returns a precise breakdown. Source badge: CONTEXT RETRIEVER.
Redis Agent Memory-only question: "What happened in this meeting?"
The agent calls getMemoryContext with the active session ID, which combines live session events with long-term memories into a single hydrated prompt. Source badge: RAM SESSION + LONG-TERM MEMORY.
Combined question: "What is James's current allocation and what did he say about rebalancing?"
The agent reasons that it needs both structured portfolio data (Context Retriever) and conversational context about what was discussed (RAM). It chains three tool calls: filter_holding_by_client_id, searchMemoriesBySession, and search_client_by_text. Then the agent synthesizes both results into a single answer. Source badge: CONTEXT RETRIEVER, RAM SESSION MEMORY.
This is the power of having two data layers wired into one agent: structured precision from Redis Context Retriever, conversational depth from Redis Agent Memory, and the LLM reasoning over both to produce a single coherent answer. The same routing logic handles any question: listSessions → getMemoryContext for a named meeting, filter_actionitem_by_status for pending tasks, find_holding_by_current_value_range for value-based portfolio queries.
#How the agent is wired
The LangGraph graph is a single-node ReAct agent. At startup it initializes both data clients and fetches all available tools:
With both clients initialized, createAllTools() assembles the full tool list:
The final tools array contains the 5 Redis Agent Memory tools plus however many MCP tools Context Retriever generated from the entity schema (25 for the wealth-advisor demo). The agent sees them all as equal. It doesn't know or care which layer a tool queries.
#Redis Agent Memory tools
| Tool | When the agent uses it |
|---|---|
| getMemoryContext | Primary tool: returns a full buildMemoryPrompt result for any question about an active session, with session events + LTM combined |
| searchMemories | Semantic search across all long-term memories, cross-session |
| searchMemoriesBySession | Search long-term memories scoped to a specific meeting |
| listSessions | When the user references a meeting by date or name |
| getSessionState | Session metadata: event count, owner, ID |
#The dynamic system prompt
The agent's system prompt is built at startup from the dataset config and the live MCP tool definitions. buildSystemPrompt parses entity names directly from tool names (e.g. filter_holding_by_* → Holding) and inlines every tool's name and description so the agent knows exactly what is available:
This means adding a new entity to the schema automatically updates the system prompt.
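A minimal sketch of that parsing step might look like the following. It is not the repo's buildSystemPrompt: the McpToolDef shape, the regular expression, the prompt layout, and the assumption that entity names contain no underscores are all choices made for this example. Note that tool names carry the entity in lowercase, so the original casing (FinancialGoal) cannot be recovered from the name alone.

```typescript
interface McpToolDef {
  name: string;
  description: string;
}

// Group 1 captures the entity segment of the four naming patterns.
// Assumes entity names contain no underscores, as in the demo.
const TOOL_NAME_PATTERN = /^(?:filter|search|get|find)_([a-z0-9]+)_by_/;

// Collect distinct entity names in first-seen order.
function entityNamesFrom(tools: McpToolDef[]): string[] {
  const seen: Record<string, boolean> = {};
  const entities: string[] = [];
  for (const tool of tools) {
    const match = TOOL_NAME_PATTERN.exec(tool.name);
    if (match === null) continue;
    const entity = match[1];
    if (!seen[entity]) {
      seen[entity] = true;
      entities.push(entity);
    }
  }
  return entities;
}

// Inline every tool's name and description under a short entity summary.
function buildPromptSketch(tools: McpToolDef[]): string {
  const toolLines = tools.map((tool) => `- ${tool.name}: ${tool.description}`);
  return [`Entities available: ${entityNamesFrom(tools).join(", ")}`, "Tools:"]
    .concat(toolLines)
    .join("\n");
}

const discoveredTools: McpToolDef[] = [
  { name: "filter_holding_by_client_id", description: "All portfolio holdings for a client" },
  { name: "find_holding_by_current_value_range", description: "Holdings above or below a value threshold" },
  { name: "search_financialgoal_by_text", description: "Full-text search across goals" },
  { name: "get_client_by_id", description: "Client profile by primary key" },
];

const promptSection = buildPromptSketch(discoveredTools);
```

Because the prompt is derived from whatever the tool list contains at startup, nothing in this function needs to change when a new entity appears in the schema.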
Note: Source attribution is not generated by the LLM. A postProcessMessages function runs after the ReAct agent completes. It inspects the graph's message history, maps each tool name to a human-readable source label (Long-term memory, Session memory, Context Retriever, etc.), and prepends the **Source:** / <tools> header to the final AI message. The frontend parses this header into a rendered badge and collapsible tools disclosure. The LLM is explicitly told in the system prompt not to add it.
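The post-processing idea can be sketched as a pure function over a message list. Everything here is illustrative: the AgentMessage shape is a simplification rather than LangGraph's message classes, the tool-to-label mapping is an assumption based on the source badges shown earlier, and the exact header layout is hypothetical.

```typescript
// Simplified message shape; not LangGraph's actual message classes.
interface AgentMessage {
  kind: "human" | "ai" | "tool";
  content: string;
  toolName?: string; // set on tool messages only
}

// Assumed mapping for the memory tools; any other tool is treated as an MCP tool.
const MEMORY_TOOL_LABELS: Record<string, string> = {
  getMemoryContext: "Session memory + Long-term memory",
  searchMemories: "Long-term memory",
  searchMemoriesBySession: "Session memory",
  listSessions: "Session memory",
  getSessionState: "Session memory",
};

function labelFor(toolName: string): string {
  return Object.prototype.hasOwnProperty.call(MEMORY_TOOL_LABELS, toolName)
    ? MEMORY_TOOL_LABELS[toolName]
    : "Context Retriever";
}

// Prepend a source header to the final AI message, based on the tools that ran.
function addSourceHeader(messages: AgentMessage[]): AgentMessage[] {
  const toolNames: string[] = [];
  const labels: string[] = [];
  for (const message of messages) {
    if (message.kind !== "tool" || message.toolName === undefined) continue;
    if (toolNames.indexOf(message.toolName) === -1) toolNames.push(message.toolName);
    const label = labelFor(message.toolName);
    if (labels.indexOf(label) === -1) labels.push(label);
  }

  let lastAi = -1;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].kind === "ai") {
      lastAi = i;
      break;
    }
  }
  if (lastAi === -1 || toolNames.length === 0) return messages;

  // Hypothetical header layout.
  const header = `**Source:** ${labels.join(", ")}\n<tools>${toolNames.join(", ")}</tools>\n\n`;
  return messages.map((message, i): AgentMessage =>
    i === lastAi ? { kind: message.kind, content: header + message.content } : message
  );
}
```

Deriving the header from the recorded tool messages, rather than asking the LLM to write it, means the attribution reflects the tools that actually ran.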
#See real-time suggestions
#Why this exists
Before a client meeting, a relationship manager (RM) already has topics they want to cover, such as reviewing the retirement plan, discussing a contribution, or following up on a previous action item. In the demo, these are pre-seeded topics loaded at session start. During the live conversation, two things happen automatically:
- Topics get tracked without any manual effort. As the transcript plays, the system detects when a pre-seeded topic gets discussed and marks it discussed. If the client introduces something new that wasn't on the agenda, it gets added to the topic list as a new topic. If the client asks a direct question about something, it's flagged as question. By the end of the meeting, the RM has a full picture of what was covered or missed without taking a single note.
- The LLM acts as a live assistant. Every few transcript chunks, the suggestion pipeline fires and the LLM analyzes what was just said in context of the client's full history. It can:
  - Perform a topic recall: "James raised this same concern about REITs in December"
  - Flag a life event: "Client mentioned their spouse may retire early"
  - Detect a sentiment shift: "Client sounds anxious about market exposure"
  - Generate an agenda reminder: "Bond allocation hasn't been discussed yet"
  - Spot an action item: "Client asked for a rebalancing proposal"
  - Answer a client question using memory context from past meetings
The RM doesn't have to ask for any of this. It is generated automatically, grounded in Redis Agent Memory, so every insight is backed by actual stored conversation history and long-term memories rather than the LLM guessing.
#How the pipeline works
Every N chunks (configurable), the suggestion pipeline fires automatically:
Here's the core of the pipeline with how it hydrates memory context before invoking the suggestion LLM:
The LLM returns a structured JSON response with a suggestion and topic updates:
Topics follow the lifecycle described above: pre-seeded as pending → discussed when covered → question when the client asks directly → new if an unexpected topic emerges. The LLM reports these transitions in topicUpdates and the topic store merges them into the session state in real time.
The suggestion LLM is also passed all previous suggestions to prevent duplicates.
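The trigger and the topic merge can be sketched in a few lines. This is an illustration under assumptions, not the repo's topic store: the Topic shape, case-insensitive title matching, and the rules for unknown titles are choices made for the example.

```typescript
type TopicStatus = "pending" | "discussed" | "question" | "new";

interface Topic {
  title: string;
  status: TopicStatus;
}

// Fire the suggestion pipeline on every Nth chunk.
function shouldFire(chunksSeen: number, everyN: number): boolean {
  return chunksSeen > 0 && chunksSeen % everyN === 0;
}

// Merge LLM-reported topic updates into the current list without mutating it.
// Assumptions: titles match case-insensitively; an unknown title is added as
// "new" (or "question" if reported as one); a known topic never reverts to "new".
function mergeTopicUpdates(current: Topic[], updates: Topic[]): Topic[] {
  const merged: Topic[] = current.map((t) => ({ title: t.title, status: t.status }));
  for (const update of updates) {
    const key = update.title.toLowerCase();
    const existing = merged.filter((t) => t.title.toLowerCase() === key)[0];
    if (existing === undefined) {
      merged.push({
        title: update.title,
        status: update.status === "question" ? "question" : "new",
      });
    } else if (update.status !== "new") {
      existing.status = update.status;
    }
  }
  return merged;
}

const seededTopics: Topic[] = [
  { title: "Retirement plan", status: "pending" },
  { title: "Bond allocation", status: "pending" },
];

const afterUpdate = mergeTopicUpdates(seededTopics, [
  { title: "retirement plan", status: "discussed" },
  { title: "Spouse early retirement", status: "new" },
]);
```

After this merge, "Retirement plan" is discussed, "Bond allocation" is still pending (a candidate for an agenda reminder), and "Spouse early retirement" has been added as new.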

#Running the demo
Open http://localhost:3001. The app loads the wealth advisor configuration with participant roles and suggestion types from the backend.
- Select a transcript from the dropdown in the left panel and click Play

- Watch session memory fill in the Session memory tab as each chunk is stored — you can see the session ID, owner, and every transcript event in real time

- Watch long-term memories appear in the Long-term memories tab after extraction runs in the background. Episodic memories (decisions, life events) and semantic memories (facts, preferences) accumulate as the transcript plays

- Check the suggestions tab for live insights and detected topics. Pre-seeded topics move from pending to discussed, new topics get added, and the LLM generates suggestion cards (question answers, action items, topic recalls) every few chunks

- Open the chatbot sidebar (bottom-right button) and ask questions across both data layers. The agent routes each question to the right tool and shows the source and tools used

Here are some questions to try:
- "What is James Morrison's portfolio allocation?"
- "What equities does James hold?"
- "What are James's financial goals?"
- "Are there any pending action items?"
- "List all holdings worth more than $500K"
- "What happened in this meeting?"
- "What did James say about REIT concerns?"
- "What was discussed about Emily's education?"
- "Summarize the Feb 26 call"
- "What is James's current allocation and what did he say about rebalancing?"
- "What is the retirement goal target and what has been discussed about it?"
- "List everything you know about James"
Click reset to clear all sessions, long-term memories, and copilot stores for a clean run.

