Tutorial

Build a smarter real-time AI agent with Redis Iris

June 02, 2026 | 44 minute read

Prasan Rajpurohit

William Johnston

TL;DR: AI agents forget everything when a conversation ends, and can't query structured data without custom code. Redis Iris solves both. Redis Context Retriever turns your entity data into auto-generated MCP tools any agent can discover and call. Redis Agent Memory gives agents persistent session memory and cross-session long-term memory backed by vector search. Together they make Redis a full context engine for AI applications.

Redis Agent Memory Explorer wealth advisor demo showing session memory, long-term memory, and Suggestions panels

Note: This tutorial uses the code from the following git repository:

https://github.com/redis-developer/redis-agent-memory-explorer

When a wealth advisor meets with a client every month, they're expected to remember what was discussed last quarter, the client's risk tolerance, their family situation, and every commitment made across a dozen previous meetings. Without a memory system, every LLM-powered assistant starts blank. The context window fills up fast, the session ends, and everything is forgotten.

This is the memory problem and it has three distinct parts:

Conversational memory: what was said, what was decided, what the client revealed across many sessions over time
Structured knowledge: portfolio holdings, financial goals, pending action items, and other data that belongs in a queryable store, not a chat log
Live meeting assistance: during an active conversation, the advisor needs real-time nudges (which agenda topics haven't been covered yet, what the client said about the same issue last quarter, what action item just emerged) without manually tracking anything

In this tutorial, you'll build the Wealth Advisor Agent Memory Explorer that solves all three using two Redis Cloud capabilities: Redis Context Retriever for structured data as auto-generated MCP tools, and Redis Agent Memory for persistent conversational memory. A LangGraph ReAct agent queries both layers to answer questions with source attribution, and a real-time suggestions pipeline surfaces live insights during meeting playback that are grounded in stored memory.

#How the two data layers compare

Before diving in, here's what each layer does and when to reach for it:

	Redis Context Retriever	Redis Agent Memory
What it stores	Structured business records: entities, fields, relationships (e.g. clients, holdings, goals)	Conversational history — session events, extracted facts, decisions, and sentiments
When to use	Questions about current facts and business records: "What are James's holdings?"	Questions about what was said, decided, or felt: "What did James say about bonds last month?"
How agents access it	Auto-generated MCP tools discovered at runtime: no hardcoded queries	SDK methods (`buildMemoryPrompt`, `searchLongTermMemory`) called from agent tools
Unique strength	Exact, complete answers: every matching record returned, no approximation	Cross-session context — connects what was said months ago to what is happening now
Real-time assistance	Provides ground-truth structured business data to back agent answers	Supplies conversation history and long-term memories so suggestions are grounded in what the client actually said
Combined	"What is James's current allocation?" → exact records	"…and what did he say about rebalancing?" → memory search: agent synthesises both in one reply

#Prerequisites

A Redis Cloud account with Redis Agent Memory and Redis Context Retriever enabled
An OpenAI API key
Docker and Docker Compose
Node.js 18+

#Setup

Clone the repo and copy the environment template:

Fill in your .env:

Start the app:

Open http://localhost:3001. On first run, the backend creates a Redis Context Retriever surface, loads entity records from data/wealth-advisor/client-data.json, and prints CTX_SURFACE_ID and MCP_AGENT_KEY to the console. Copy those values into your .env. Subsequent runs find them set, skip surface creation, and connect directly to the existing surface.

#What you'll build

The Wealth Advisor Agent Memory Explorer is a demo where Sarah Chen, a relationship manager at Acme Bank, conducts a live meeting with her client James Morrison. Rather than integrating a real meeting API, the app simulates the live call by streaming pre-recorded transcripts chunk by chunk. The focus is entirely on the Redis side: what happens to memory and context as the conversation unfolds in real time. As the meeting plays, session events are stored in Redis Agent Memory, long-term facts are extracted automatically in the background, and structured client data is accessible via Redis Context Retriever. A LangGraph ReAct chatbot can then query both layers to answer questions with source attribution.

#Architecture

Note: The app runs as two separate processes: the API server (port 3001) handles REST routes and the suggestions pipeline; the LangGraph server (port 2024) hosts the ReAct chatbot agent. In Docker these are two containers from the same image (demo-app and demo-langgraph). Because they are separate OS processes, each initializes its own Redis Agent Memory and Context Retriever client instances on startup — both connecting to the same Redis Cloud endpoint via shared environment variables.

#Tech stack

Layer	Technology
Frontend	Next.js 14 (App Router), React, CopilotKit
Backend	Node.js, Express
Chatbot agent	LangGraph / LangChain
Memory	Redis Agent Memory (Redis Cloud)
Structured data	Redis Context Retriever (Redis Cloud)
LLM	OpenAI

#Redis Context Retriever

#What is Redis Context Retriever?

Redis Context Retriever is a Redis Cloud service that takes your structured entity data and turns it into auto-generated MCP (Model Context Protocol) tools that any agent can discover and call. You define an entity schema, load records, and Redis handles the rest, generating a full set of query tools without any custom API code.

For the wealth advisor demo, the entity schema defines four entities: Client, Holding, FinancialGoal, and ActionItem. From these, Redis Context Retriever generates tools such as filter_holding_by_asset_class, search_financialgoal_by_text, and find_holding_by_current_value_range. Each tool is self-describing and callable by the LangGraph agent at runtime.

#Why this matters

Aspect	Traditional approach	With Redis Context Retriever
Accuracy	Vector search returns a semantically close chunk	Typed MCP tools query exact structured records
Agentic reasoning	One-shot retrieval; agent gets one answer and stops	Agent chains multiple tool calls, narrowing and enriching the answer at each step
Runtime discovery	Hardcode entity knowledge in the agent's system prompt	Tools are self-describing; the agent discovers them at runtime
No custom API code	Build and maintain a custom API for every data source	Auto-generated MCP tools, no custom API code
Multi-surface scale	One integration per data source	One retriever (surface) per dataset; the same agent queries them all

#Key concepts

Retriever (Surface): A named retrieval service that ties a Redis data source to a multi-entity schema. Context Retriever reads that schema and auto-generates MCP tools the agent can call at runtime. You can view and manage all your surfaces in the Redis Cloud Console.
Entity schema: Defines field names, types, and descriptions. This drives tool generation.
Admin key: Used to create and manage surfaces and issue agent keys. Generate one from the Context Retriever admin keys page in Redis Cloud and set it as CTX_ADMIN_KEY in your .env.
Agent key: A scoped key the agent uses to call MCP tools, separate from the admin key used for retriever (surface) management.
MCP tools: Auto-generated and self-describing. Tool names encode the query pattern: filter_<entity>_by_<field>, search_<entity>_by_text, get_<entity>_by_id, find_<entity>_by_<field>_range.

Once the wealth-advisor surface is created, the Redis Cloud Console shows the full data model with all four entities: Client, Holding, FinancialGoal, and ActionItem. This confirms that MCP tools have been generated and are available for AI agents:

Redis Cloud Console showing the wealth-advisor context retriever surface with 4 entities and 25 auto-generated MCP tools

#What the entity schema looks like

Before creating a surface, you define an entity schema and pass it to the SDK. In this demo the schema lives in client-data.json for convenience, but it can come from any source (e.g. a separate config file, a database, or inline code). Each entity declares field names, types, descriptions, and the Redis index type for each field. This is what Redis Context Retriever reads to auto-generate the MCP tools.

Here is the Client entity (the primary entity) and the Holding entity (a related entity linked via client_id):

Note: The redisIndices field controls which query tools are generated. A tag field produces a filter_<entity>_by_<field> tool. A numeric field produces a find_<entity>_by_<field>_range tool. A text field produces a search_<entity>_by_text tool. isKeyComponent: true produces a get_<entity>_by_id tool.
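As a rough illustration of that mapping, the sketch below derives the tool names the naming convention implies for one entity. The `FieldDef` shape, the `indexType` property (standing in for the redisIndices setting), and the `holding_id` field are assumptions made for this example; the real schema format is whatever client-data.json and the SDK define. It also assumes a single search tool per entity, however many text fields exist.

```typescript
// Hypothetical field shape for illustration; not the actual schema format.
type IndexType = "tag" | "numeric" | "text";

interface FieldDef {
  name: string;
  indexType?: IndexType; // stands in for the redisIndices setting
  isKeyComponent?: boolean;
}

// Derive the tool names implied by the naming convention for one entity.
function expectedToolNames(entity: string, fields: FieldDef[]): string[] {
  const e = entity.toLowerCase();
  const names: string[] = [];
  let hasText = false;
  let hasKey = false;
  for (const f of fields) {
    if (f.isKeyComponent) hasKey = true;
    if (f.indexType === "tag") names.push(`filter_${e}_by_${f.name}`);
    if (f.indexType === "numeric") names.push(`find_${e}_by_${f.name}_range`);
    if (f.indexType === "text") hasText = true;
  }
  // Assumption: one full-text tool per entity, one lookup tool per key.
  if (hasText) names.push(`search_${e}_by_text`);
  if (hasKey) names.push(`get_${e}_by_id`);
  return names;
}

const holdingTools = expectedToolNames("Holding", [
  { name: "holding_id", isKeyComponent: true }, // hypothetical key field
  { name: "client_id", indexType: "tag" },
  { name: "asset_class", indexType: "tag" },
  { name: "current_value", indexType: "numeric" },
]);
console.log(holdingTools);
```

A helper like this is only useful as a sanity check against the tool list the surface actually reports; the generated tools remain the source of truth.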

#What the actual records look like

The records section of client-data.json holds the sample data that will be loaded into the surface. For the demo there is one client, James Morrison, with a few holdings:

Each Holding record carries a client_id that matches the Client record, which is how filter_holding_by_client_id knows which holdings to return when the agent queries for a specific client.

#Creating a Context Retriever surface and loading records

Creating a surface and loading records is a one-time activity. On the first run, the backend creates the surface, generates the agent key, and prints both to the console. You copy these values into your .env as CTX_SURFACE_ID and MCP_AGENT_KEY. Every subsequent run finds them already set, skips creation entirely, and connects directly to the existing surface:

After the first run, the surface ID and agent key are printed to the console. Copy these values and set them manually in your .env:

On subsequent runs the backend reads these values from .env and skips surface creation entirely, reusing the existing retriever.
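The create-once, reuse-afterwards logic can be sketched as follows. `resolveSurface` and the `createSurface` callback are hypothetical names for this example, and the callback is synchronous only to keep the sketch short; the demo's real creation step calls the Context Retriever SDK and is presumably asynchronous.

```typescript
interface SurfaceCredentials {
  surfaceId: string;
  agentKey: string;
}

// Hypothetical helper: reuse credentials from the environment when both are
// set, otherwise run the one-time creation callback and print the new values.
function resolveSurface(
  env: Record<string, string | undefined>,
  createSurface: () => SurfaceCredentials,
): { credentials: SurfaceCredentials; created: boolean } {
  const surfaceId = env["CTX_SURFACE_ID"];
  const agentKey = env["MCP_AGENT_KEY"];
  if (surfaceId && agentKey) {
    // Both values present: skip creation and connect to the existing surface.
    return { credentials: { surfaceId, agentKey }, created: false };
  }
  const credentials = createSurface();
  // Printed so they can be copied into .env for later runs.
  console.log(`CTX_SURFACE_ID=${credentials.surfaceId}`);
  console.log(`MCP_AGENT_KEY=${credentials.agentKey}`);
  return { credentials, created: true };
}
```

In a Node.js process you would pass `process.env` as the first argument. Requiring both values before skipping creation avoids a half-configured state where a surface ID exists without a usable agent key.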

#Discovering and calling MCP tools

At agent startup, the LangGraph server fetches all available tools for the surface and wraps each one as a LangChain DynamicStructuredTool that the LangGraph agent can call:

Here's what's happening step by step:

listTools() fetches the current tool list from the MCP server. The tool count and names depend entirely on the entity schema, so adding an entity to the schema automatically adds new tools.
buildJsonSchemaToZod() converts each tool's JSON Schema parameters into a Zod schema so LangGraph can validate inputs and generate the tool signature for the LLM.
cs.callTool() sends a JSON-RPC request to the MCP server and returns the structured result, which extractMcpText() unwraps to a plain string for the agent.

The agent now has a set of tools it can discover and call at runtime.
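To make the last step more concrete, here is a minimal sketch of the request and response shapes involved. The JSON-RPC envelope and the `content` array follow the general MCP `tools/call` convention; `buildToolCallRequest` and `extractText` are hypothetical stand-ins written for this example, not the demo's `cs.callTool()` or `extractMcpText()` implementations.

```typescript
// Result shape following the MCP tools/call convention; only text parts are handled.
interface McpContentPart {
  type: string;
  text?: string;
}

interface McpToolResult {
  content: McpContentPart[];
  isError?: boolean;
}

// Build the JSON-RPC 2.0 body for one tool invocation.
function buildToolCallRequest(id: number, name: string, args: Record<string, unknown>) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name, arguments: args },
  };
}

// Hypothetical stand-in for extractMcpText(): join all text parts into one string.
function extractText(result: McpToolResult): string {
  const text = result.content
    .filter((part) => part.type === "text" && typeof part.text === "string")
    .map((part) => part.text as string)
    .join("\n");
  return result.isError ? `Tool error: ${text}` : text;
}
```

Returning a plain string keeps the tool output easy for the LLM to read, at the cost of discarding non-text parts; a production wrapper might want to handle those explicitly.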

#Context Retriever tools in the demo

For the wealth advisor entity schema, Redis Context Retriever generates multiple tools. Here are some examples:

Tool	What it queries
`filter_holding_by_client_id`	All portfolio holdings for a client
`filter_holding_by_asset_class`	Holdings by equity, bond, or real estate class
`find_holding_by_current_value_range`	Holdings above or below a value threshold
`filter_financialgoal_by_client_id`	All financial goals for a client
`filter_financialgoal_by_type`	Goals by type (retirement, education, etc.)
`search_financialgoal_by_text`	Full-text search across goals
`filter_actionitem_by_status`	Pending or completed action items
`get_client_by_id`	Client profile by primary key

These tools are available to the agent at runtime. When a user asks a question that requires structured client data, the agent picks the right tool, calls it, and cites its source in the response.

For example, asking "What is James Morrison's portfolio allocation?" causes the agent to invoke filter_holding_by_client_id, retrieve the full holdings breakdown, and return the answer with a SOURCE: CONTEXT RETRIEVER label so the user can always see where the data came from:

Chatbot answering What is James Morrison's portfolio allocation?

Note: This is where Context Retriever has a meaningful accuracy advantage over a standard RAG approach. If the portfolio data were stored purely as embeddings and retrieved with a single vector search, the agent would get back a semantically close chunk of text, but it would have no guarantee of completeness or precision. It might miss holdings, return stale text, or conflate records from different clients.

With Context Retriever, the agent operates against structured records through typed MCP tools. It can call filter_holding_by_client_id to get every holding for a client, follow up with find_holding_by_current_value_range to narrow by value, and chain further tool calls as the question demands. The agent isn't doing a one-shot retrieval; it's navigating the data the same way a developer would query a database, just driven by the LLM's reasoning at runtime.
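Conceptually, the difference is that each tool behaves like an exact predicate over records rather than a similarity lookup. The sketch below is an in-memory analogue only: the records are placeholder values invented for the example (not the demo's data), and the two functions merely mimic what filter_holding_by_client_id and find_holding_by_current_value_range do on the server side.

```typescript
// Placeholder records for illustration only; not the demo's actual data.
interface Holding {
  client_id: string;
  asset_class: string;
  current_value: number;
}

const sampleHoldings: Holding[] = [
  { client_id: "client-a", asset_class: "equity", current_value: 600000 },
  { client_id: "client-a", asset_class: "bond", current_value: 250000 },
  { client_id: "client-b", asset_class: "equity", current_value: 900000 },
];

// Analogue of filter_holding_by_client_id: exact match, every matching record returned.
function filterByClientId(rows: Holding[], clientId: string): Holding[] {
  return rows.filter((h) => h.client_id === clientId);
}

// Analogue of find_holding_by_current_value_range: inclusive numeric bounds.
function findByValueRange(rows: Holding[], min: number, max: number = Infinity): Holding[] {
  return rows.filter((h) => h.current_value >= min && h.current_value <= max);
}

// Narrowing in two steps, the way an agent might chain two tool calls.
const largeForClientA = findByValueRange(filterByClientId(sampleHoldings, "client-a"), 500000);
console.log(largeForClientA.length); // 1
```

Because both steps are exact, a record for another client can never leak into the result, which is the completeness and precision guarantee a single vector search cannot give.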

#Redis Agent Memory

#What is Redis Agent Memory?

Redis Agent Memory is a Redis Cloud service that gives AI agents two tiers of persistent memory:

Session memory: An ordered log of events for the current session, scoped by sessionId. Each event has a role (user, assistant, system), content, and optional metadata. Session memory has a configurable TTL, typically set in hours, since it only needs to live as long as the active session. This is what lets the agent handle follow-up questions and maintain context across multiple turns in the same conversation.
Long-term memory (LTM): Cross-session, persistent facts and events extracted from conversations. Backed by vector search so agents can retrieve semantically relevant memories regardless of which session produced them. LTM also supports a configurable TTL so stale memories can expire automatically without manual cleanup. This is what prevents the agent from starting blank at the beginning of every new session.

#Memory types

Type	What it stores	Example from the demo
`episodic`	Events with context and time	"Client expressed concerns about REIT exposure in the Feb meeting"
`semantic`	Facts, preferences, profile data	"Client has a moderate risk tolerance"
`message`	Stored conversation records	Raw dialogue segments

Note: In this demo, all auto-extracted long-term memories are of the episodic type because they come from meeting conversations. Redis Agent Memory extracts them automatically in the background. You can also create LTMs manually via createLongTermMemories() at any time. This is the right approach when you run your own extraction pipeline, for example one that processes documents, emails, call transcripts, or any other content outside of a live session, and want to persist the resulting memories directly:

Note: Automatic extraction and manual creation end up in the same searchable LTM store.

#How to use session memory to store a live conversation

When the user presses Play on a transcript, the frontend calls the backend once per chunk. Each chunk is a timestamped dialogue turn from the meeting. The backend formats it and stores it as a session event in Redis Agent Memory:

The session ID is generated when playback starts (playback-<transcriptId>-<timestamp>) and is unique per playback run.
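A minimal sketch of those two steps, generating the session ID and shaping a chunk into an event, might look like this. The `TranscriptChunk` fields, the use of an epoch-millisecond timestamp in the ID, and the choice of the `user` role for every turn are assumptions for this example; only the `playback-<transcriptId>-<timestamp>` pattern and the role/content/metadata event fields come from the description above.

```typescript
type Role = "user" | "assistant" | "system";

// Hypothetical chunk shape for illustration.
interface TranscriptChunk {
  speaker: string;
  text: string;
  timestamp: string;
}

interface SessionEvent {
  role: Role;
  content: string;
  metadata?: Record<string, string>;
}

// One ID per playback run; epoch milliseconds are an assumed timestamp format.
function makeSessionId(transcriptId: string, now: number = Date.now()): string {
  return `playback-${transcriptId}-${now}`;
}

// Assumed formatting: prefix the speaker so the content stays readable on its own.
function toSessionEvent(chunk: TranscriptChunk): SessionEvent {
  return {
    role: "user",
    content: `${chunk.speaker}: ${chunk.text}`,
    metadata: { speaker: chunk.speaker, timestamp: chunk.timestamp },
  };
}
```

Keeping the speaker in both the content and the metadata is a design choice for the sketch: the content stays self-explanatory in a prompt, while the metadata remains available for filtering.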

Reading the live session back is equally simple:

In the demo, the Session memory tab polls this every few seconds during playback to display events as they arrive.

Session memory tab showing live events during transcript playback

#How long-term memory lets the agent remember across sessions

After a transcript plays, Redis Agent Memory analyzes session events in the background and extracts durable facts as long-term memories. These are available for semantic search across all future sessions.

The Long-term memory tab searches LTMs by user, with optional filters for memory type and topics:

To see only the memories extracted from a specific meeting, filter by sessionId instead:

Long-term memory tab with episodic memory cards

#How to build a prompt using Redis Agent Memory

A utility method, buildMemoryPrompt, assembles a token-budgeted context string from session events plus relevant long-term memories, ready to inject directly into any LLM system prompt.

The output format is structured markdown the LLM can immediately use:
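The exact layout and budgeting strategy are up to the SDK. Purely to illustrate what "token-budgeted" means, the sketch below splits a budget between recent session events and long-term memories and renders the result as markdown. The function names, the four-characters-per-token estimate, the half-and-half split, and the section headings are all assumptions for this example, not the behavior of buildMemoryPrompt.

```typescript
interface MemoryItem {
  text: string;
}

// Crude estimate (about 4 characters per token); a real tokenizer would differ.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep items in the given order until the next one would exceed the budget.
function takeWithinBudget(items: MemoryItem[], budget: number): { kept: MemoryItem[]; used: number } {
  const kept: MemoryItem[] = [];
  let used = 0;
  for (const item of items) {
    const cost = estimateTokens(item.text);
    if (used + cost > budget) break;
    kept.push(item);
    used += cost;
  }
  return { kept, used };
}

function buildPromptSketch(longTerm: MemoryItem[], sessionEvents: MemoryItem[], tokenBudget: number): string {
  // Newest events get priority for up to half the budget, then go back to chronological order.
  const recent = takeWithinBudget(sessionEvents.slice().reverse(), Math.floor(tokenBudget / 2));
  const events = recent.kept.reverse();
  // Long-term memories receive whatever budget is left.
  const memories = takeWithinBudget(longTerm, tokenBudget - recent.used).kept;
  return [
    "## Long-term memories",
    ...memories.map((m) => `- ${m.text}`),
    "## Current session",
    ...events.map((e) => `- ${e.text}`),
  ].join("\n");
}
```

Whatever the real strategy is, the point of a budget is that the prompt stays bounded no matter how long the meeting runs or how many memories accumulate.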

#How to query Redis Iris

The CopilotKit chatbot sidebar connects to a LangGraph ReAct agent that has tools from both data layers. This is the centrepiece of the demo: the agent receives a natural language question, reasons about which tools to call, calls them, and synthesizes an answer.

The same agent handles three distinct question types without any special-casing:

Redis Context Retriever-only question: "What is James Morrison's portfolio allocation?"

The agent calls a single MCP tool (filter_holding_by_client_id), gets back exact structured records, and returns a precise breakdown. Source badge: CONTEXT RETRIEVER.

Chatbot answering CONTEXT RETRIEVER question

Redis Agent Memory-only question: "What happened in this meeting?"

The agent calls getMemoryContext with the active session ID, which combines live session events with long-term memories into a single hydrated prompt. Source badge: RAM SESSION + LONG-TERM MEMORY, where RAM stands for Redis Agent Memory.

Chatbot answering Redis Agent Memory question

Combined question: "What is James's current allocation and what did he say about rebalancing?"

The agent reasons that it needs both structured portfolio data from Redis Context Retriever and conversational context about what was discussed from Redis Agent Memory (RAM). It chains three tool calls: filter_holding_by_client_id, searchMemoriesBySession, and search_client_by_text. The agent then synthesizes the results from both layers into a single answer. Source badge: CONTEXT RETRIEVER, RAM SESSION MEMORY.

This is the power of having two data layers wired into one agent: structured precision from Redis Context Retriever, conversational depth from Redis Agent Memory, and the LLM reasoning over both to produce a single coherent answer. The same routing logic handles any question: listSessions → getMemoryContext for a named meeting, filter_actionitem_by_status for pending tasks, find_holding_by_current_value_range for value-based portfolio queries, and so on.

#How the agent is wired

The LangGraph graph is a single-node ReAct agent. At startup it initializes both data clients and fetches all available tools:

With both clients initialized, createAllTools() assembles the full tool list:

The final tools array contains the 5 Redis Agent Memory tools plus however many MCP tools Context Retriever generated from the entity schema (25 for the wealth-advisor demo). The agent sees them all as equal. It doesn't know or care which layer a tool queries.

#Redis Agent Memory tools

Tool	When the agent uses it
`getMemoryContext`	Primary tool: returns a full `buildMemoryPrompt` result for any question about an active session, with session events + LTM combined
`searchMemories`	Semantic search across all long-term memories, cross-session
`searchMemoriesBySession`	Search long-term memories scoped to a specific meeting
`listSessions`	When the user references a meeting by date or name
`getSessionState`	Session metadata: `event count`, `owner`, `ID`

#The dynamic system prompt

The agent's system prompt is built at startup from the dataset config and the live MCP tool definitions. buildSystemPrompt parses entity names directly from tool names (e.g. filter_holding_by_* → Holding) and inlines every tool's name and description so the agent knows exactly what is available:

Because the prompt is derived from the live tool list, adding a new entity to the schema updates the system prompt automatically the next time the agent starts.
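One way to derive entity names from tool names is a single regular expression over the four naming patterns. This is a sketch under assumptions: `entitiesFromTools` and `buildToolSection` are hypothetical helpers, not the demo's buildSystemPrompt, and the pattern assumes entity names contain no underscores. Tool names are lowercase, so the original casing (FinancialGoal, ActionItem) cannot be recovered from the name alone.

```typescript
interface ToolInfo {
  name: string;
  description: string;
}

// Matches filter_<entity>_by_*, search_<entity>_by_*, get_<entity>_by_*, find_<entity>_by_*.
const TOOL_NAME_PATTERN = /^(?:filter|search|get|find)_([a-z0-9]+)_by_/;

// Collect the distinct lowercase entity names found in the tool list.
function entitiesFromTools(tools: ToolInfo[]): string[] {
  const entities: string[] = [];
  for (const tool of tools) {
    const entity = TOOL_NAME_PATTERN.exec(tool.name)?.[1];
    if (entity && entities.indexOf(entity) === -1) entities.push(entity);
  }
  return entities.sort();
}

// Inline every tool's name and description as one prompt line each.
function buildToolSection(tools: ToolInfo[]): string {
  return tools.map((t) => `- ${t.name}: ${t.description}`).join("\n");
}
```

Tools that do not follow the pattern, such as the Redis Agent Memory tools, are simply skipped by the entity extraction but still appear in the inlined tool section.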

Note: Source attribution is not generated by the LLM. A postProcessMessages function runs after the ReAct agent completes. It inspects the graph's message history, maps each tool name to a human-readable source label (Long-term memory, Session memory, Context Retriever, etc.), and prepends the **Source:**/<tools> header to the final AI message. The frontend parses this header into a rendered badge and collapsible tools disclosure. The LLM is explicitly told in the system prompt not to add it.
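The idea of deterministic attribution can be sketched in a few lines. Everything here is an assumption made for illustration: the label mapping, the fallback to "Context Retriever" for any tool that is not a memory tool, and the exact header layout are hypothetical, and the demo's postProcessMessages works on the graph's message history rather than a plain list of tool names.

```typescript
interface ToolCallRecord {
  toolName: string;
}

// Hypothetical label mapping; the demo's actual labels may differ.
const MEMORY_TOOL_LABELS = new Map<string, string>([
  ["getMemoryContext", "Session memory + Long-term memory"],
  ["searchMemories", "Long-term memory"],
  ["searchMemoriesBySession", "Long-term memory"],
  ["listSessions", "Session memory"],
  ["getSessionState", "Session memory"],
]);

// Assumption: any tool that is not a memory tool is an auto-generated MCP tool.
function sourceLabelFor(toolName: string): string {
  return MEMORY_TOOL_LABELS.get(toolName) ?? "Context Retriever";
}

// Prepend a header listing distinct sources and tools, in first-use order.
function withSourceHeader(answer: string, calls: ToolCallRecord[]): string {
  if (calls.length === 0) return answer;
  const labels: string[] = [];
  const tools: string[] = [];
  for (const call of calls) {
    const label = sourceLabelFor(call.toolName);
    if (labels.indexOf(label) === -1) labels.push(label);
    if (tools.indexOf(call.toolName) === -1) tools.push(call.toolName);
  }
  return `**Source:** ${labels.join(", ")}\n<tools>${tools.join(", ")}</tools>\n\n${answer}`;
}
```

Deriving the header from the recorded tool calls means the attribution reflects what was actually executed, which an LLM-written citation cannot guarantee.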

#See real-time suggestions

#Why this exists

Before a client meeting, a relationship manager (RM) already has topics they want to cover, such as reviewing the retirement plan, discussing a contribution, or following up on a previous action item. In the demo, these are pre-seeded topics loaded at session start. During the live conversation, two things happen automatically:

Topics get tracked without any manual effort. As the transcript plays, the system detects when a pre-seeded topic gets discussed and marks it discussed. If the client introduces something new that wasn't on the agenda, it gets added to the topic list as a new topic. If the client asks a direct question about something, it's flagged as question. By the end of the meeting, the RM has a full picture of what was covered or missed without taking a single note.
The LLM acts as a live assistant. Every few transcript chunks, the suggestion pipeline fires and the LLM analyzes what was just said in context of the client's full history. It can:
- Perform a topic recall: "James raised this same concern about REITs in December"
- Flag a life event: "Client mentioned their spouse may retire early"
- Detect a sentiment shift: "Client sounds anxious about market exposure"
- Generate an agenda reminder: "Bond allocation hasn't been discussed yet"
- Spot an action item: "Client asked for a rebalancing proposal"
- Answer a client question using memory context from past meetings

The RM doesn't have to ask for any of this. It is generated automatically, grounded in Redis Agent Memory, so every insight is backed by actual stored conversation history and long-term memories rather than the LLM guessing.

#How the pipeline works

Every N chunks (configurable), the suggestion pipeline fires automatically:

Here's the core of the pipeline, showing how it hydrates memory context before invoking the suggestion LLM:

The LLM returns a structured JSON response with a suggestion and topic updates:

Topics follow the lifecycle described above: pre-seeded as pending → discussed when covered → question when the client asks directly → new if an unexpected topic emerges. The LLM reports these transitions in topicUpdates and the topic store merges them into the session state in real time.

The suggestion LLM is also passed all previous suggestions to prevent duplicates.
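The trigger and the topic merge can be sketched as two small pure functions. The names `shouldFire` and `mergeTopicUpdates`, the zero-based chunk index, and the case-insensitive name matching are assumptions for this example; only the four statuses and the idea of merging topicUpdates into session state come from the description above.

```typescript
type TopicStatus = "pending" | "discussed" | "question" | "new";

interface Topic {
  name: string;
  status: TopicStatus;
}

// Fire after every n-th chunk; chunkIndex is zero-based.
function shouldFire(chunkIndex: number, n: number): boolean {
  return n > 0 && (chunkIndex + 1) % n === 0;
}

// Merge LLM-reported transitions into the topic list without mutating the input.
function mergeTopicUpdates(topics: Topic[], updates: Topic[]): Topic[] {
  const merged = topics.map((t) => ({ ...t }));
  for (const update of updates) {
    const existing = merged.find((t) => t.name.toLowerCase() === update.name.toLowerCase());
    if (existing) {
      existing.status = update.status; // known topic: apply the transition
    } else {
      merged.push({ name: update.name, status: update.status }); // unseen topic: add it
    }
  }
  return merged;
}
```

Returning a new array rather than mutating the existing one makes it straightforward to push the merged state to a store or UI in one step.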

#Running the demo

Open http://localhost:3001. The app loads the wealth advisor configuration with participant roles and suggestion types from the backend.

Select a transcript from the dropdown in the left panel and click Play

Wealth advisor agent initial state showing the meeting transcript panel, empty Suggestions tab, and the four navigation tabs

Watch session memory fill in the Session memory tab as each chunk is stored — you can see the session ID, owner, and every transcript event in real time

Session memory tab showing 69 stored events for the active playback session, with timestamped messages from Sarah Chen and James Morrison

Watch long-term memories appear in the Long-term memories tab after extraction runs in the background. In this demo the extracted memories are episodic (decisions, life events, stated concerns), and they accumulate as the transcript plays

Long-term memories tab showing 16 episodic memories extracted from the session, including portfolio plans and personal details about the client

Check the Suggestions tab for live insights and detected topics. Pre-seeded topics move from pending to discussed, new topics get added, and the LLM generates suggestion cards (question answers, action items, topic recalls) every few chunks

Suggestions tab showing 7 detected topics with timestamps, a "Question answer" suggestion card about a stock buyback program, and an "Action item" suggestion at the bottom

Open the chatbot sidebar (bottom-right button) and ask questions across both data layers. The agent routes each question to the right tool and shows the source and tools used

Memory assistant chatbot answering "What is James Morrison's portfolio allocation?" A source badge shows the CONTEXT RETRIEVER tool called: filter_holding_by_client_id

Here are some questions to try:

"What is James Morrison's portfolio allocation?"
"What equities does James hold?"
"What are James's financial goals?"
"Are there any pending action items?"
"List all holdings worth more than $500K"
"What happened in this meeting?"
"What did James say about REIT concerns?"
"What was discussed about Emily's education?"
"Summarize the Feb 26 call"
"What is James's current allocation and what did he say about rebalancing?"
"What is the retirement goal target and what has been discussed about it?"
"List everything you know about James"

Click Reset to clear all sessions, long-term memories, and copilot stores for a clean run.