# Context poisoning: how bad information breaks agent reasoning

**Tagline:** News & Media | **Authors:** Jim Allen Wallace | **Categories:** Tech DE | **Published:** 2026-05-17 | **Updated:** 2026-05-21

Your agent confidently tells a customer their order shipped two days ago. It didn't. The order was canceled last week, but a stale cache entry surfaced in the agent's context window, and the agent treated that outdated status as fact. Worse, it then wrote the "confirmed shipment" into its memory, so every future interaction about that order will reference the same wrong information as verified truth.

This is context poisoning, and it's one of the hardest failure modes to catch in agentic AI because the agent's reasoning looks perfectly coherent the entire time. The logic is fine. The data feeding it is not. This article explains what context poisoning is, how it spreads through agent systems, where it shows up in practice, and how [Redis Iris](https://redis.io/iris/) can help keep agent context fresher and more structured.

## **What context poisoning is (& what it is not)**

Context poisoning occurs when bad information is reused repeatedly. Once bad information enters an agent's active context window or persistent memory, the agent treats it as ground truth. Every subsequent reasoning step, plan, and action builds on that corrupted foundation.

One misleading snippet is enough. It can [spread through reasoning](https://arxiv.org/html/2605.00505v2), turning retrieval noise into stacked hallucinations that derail the task. And unlike a one-off hallucination that disappears when the conversation ends, poisoned context sticks around, referenced again across reasoning steps, and sometimes across sessions.

Context poisoning gets conflated with a few related failure modes. Here's how to tell them apart:

- **Data poisoning** is a training-time attack: an adversary corrupts the dataset before a model is deployed. Context poisoning happens at inference and leaves the model's weights untouched.

- **Prompt injection** is a runtime attack: an attacker slips malicious instructions into a prompt to override system rules, exfiltrate data, or trigger unintended tool calls. It becomes context poisoning only when the output gets cached, retrieved, or written to memory.

- **Memory poisoning** targets an agent's long-term store. It can be adversarial or accidental, and whether it sticks across sessions depends on how the system manages that memory.

These failures often overlap in production, but the controls you reach for are different.

## **Accidental & adversarial poisoning in agent systems**

Poisoned context shows up two ways. Sometimes it's accidental, the result of a stale cache or a sloppy retriever pulling in the wrong document. Sometimes it's adversarial, with an attacker deliberately seeding bad content into a source the agent will retrieve. Both exploit the same structural weakness: LLMs don't cleanly separate instructions from external data, so anything pulled into the context window can be read as a command.

### **Accidental poisoning in production**

Most poisoning in production isn't an attack. It's ordinary engineering problems with outsized consequences. Three patterns come up again and again:

- **Stale cached data:** A retrieval-augmented generation (RAG) system pulls pre-update content from a cache without time-to-live (TTL) validation and serves it as authoritative context. The agent has no way to know the pricing page was updated an hour ago, or that the product spec changed last Tuesday. The answer looks confident. It's also wrong.

- **Semantic drift in retrieval:** Vector search ranks documents by embedding proximity, but proximity isn't the same as topical precision, and it isn't an access boundary. Picture an HR agent answering a routine policy question that pulls in a document containing a specific employee's personally identifiable information (PII) along the way, simply because the embedding sat close enough to the query. No adversarial prompt, just an over-eager retriever.

- **Accumulated reasoning errors:** These build up over multi-turn interactions. Earlier flawed conclusions stay in the context window and anchor every reasoning step that follows. Over time, [accumulated context](https://arxiv.org/html/2605.05846v1) can quietly steer the agent off course, accidental self-poisoning through error propagation.

<!-- CTA block omitted -->

None of these require a bad actor. They require only that the data pipeline feeding the agent be slightly out of date, slightly too permissive, or slightly too long-running.

### **Adversarial poisoning in production**

Adversarial poisoning exploits the same weaknesses, just on purpose. The patterns look similar, but the intent and the payloads change the picture:

- **Indirect prompt injection:** Attackers embed malicious instructions inside external content that gets retrieved into the agent's context during normal operation. LLM web agents [mix trusted and untrusted](https://arxiv.org/html/2604.27202v1) data, and they can act on it directly by navigating pages, calling tools, and making decisions. That means malicious content can do more than bias an answer. Security researchers have documented indirect prompt injection payloads designed to bypass AI-based content review systems, a sign that attackers are reaching for increasingly sophisticated techniques.

- **RAG knowledge base poisoning:** Attackers inject adversarially crafted text into the documents a RAG system retrieves from, so that targeted queries pull poisoned passages into the LLM's context. In one study, the PoisonedRAG attack [reported 90% success](https://arxiv.org/html/2604.23338v1) when injecting five malicious texts per target question into a knowledge database with millions of entries, without touching the underlying model.

- **Tool & Model Context Protocol (MCP) poisoning:** Instructions hidden inside tool descriptions and metadata. The user never sees the malicious instruction; the documentation [is sent directly](https://thenewstack.io/building-with-mcp-mind-the-security-gaps/) to the LLM the moment a connection is made.

The common thread across both categories is that the agent can't reliably tell the difference between data and instructions once something is inside the context window. That's why the fix has to happen further upstream, in what gets retrieved, what gets cached, and what gets remembered.

## **How poisoned context propagates through an agent's reasoning loop**

One bad input doesn't stay one bad input. Agents loop, planning, acting, reflecting, improving, and the output of each pass becomes the input of the next. That architecture is what turns a single corrupt datum into a system-wide failure. Bad context gets in, then it gets reused, amplified, and written back into places future reasoning will draw from. Four propagation patterns show up most often.

### **Tool output cascades**

A poisoned tool output doesn't just affect the step that produced it. It feeds the next reasoning step, which feeds the next tool call, and so on. In complex tool dependency graphs, [malicious instructions propagate](https://arxiv.org/html/2603.07496v2) across tool outputs rather than getting contained at the source. Documented failure modes include production agents that have [deleted critical data](https://arxiv.org/html/2603.06847v1) and made tool calls with hallucinated parameters.

### **Chain-of-thought corruption**

Poisoned premises injected into the chain-of-thought trace cause every reasoning step that follows to build on a corrupted logical foundation, while still looking superficially coherent. And you can't always trust the trace to tell you what went wrong. An [LLM's stated reasoning](https://arxiv.org/html/2505.18889v4) doesn't accurately reflect its true decision-making process. It functions more like post-hoc rationalization. Prompt hardening, the practice of adding defensive instructions to a system prompt to make the model refuse or ignore poisoned input, doesn't always hold up either. In one study, [prompt hardening failed](https://arxiv.org/html/2605.09822v1): models acknowledged the defensive instructions in their chain-of-thought and then accepted the poisoned data anyway. When prompt-level defenses show limits like this, the practical response is to push controls further upstream in the data pipeline.

### **Memory contamination across sessions**

This is where propagation stops being a single-session problem. Memory poisoning writes corrupted information into persistent stores, so the bad data outlives the conversation that produced it. In evolving memory systems, [errors are cumulative](https://arxiv.org/html/2603.11768v1), creating a compounding failure loop across three interfaces: input ingestion, memory write, and future retrieval. Bad writes tend to produce [three drift patterns](https://arxiv.org/html/2603.11768v1): semantic drift through repeated summarization, procedural drift through reinforcement of suboptimal workflows, and hallucination internalization, where the agent eventually treats hallucinated content as validated knowledge.

### **Multi-agent divergence**

The picture gets worse in multi-agent systems. Corrupted outputs from one agent become inputs to peer agents, and coordination breaks down fast: [world states diverge](https://arxiv.org/html/2512.08296v1), and errors cascade rather than cancel each other out. Corrupted beliefs end up replicated into structurally distinct copies across agents, and once that happens, the agents can no longer correct each other by cross-reference.

Across all four patterns, the propagation mechanics share a common shape: a small input failure gets amplified by the loop, the memory, or the coordination layer until it becomes the dominant signal. Underneath every one of these patterns is the same thing: a data pipeline that lets stale, unverified, or weakly governed content reach the agent. That's where the practical fixes live.

<!-- CTA block omitted -->

## **How Redis Iris helps keep agent context fresh & structured**

[Redis](https://redis.io/) is the real-time context engine for AI: the layer that searches, gathers, and serves the right data at the right time so agents and LLMs can act with relevance and reliability. [Redis Iris](https://redis.io/iris/) packages that context engine into a single product that sits between an agent and the data it needs, feeding it fresh, navigable context instead of stale or fragmented inputs.

### **Keeping data fresh at the source**

Stale cached data is a common accidental poisoning vector: the underlying source updates, but the agent's cache doesn't, so the agent answers from an old snapshot. The fix is to make the cache follow the source automatically. [Redis Data Integration (RDI)](https://redis.io/data-integration/) does that with Change Data Capture (CDC), streaming updates from systems of record like Postgres, MySQL, or Oracle into Redis as they happen. When the row changes upstream, the agent's view of it changes too, without cron jobs, batch exports, or hand-written sync code.

### **Schema-first retrieval**

Semantic drift and MCP poisoning share a root cause: agents pull from sources without governance, so anything in the retrievable set can land in the context window. The fix is to put a controlled interface between the agent and the underlying data. [Redis Context Retriever](https://redis.io/context-retriever/) (in preview) lets developers define a semantic model of business entities and access rules, then auto-generates MCP tools that expose only those entities to the agent. Agents authenticate with scoped keys, can only call tools they're permitted to use, and hit row-level filters enforced server-side. That cuts off both over-eager retrieval and malicious tool descriptions before they reach the model.

### **Governed memory across sessions**

Memory contamination is the hardest propagation pattern to clean up, because bad writes outlive the conversation that produced them. The fix is a memory store you can actually see into and reach into. [Redis Agent Memory](https://redis.io/agent-memory/) (in preview) keeps short-term session state and long-term durable memory in a single governed store, so memory writes can be scoped, inspected, and invalidated instead of accumulating silently across sessions. For agents whose memory histories grow large enough to make pure in-memory storage expensive, [Redis Flex](https://redis.io/flex/) tiers older data to SSD while keeping hot state in RAM.

<!-- CTA block omitted -->

## **Context poisoning is an infrastructure problem, not just a model problem**

Context poisoning starts as a data and system-design problem before it becomes a reasoning problem. Failures in how context is curated and refreshed can degrade output regardless of the model's underlying capability. Even against state-of-the-art defenses, adaptive attacks can [still find ways through](https://arxiv.org/html/2601.17548v1), so a prompt-layer-only defense isn't enough on its own.

Better models alone won't fix stale data in a cache, make a tool response trustworthy, or stop a poisoned memory from surfacing in a future session. The most practical defenses sit in the infrastructure that stores, refreshes, retrieves, and reuses data, which is where Redis fits: fresh data retrieval, structured access, and governed agent memory in one place for agent systems that need fast, reliable context.

[Try Redis Iris free](https://redis.io/try-free/?rcplan=iris), or [talk to our team](https://redis.io/meeting/) about your agent infrastructure.