{
  "id": "message_history",
  "title": "Manage LLM Message History",
  "url": "https://redis.io/docs/latest/develop/ai/redisvl/0.16.0/user_guide/message_history/",
  "summary": "",
  "content": "\n\nLarge Language Models are inherently stateless with no knowledge of previous interactions. This becomes a challenge when engaging in long conversations that rely on context. The solution is to store and retrieve conversation history with each LLM call.\n\nThis guide demonstrates how to use Redis to structure, store, and retrieve conversational message history.\n\n## Prerequisites\n\nBefore you begin, ensure you have:\n- Installed RedisVL: `pip install redisvl`\n- A running Redis instance ([Redis 8+](https://redis.io/downloads/) or [Redis Cloud](https://redis.io/cloud))\n\n## What You'll Learn\n\nBy the end of this guide, you will be able to:\n- Store and retrieve conversation messages with `MessageHistory`\n- Manage multiple users and conversations with session tags\n- Use `SemanticMessageHistory` for relevance-based context retrieval\n- Prune incorrect or unwanted messages from conversation history\n\n\n```python\nfrom redisvl.extensions.message_history import MessageHistory\n\nchat_history = MessageHistory(name='student tutor')\n```\n\nTo align with common LLM APIs, Redis stores messages with `role` and `content` fields.\nThe supported roles are \"system\", \"user\" and \"llm\".\n\nYou can store messages one at a time or all at once.\n\n\n```python\nchat_history.add_message({\"role\":\"system\", \"content\":\"You are a helpful geography tutor, giving simple and short answers to questions about European countries.\"})\nchat_history.add_messages([\n    {\"role\":\"user\", \"content\":\"What is the capital of France?\"},\n    {\"role\":\"llm\", \"content\":\"The capital is Paris.\"},\n    {\"role\":\"user\", \"content\":\"And what is the capital of Spain?\"},\n    {\"role\":\"llm\", \"content\":\"The capital is Madrid.\"},\n    {\"role\":\"user\", \"content\":\"What is the population of Great Britain?\"},\n    {\"role\":\"llm\", \"content\":\"As of 2023 the population of Great Britain is approximately 67 million people.\"},]\n    )\n```\n\nAt any 
point we can retrieve the recent history of the conversation. It is ordered by entry time, and by default `get_recent()` returns the five most recent messages.\n\n\n```python\ncontext = chat_history.get_recent()\nfor message in context:\n    print(message)\n```\n\n    {'role': 'llm', 'content': 'The capital is Paris.'}\n    {'role': 'user', 'content': 'And what is the capital of Spain?'}\n    {'role': 'llm', 'content': 'The capital is Madrid.'}\n    {'role': 'user', 'content': 'What is the population of Great Britain?'}\n    {'role': 'llm', 'content': 'As of 2023 the population of Great Britain is approximately 67 million people.'}\n\n\nIn many LLM flows, conversations progress through a series of prompt and response pairs. `MessageHistory` provides a `store()` convenience method to add these pairs efficiently.\n\n\n```python\nprompt = \"what is the size of England compared to Portugal?\"\nresponse = \"England is larger in land area than Portugal by about 15000 square miles.\"\nchat_history.store(prompt, response)\n\ncontext = chat_history.get_recent(top_k=6)\nfor message in context:\n    print(message)\n```\n\n    {'role': 'user', 'content': 'And what is the capital of Spain?'}\n    {'role': 'llm', 'content': 'The capital is Madrid.'}\n    {'role': 'user', 'content': 'What is the population of Great Britain?'}\n    {'role': 'llm', 'content': 'As of 2023 the population of Great Britain is approximately 67 million people.'}\n    {'role': 'user', 'content': 'what is the size of England compared to Portugal?'}\n    {'role': 'llm', 'content': 'England is larger in land area than Portugal by about 15000 square miles.'}\n\n\n## Managing multiple users and conversations\n\nFor applications that need to handle multiple conversations concurrently, Redis supports tagging messages to keep conversations separated.\n\n\n```python\nchat_history.add_message({\"role\":\"system\", \"content\":\"You are a helpful algebra tutor, giving simple answers to math problems.\"}, session_tag='student two')\nchat_history.add_messages([\n    {\"role\":\"user\", 
\"content\":\"What is the value of x in the equation 2x + 3 = 7?\"},\n    {\"role\":\"llm\", \"content\":\"The value of x is 2.\"},\n    {\"role\":\"user\", \"content\":\"What is the value of y in the equation 3y - 5 = 7?\"},\n    {\"role\":\"llm\", \"content\":\"The value of y is 4.\"}],\n    session_tag='student two'\n    )\n\nfor math_message in chat_history.get_recent(session_tag='student two'):\n    print(math_message)\n```\n\n    {'role': 'system', 'content': 'You are a helpful algebra tutor, giving simple answers to math problems.'}\n    {'role': 'user', 'content': 'What is the value of x in the equation 2x + 3 = 7?'}\n    {'role': 'llm', 'content': 'The value of x is 2.'}\n    {'role': 'user', 'content': 'What is the value of y in the equation 3y - 5 = 7?'}\n    {'role': 'llm', 'content': 'The value of y is 4.'}\n\n\n## Semantic message history\n\nFor longer conversations, our list of messages keeps growing. Since LLMs are stateless, we have to pass this conversation history on each subsequent call to ensure the LLM has the correct context.\n\nA typical flow looks like this:\n```\nwhile True:\n    prompt = input('enter your next question')\n    context = chat_history.get_recent()\n    response = LLM_api_call(prompt=prompt, context=context)\n    chat_history.store(prompt, response)\n```\n\nThis works, but as the context grows, so does our LLM token count, which increases latency and cost.\n\nConversation histories can be truncated, but that can mean losing relevant information that appeared early in the conversation.\n\nA better solution is to pass only the relevant conversational context on each subsequent call.\n\nFor this, RedisVL provides `SemanticMessageHistory`, which uses vector similarity search to return only the semantically relevant sections of the conversation.\n\n\n```python\nfrom redisvl.extensions.message_history import SemanticMessageHistory\nsemantic_history = 
SemanticMessageHistory(name='tutor')\n\nsemantic_history.add_messages(chat_history.get_recent(top_k=8))\n```\n\n\n```python\nprompt = \"what have I learned about the size of England?\"\nsemantic_history.set_distance_threshold(0.35)\ncontext = semantic_history.get_relevant(prompt)\nfor message in context:\n    print(message)\n```\n\n    {'role': 'user', 'content': 'what is the size of England compared to Portugal?'}\n\n\nYou can adjust how semantically similar a message must be to the prompt for it to be included in your context.\n\nSetting a distance threshold close to 0.0 will require an exact semantic match, while a distance threshold of 2.0 will include everything (the Redis COSINE distance range is [0, 2]).\n\n\n```python\nsemantic_history.set_distance_threshold(0.7)\n\nlarger_context = semantic_history.get_relevant(prompt)\nfor message in larger_context:\n    print(message)\n```\n\n    {'role': 'user', 'content': 'what is the size of England compared to Portugal?'}\n    {'role': 'llm', 'content': 'England is larger in land area than Portugal by about 15000 square miles.'}\n    {'role': 'user', 'content': 'What is the population of Great Britain?'}\n    {'role': 'llm', 'content': 'As of 2023 the population of Great Britain is approximately 67 million people.'}\n    {'role': 'user', 'content': 'And what is the capital of Spain?'}\n\n\n## Conversation control\n\nLLMs can occasionally hallucinate. When this happens, it is useful to prune the incorrect messages from the conversation history so that the bad information isn't passed along as context in later calls.\n\n\n```python\nsemantic_history.store(\n    prompt=\"what is the smallest country in Europe?\",\n    response=\"Monaco is the smallest country in Europe at 0.78 square miles.\" # Incorrect. 
Vatican City is the smallest country in Europe\n)\n\n# get the key of the incorrect message\ncontext = semantic_history.get_recent(top_k=1, raw=True)\nbad_key = context[0]['entry_id']\nsemantic_history.drop(bad_key)\n\ncorrected_context = semantic_history.get_recent()\nfor message in corrected_context:\n    print(message)\n```\n\n    {'role': 'user', 'content': 'What is the population of Great Britain?'}\n    {'role': 'llm', 'content': 'As of 2023 the population of Great Britain is approximately 67 million people.'}\n    {'role': 'user', 'content': 'what is the size of England compared to Portugal?'}\n    {'role': 'llm', 'content': 'England is larger in land area than Portugal by about 15000 square miles.'}\n    {'role': 'user', 'content': 'what is the smallest country in Europe?'}\n\n\n## Retrieving message counts\n\nTo get the total number of messages stored in a session, use the `.count()` method.  \nYou can optionally pass a `session_tag` argument to retrieve the count for a different conversation session.\n\n\n```python\nprint(f\"Total messages in the session: {chat_history.count()}\")\n```\n\n    Total messages in the session: 7\n\n\n## Next Steps\n\nNow that you understand message history management, explore these related guides:\n\n- [Cache LLM Responses](03_llmcache.ipynb) - Reduce API costs with semantic caching\n- [Route Queries with SemanticRouter](08_semantic_router.ipynb) - Classify user queries to routes\n- [Create Embeddings with Vectorizers](04_vectorizers.ipynb) - Use different embedding providers\n\n## Cleanup\n\n\n```python\nchat_history.clear()\nsemantic_history.clear()\n```\n",
  "tags": [],
  "last_updated": "2026-04-21T14:39:33+02:00"
}
