{
  "id": "langcache",
  "title": "Semantic caching with LangCache on Redis Cloud",
  "url": "https://redis.io/docs/latest/operate/rc/context-engine/langcache/",
  "summary": "Store LLM responses for AI applications in Redis Cloud.",
  "tags": [
    "docs",
    "operate",
    "rc"
  ],
  "last_updated": "2026-05-18T05:44:54-07:00",
  "children": [
    {
      "id": "create-service",
      "summary": "",
      "title": "Create a LangCache service",
      "url": "https://redis.io/docs/latest/operate/rc/context-engine/langcache/create-service/"
    },
    {
      "id": "use-langcache",
      "summary": "",
      "title": "Use the LangCache API on Redis Cloud",
      "url": "https://redis.io/docs/latest/operate/rc/context-engine/langcache/use-langcache/"
    },
    {
      "id": "view-edit-cache",
      "summary": "",
      "title": "View and edit LangCache service",
      "url": "https://redis.io/docs/latest/operate/rc/context-engine/langcache/view-edit-cache/"
    },
    {
      "id": "monitor-cache",
      "summary": "",
      "title": "Monitor a LangCache service",
      "url": "https://redis.io/docs/latest/operate/rc/context-engine/langcache/monitor-cache/"
    }
  ],
  "page_type": "content",
  "content_hash": "ecbe0597a18fd02ae58da270000d43f49fc1717a2c7fd8d2b304261ec2c85584",
  "sections": [
    {
      "id": "overview",
      "title": "Overview",
      "role": "overview",
      "text": "LangCache is a semantic caching service, available as a REST API and built on the Redis vector database, that stores LLM responses for faster and cheaper retrieval. By using semantic caching, you can significantly reduce API costs and lower the average latency of your generative AI applications.\n\nFor more information about how LangCache works, see the [LangCache overview](https://redis.io/docs/latest/develop/ai/context-engine/langcache)."
    },
    {
      "id": "llm-cost-reduction-with-langcache",
      "title": "LLM cost reduction with LangCache",
      "role": "content",
      "text": "LangCache reduces your LLM costs by caching responses and avoiding repeated API calls. When a response is served from cache, you don’t pay for output tokens. Savings on input tokens are typically offset by embedding and storage costs, so the estimate below counts only output tokens.\n\nTo estimate your monthly savings with LangCache, you can use the following formula:\n\n[code example]\n\nThe more requests you serve from LangCache, the more you save, because you’re not paying to regenerate the output.\n\nHere’s an example:\n\n- Monthly LLM spend: $200\n- Percentage of output tokens in your spend: 60%\n- Cost of output tokens: $200 × 60% = $120\n- Cache hit rate: 50%\n- Estimated savings: $120 × 50% = $60/month\n\nThe formula and numbers above provide a rough estimate of your monthly savings. Actual savings will vary depending on your usage.\n\nYou can also use the [LangCache savings calculator](https://redis.io/calculator/langcache/) to estimate your annual savings with LangCache."
    },
    {
      "id": "get-started-with-langcache-on-redis-cloud",
      "title": "Get started with LangCache on Redis Cloud",
      "role": "content",
      "text": "To set up LangCache on Redis Cloud:\n\n1. [Create a database](https://redis.io/docs/latest/operate/rc/databases/create-database) on Redis Cloud.\n2. [Create a LangCache service](https://redis.io/docs/latest/operate/rc/context-engine/langcache/create-service) for your database on Redis Cloud.\n3. [Use the LangCache API](https://redis.io/docs/latest/operate/rc/context-engine/langcache/use-langcache) from your client app.\n\nAfter you set up LangCache, you can [view and edit the cache](https://redis.io/docs/latest/operate/rc/context-engine/langcache/view-edit-cache) and [monitor the cache's performance](https://redis.io/docs/latest/operate/rc/context-engine/langcache/monitor-cache).\n\nSee also our [Redis LangCache setup](https://www.youtube.com/watch?v=UOGhMZlZLko) tutorial video for advice on how to get started."
    }
  ],
  "examples": [
    {
      "id": "llm-cost-reduction-with-langcache-ex0",
      "language": "bash",
      "code": "Est. monthly savings with LangCache =\n    (Monthly output token costs) × (Cache hit rate)",
      "section_id": "llm-cost-reduction-with-langcache"
    }
  ]
}