LangCache reduces your LLM costs by caching responses and avoiding repeated API calls. When a response is served from cache, you don't pay for output tokens. Savings on input tokens are typically offset by embedding and storage costs, which is why the estimate below counts output tokens only.

Every cache hit saves the full output token cost of that response. To estimate your monthly savings with LangCache, use the following formula:

Est. monthly savings with LangCache = 
    (Monthly output token costs) × (Cache hit rate)

The more requests you serve from LangCache, the more you save, because you’re not paying to regenerate the output.

Here's an example. Suppose your monthly output token spend is $500 and LangCache serves 30% of your requests from the cache. The formula estimates $500 × 0.30 = $150 in monthly savings. The Python sketch below runs the same calculation; every figure in it is hypothetical, so substitute your own usage and provider pricing:
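```python
# Minimal sketch of the savings estimate. All figures are hypothetical;
# replace them with your own usage numbers and provider pricing.
monthly_output_tokens = 50_000_000     # output tokens generated per month
price_per_million_output = 10.00       # USD per 1M output tokens (model-dependent)
cache_hit_rate = 0.30                  # fraction of requests served from cache

monthly_output_cost = (monthly_output_tokens / 1_000_000) * price_per_million_output
estimated_monthly_savings = monthly_output_cost * cache_hit_rate

print(f"Monthly output token cost:  ${monthly_output_cost:,.2f}")
print(f"Estimated monthly savings:  ${estimated_monthly_savings:,.2f}")
# Monthly output token cost:  $500.00
# Estimated monthly savings:  $150.00
```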

Note:
The formula and numbers above provide a rough estimate of your monthly savings. Actual savings will vary depending on your usage.