Stay ahead of AI agent infrastructure
If you’ve ever wondered how companies are building AI agents and agentic systems—look no further. Here, we’re breaking down the ins and outs of AI agents and how you can use them for your apps.
AI agent orchestration
AI agents get instructions from human users or other programs. Once the goal is set, they determine which steps to take, such as calling tools or databases. Frameworks like LangGraph or AutoGen abstract away the lower-level details of creating and calling agents, or you can write your own code to set up agents and wire them to the data connections and tools they need to accomplish tasks.
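The loop above can be sketched in a few lines. This is a minimal, illustrative version: the `plan_next_step` function and the two toy tools stand in for the LLM-driven planner and real tools an agent would actually use.

```python
# Minimal agent loop sketch: receive a goal, pick a tool per step,
# stop when no further step is needed. The rule-based "planner" is an
# illustrative stand-in for an LLM call.

def search_docs(query: str) -> str:
    """Toy tool: pretend to look something up."""
    return f"docs for '{query}'"

def calculator(expr: str) -> str:
    """Toy tool: evaluate simple arithmetic (never eval untrusted input)."""
    return str(eval(expr))

TOOLS = {"search_docs": search_docs, "calculator": calculator}

def plan_next_step(goal: str, history: list) -> tuple:
    """Stand-in planner; a real agent would ask an LLM to choose."""
    if not history:
        tool = "calculator" if any(c.isdigit() for c in goal) else "search_docs"
        return (tool, goal)
    return (None, None)  # one step is enough for this toy goal

def run_agent(goal: str) -> list:
    history = []
    while True:
        tool_name, tool_input = plan_next_step(goal, history)
        if tool_name is None:
            break
        result = TOOLS[tool_name](tool_input)
        history.append((tool_name, result))
    return history

print(run_agent("2 + 3"))  # [('calculator', '5')]
```

Frameworks like LangGraph formalize this same structure as a graph of nodes with state passed between them.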
AI models
Agents often call multiple models depending on the task. To optimize speed and cost, you can call smaller, faster models for simple tasks and use more advanced models when necessary.
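A model router can be as simple as a rule that sends easy prompts to a cheap model. The model names, prices, and the length-based heuristic below are illustrative assumptions; production routers often use a classifier or an LLM itself to score task difficulty.

```python
# Sketch of model routing: cheap/fast model for simple prompts,
# stronger model when the task demands it. Names and costs are made up.

MODELS = {
    "small": {"name": "small-fast-model", "cost_per_1k_tokens": 0.0002},
    "large": {"name": "large-capable-model", "cost_per_1k_tokens": 0.01},
}

def route(prompt: str, needs_reasoning: bool = False) -> str:
    """Pick a model tier for the prompt using a simple heuristic."""
    if needs_reasoning or len(prompt.split()) > 50:
        return MODELS["large"]["name"]
    return MODELS["small"]["name"]

print(route("How do I reset my password?"))               # small-fast-model
print(route("Summarize this contract", needs_reasoning=True))  # large-capable-model
```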
Semantic caching
To make responses faster and reduce inference costs, AI apps and agents can use semantic caching to store LLM results for easy reuse. This helps in use cases with redundant calls, like customer support agents where many users ask similar questions such as “How do I reset my password?”
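The core idea can be sketched as follows: store each answer with an embedding of its query, and reuse the answer when a new query's embedding is close enough. The bag-of-words "embedding" and the 0.8 threshold here are toy stand-ins; a real semantic cache uses an embedding model and a vector store.

```python
# Semantic cache sketch: reuse an LLM answer when a new query is
# semantically close to a cached one. Toy embedding, illustrative only.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: word counts (real systems use embedding models)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    def __init__(self, threshold: float = 0.8):
        self.entries = []          # list of (embedding, answer)
        self.threshold = threshold

    def get(self, query: str):
        q = embed(query)
        best = max(self.entries, key=lambda e: cosine(q, e[0]), default=None)
        if best and cosine(q, best[0]) >= self.threshold:
            return best[1]         # cache hit: skip the LLM call
        return None                # cache miss: call the LLM, then put()

    def put(self, query: str, answer: str):
        self.entries.append((embed(query), answer))

cache = SemanticCache()
cache.put("how do i reset my password", "Use the 'Forgot password' link.")
print(cache.get("how do i reset my password please"))  # cache hit
```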
Tool calls
Agents can interact with multiple tools and decide which tool is best for a particular task. They can search the internet, call other internal tools, or write queries to search databases for specific info.
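Most function-calling LLM APIs express tools as JSON-schema descriptions: the schema tells the model what each tool does and what arguments it takes, the model picks one, and your code executes it. The tool names and the example model response below are hypothetical.

```python
# Sketch of tool definitions in the JSON-schema style used by most
# function-calling APIs, plus a dispatcher for the model's choice.

tools = [
    {
        "name": "web_search",
        "description": "Search the internet for up-to-date information.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    {
        "name": "query_orders_db",
        "description": "Look up a customer's orders by customer id.",
        "parameters": {
            "type": "object",
            "properties": {"customer_id": {"type": "string"}},
            "required": ["customer_id"],
        },
    },
]

# An (illustrative) model response choosing a tool might look like this:
model_choice = {"tool": "query_orders_db", "arguments": {"customer_id": "c-42"}}

def dispatch(choice: dict, handlers: dict) -> str:
    """Run the tool the model selected with the arguments it supplied."""
    return handlers[choice["tool"]](**choice["arguments"])

handlers = {
    "query_orders_db": lambda customer_id: f"orders for {customer_id}",
    "web_search": lambda query: f"results for {query}",
}
print(dispatch(model_choice, handlers))  # orders for c-42
```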
Agent memory (short-term and long-term)
While completing tasks, agents store short-term information for the duration of the task (like user input and the results of tool calls), so it’s available for fast retrieval and can be leveraged by later steps. Long-term memory stores persistent information that can be retained and reused across multiple tasks, sessions, or interactions. Because this memory accumulates knowledge over time, it helps the agent maintain a coherent understanding of the user’s preferences, past queries, and evolving objectives across sessions.
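The distinction can be sketched with two stores. The class and method names are illustrative; in practice, short-term memory is often a keyed store with a TTL and long-term memory a persistent, searchable store.

```python
# Sketch of short- vs long-term agent memory: short-term context is
# cleared when the task ends, long-term facts persist across sessions.

class AgentMemory:
    def __init__(self):
        self.short_term = []   # working context for the current task
        self.long_term = {}    # persists across tasks and sessions

    def remember_step(self, step: str):
        self.short_term.append(step)

    def remember_fact(self, key: str, value: str):
        self.long_term[key] = value

    def end_task(self):
        self.short_term.clear()   # working context is discarded

mem = AgentMemory()
mem.remember_fact("preferred_language", "Python")
mem.remember_step("called web_search('redis vector search')")
mem.end_task()
print(mem.short_term)                       # []
print(mem.long_term["preferred_language"])  # Python
```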
Data sources
To interact with existing information, AI agents connect to one or more databases to get the info they need to make decisions and provide accurate responses. Like any other app, agents do this through APIs. They can be trained to interact intelligently with APIs to get the data required, which can include generating queries. Redis does this well with Redis Data Integration.
Embedding models
A common technique for identifying relevant info is Retrieval-Augmented Generation (RAG). For RAG, structured and unstructured data is converted into vector embeddings that capture its semantic meaning, so the most relevant pieces can be retrieved and returned to the agent.
Vector database
Vector embeddings of available knowledge bases or context are stored in databases that support vectors and vector search, a capability many databases have recently added because of its usefulness for GenAI.
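Embedding and vector search fit together as a retrieval pipeline: embed the documents, store the vectors, embed the query, and return the nearest document. In this sketch the character-trigram "embedding" and the brute-force scan are toy stand-ins for a real embedding model and an indexed vector database.

```python
# Minimal RAG retrieval sketch: embed docs, store vectors, return the
# most similar doc for a query. Toy embedding, illustrative only.
import math

def embed(text: str) -> dict:
    """Toy embedding: counts of 3-character n-grams."""
    text = text.lower()
    vec = {}
    for i in range(len(text) - 2):
        gram = text[i:i + 3]
        vec[gram] = vec.get(gram, 0) + 1
    return vec

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[g] * b.get(g, 0) for g in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    def __init__(self):
        self.docs = []   # list of (embedding, text)

    def add(self, text: str):
        self.docs.append((embed(text), text))

    def search(self, query: str) -> str:
        """Brute-force nearest neighbor; real stores use an index."""
        q = embed(query)
        return max(self.docs, key=lambda d: cosine(q, d[0]))[1]

store = VectorStore()
store.add("To reset your password, use the account settings page.")
store.add("Our refund policy allows returns within 30 days.")
print(store.search("how do I reset my password?"))
```

A production system would swap the toy pieces for an embedding model and a database with vector indexing, but the flow is the same.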
Hosting & AI chips
AI apps can be deployed into production using all the major cloud providers, on-prem solutions, and hybrid solutions. They use hardware like GPUs, TPUs, and CPUs to process tasks and meet their demand for computing capacity.
LLMOps, authorization, & dev tools
Beyond execution, you’ll need supporting frameworks and platforms to make sure your data flows properly and agents can be fixed if something goes wrong. There are also agent builder tools that let you design and build AI agents with little or no code. See demos and resources here.
Consider latency
AI agents serve various roles, from scrubbing and annotating data to answering real-time questions for users. For agents that interact with humans or streaming data, the components need to be fast; for background or asynchronous processes, slower responses may be fine. But as agentic systems get more complex, latency adds up, so many teams optimize for real-time performance from the start rather than trying to reduce latency later.
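One way latency adds up: sequential tool calls sum their delays, while independent calls can run concurrently. This sketch simulates the difference with `asyncio.sleep`; the 0.1-second delays are illustrative stand-ins for network and model latency.

```python
# Sequential vs concurrent tool calls: three independent 0.1s calls
# take ~0.3s in sequence but ~0.1s when run concurrently.
import asyncio
import time

async def call_tool(name: str, delay: float = 0.1) -> str:
    await asyncio.sleep(delay)   # simulated network/model latency
    return f"{name}: ok"

async def sequential():
    return [await call_tool("search"), await call_tool("db"), await call_tool("cache")]

async def concurrent():
    return await asyncio.gather(call_tool("search"), call_tool("db"), call_tool("cache"))

start = time.perf_counter()
asyncio.run(sequential())
seq = time.perf_counter() - start

start = time.perf_counter()
asyncio.run(concurrent())
conc = time.perf_counter() - start

print(f"sequential ~{seq:.2f}s, concurrent ~{conc:.2f}s")
```

Only independent steps can be parallelized this way; calls whose inputs depend on earlier outputs still have to run in sequence, which is why deep agent chains are hard to make real-time.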
Plan for future innovations
Innovations are constant in GenAI, and you want tools with ecosystem support. To adapt to the new tools, frameworks, and models constantly coming out, build your agent architecture so components can be swapped: upgrade to the latest models, plug in new data sources, or add tooling like logging and observability.
Build for scale
You may want to build a prototype to establish proof of concept. As you move that prototype to production, make sure your agentic system can handle messy production data, serve many users at the same time without slowdowns, and meet security requirements like authorization protocols and protection against DDoS attacks.
These are individual actors that complete a specific task using GenAI. They can decide which steps to take, call other tools, and evaluate intermediary steps before sending back their output.
These are built from multiple components or agents working together. They’re different from doing an entire end-to-end workflow by calling a single AI model once. By combining several agents or components, you get the benefits of GenAI with more control and better output.
Agents get instructions from a user or program to complete a task. They can then choose between multiple tools to decide the next step and evaluate the result before returning a high-quality response. They draw on multiple data sources to get the information required to complete the request.
Agents allow tasks to be broken down into their component pieces. Large, complex models have been impressive across a range of tests, but they still get important things wrong and sometimes hallucinate. Some tasks are easier to improve by designing a better system with multiple components, which is why companies are turning to AI agents.
See the architecture and components LlamaIndex used to build an agentic RAG system for customer support. Code snippets and notebook included.
Speak to a Redis expert and learn more about enterprise-grade Redis today.