Stay ahead of AI agent infrastructure
If you’ve ever wondered how companies are building AI agents and agentic systems—look no further. Here, we’re breaking down the ins and outs of AI agents and how you can use them for your apps.
AI agent orchestration
AI agents get instructions from human users or other programs. Once the goal is set, they determine which steps to take, such as calling tools or databases. Frameworks like LangGraph or AutoGen abstract away the lower-level details of creating and calling agents, or you can write your own code to set up agents and wire them to the data connections and tools they need to accomplish tasks.
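The loop above can be sketched in a few lines. This is a minimal, illustrative version: the `plan_next_step` function and the two toy tools stand in for the LLM-driven planner and real tools an agent would actually use.

```python
# Minimal agent loop sketch: receive a goal, pick a tool per step,
# stop when no further step is needed. The rule-based "planner" is an
# illustrative stand-in for an LLM call.

def search_docs(query: str) -> str:
    """Toy tool: pretend to look something up."""
    return f"docs for '{query}'"

def calculator(expr: str) -> str:
    """Toy tool: evaluate simple arithmetic (never eval untrusted input)."""
    return str(eval(expr))

TOOLS = {"search_docs": search_docs, "calculator": calculator}

def plan_next_step(goal: str, history: list) -> tuple:
    """Stand-in planner; a real agent would ask an LLM to choose."""
    if not history:
        tool = "calculator" if any(c.isdigit() for c in goal) else "search_docs"
        return (tool, goal)
    return (None, None)  # one step is enough for this toy goal

def run_agent(goal: str) -> list:
    history = []
    while True:
        tool_name, tool_input = plan_next_step(goal, history)
        if tool_name is None:
            break
        result = TOOLS[tool_name](tool_input)
        history.append((tool_name, result))
    return history

print(run_agent("2 + 3"))  # [('calculator', '5')]
```

Frameworks like LangGraph formalize this same structure as a graph of nodes with state passed between them.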
AI models
Agents often call multiple models depending on the task. To optimize speed and cost, you can call smaller, faster models for simple tasks and use more advanced models when necessary.
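A model router can be as simple as a rule that sends easy prompts to a cheap model. The model names, prices, and the length-based heuristic below are illustrative assumptions; production routers often use a classifier or an LLM itself to score task difficulty.

```python
# Sketch of model routing: cheap/fast model for simple prompts,
# stronger model when the task demands it. Names and costs are made up.

MODELS = {
    "small": {"name": "small-fast-model", "cost_per_1k_tokens": 0.0002},
    "large": {"name": "large-capable-model", "cost_per_1k_tokens": 0.01},
}

def route(prompt: str, needs_reasoning: bool = False) -> str:
    """Pick a model tier for the prompt using a simple heuristic."""
    if needs_reasoning or len(prompt.split()) > 50:
        return MODELS["large"]["name"]
    return MODELS["small"]["name"]

print(route("How do I reset my password?"))               # small-fast-model
print(route("Summarize this contract", needs_reasoning=True))  # large-capable-model
```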
Semantic caching
To make responses faster and reduce inference costs, AI apps and agents can use semantic caching to store LLM results for easy reuse. This helps in use cases with redundant calls, like customer support agents where many users ask similar questions such as “How do I reset my password?”
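The core idea can be sketched as follows: store each answer with an embedding of its query, and reuse the answer when a new query's embedding is close enough. The bag-of-words "embedding" and the 0.8 threshold here are toy stand-ins; a real semantic cache uses an embedding model and a vector store.

```python
# Semantic cache sketch: reuse an LLM answer when a new query is
# semantically close to a cached one. Toy embedding, illustrative only.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: word counts (real systems use embedding models)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    def __init__(self, threshold: float = 0.8):
        self.entries = []          # list of (embedding, answer)
        self.threshold = threshold

    def get(self, query: str):
        q = embed(query)
        best = max(self.entries, key=lambda e: cosine(q, e[0]), default=None)
        if best and cosine(q, best[0]) >= self.threshold:
            return best[1]         # cache hit: skip the LLM call
        return None                # cache miss: call the LLM, then put()

    def put(self, query: str, answer: str):
        self.entries.append((embed(query), answer))

cache = SemanticCache()
cache.put("how do i reset my password", "Use the 'Forgot password' link.")
print(cache.get("how do i reset my password please"))  # cache hit
```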
Tool calls
Agents can interact with multiple tools and decide which tool is best for a particular task. They can search the internet, call other internal tools, or write queries to search databases for specific info.
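Most function-calling LLM APIs express tools as JSON-schema descriptions: the schema tells the model what each tool does and what arguments it takes, the model picks one, and your code executes it. The tool names and the example model response below are hypothetical.

```python
# Sketch of tool definitions in the JSON-schema style used by most
# function-calling APIs, plus a dispatcher for the model's choice.

tools = [
    {
        "name": "web_search",
        "description": "Search the internet for up-to-date information.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    {
        "name": "query_orders_db",
        "description": "Look up a customer's orders by customer id.",
        "parameters": {
            "type": "object",
            "properties": {"customer_id": {"type": "string"}},
            "required": ["customer_id"],
        },
    },
]

# An (illustrative) model response choosing a tool might look like this:
model_choice = {"tool": "query_orders_db", "arguments": {"customer_id": "c-42"}}

def dispatch(choice: dict, handlers: dict) -> str:
    """Run the tool the model selected with the arguments it supplied."""
    return handlers[choice["tool"]](**choice["arguments"])

handlers = {
    "query_orders_db": lambda customer_id: f"orders for {customer_id}",
    "web_search": lambda query: f"results for {query}",
}
print(dispatch(model_choice, handlers))  # orders for c-42
```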
Agent memory (short-term and long-term)
While completing tasks, agents store short-term information for the duration of the task (like user input and the results of tool calls), so it’s available for fast retrieval and can be leveraged by later steps. Long-term memory stores persistent information that can be retained and reused across multiple tasks, sessions, or interactions. Because this memory accumulates knowledge over time, it helps the agent maintain a coherent understanding of the user’s preferences, past queries, and evolving objectives across sessions.
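The distinction can be sketched with two stores. The class and method names are illustrative; in practice, short-term memory is often a keyed store with a TTL and long-term memory a persistent, searchable store.

```python
# Sketch of short- vs long-term agent memory: short-term context is
# cleared when the task ends, long-term facts persist across sessions.

class AgentMemory:
    def __init__(self):
        self.short_term = []   # working context for the current task
        self.long_term = {}    # persists across tasks and sessions

    def remember_step(self, step: str):
        self.short_term.append(step)

    def remember_fact(self, key: str, value: str):
        self.long_term[key] = value

    def end_task(self):
        self.short_term.clear()   # working context is discarded

mem = AgentMemory()
mem.remember_fact("preferred_language", "Python")
mem.remember_step("called web_search('redis vector search')")
mem.end_task()
print(mem.short_term)                       # []
print(mem.long_term["preferred_language"])  # Python
```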
Data sources
To interact with existing information, AI agents connect to one or more databases to get the info they need to make decisions and provide accurate responses. Like any other app, agents do this through APIs. They can be trained to interact intelligently with APIs to get the data required, which can include generating queries. Redis does this well with Redis Data Integration.
Embedding models
A common technique for identifying relevant info is Retrieval-Augmented Generation (RAG). For RAG, structured and unstructured data is converted into vector embeddings that capture its semantic meaning, so the most relevant pieces can be retrieved and returned to the agent.
Vector database
Vector embeddings of available knowledge bases or context are stored in databases that support vectors and vector search, a capability many databases have recently added because of its usefulness for GenAI.
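Embedding and vector search fit together as a retrieval pipeline: embed the documents, store the vectors, embed the query, and return the nearest document. In this sketch the character-trigram "embedding" and the brute-force scan are toy stand-ins for a real embedding model and an indexed vector database.

```python
# Minimal RAG retrieval sketch: embed docs, store vectors, return the
# most similar doc for a query. Toy embedding, illustrative only.
import math

def embed(text: str) -> dict:
    """Toy embedding: counts of 3-character n-grams."""
    text = text.lower()
    vec = {}
    for i in range(len(text) - 2):
        gram = text[i:i + 3]
        vec[gram] = vec.get(gram, 0) + 1
    return vec

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[g] * b.get(g, 0) for g in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    def __init__(self):
        self.docs = []   # list of (embedding, text)

    def add(self, text: str):
        self.docs.append((embed(text), text))

    def search(self, query: str) -> str:
        """Brute-force nearest neighbor; real stores use an index."""
        q = embed(query)
        return max(self.docs, key=lambda d: cosine(q, d[0]))[1]

store = VectorStore()
store.add("To reset your password, use the account settings page.")
store.add("Our refund policy allows returns within 30 days.")
print(store.search("how do I reset my password?"))
```

A production system would swap the toy pieces for an embedding model and a database with vector indexing, but the flow is the same.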
Hosting & AI chips
AI apps can be deployed into production using all the major cloud providers, on-prem solutions, and hybrid solutions. They use hardware like GPUs, TPUs, and CPUs to process tasks and meet their demand for computing capacity.
LLMOps, authorization, & dev tools
Beyond execution, you’ll need supporting frameworks and platforms to make sure your data flows properly and agents can be fixed if something goes wrong. There are also agent builder tools that let you design and build AI agents with little or no code. See demos and resources here.
Consider latency
AI agents serve various roles, from scrubbing and annotating data to answering real-time questions for users. For agents that interact with humans or streaming data, the components need to be fast; for background or asynchronous processes, slower responses may be fine. But as agentic systems get more complex, latency adds up, so many teams optimize for real-time performance from the start rather than trying to reduce latency later.
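One way latency adds up: sequential tool calls sum their delays, while independent calls can run concurrently. This sketch simulates the difference with `asyncio.sleep`; the 0.1-second delays are illustrative stand-ins for network and model latency.

```python
# Sequential vs concurrent tool calls: three independent 0.1s calls
# take ~0.3s in sequence but ~0.1s when run concurrently.
import asyncio
import time

async def call_tool(name: str, delay: float = 0.1) -> str:
    await asyncio.sleep(delay)   # simulated network/model latency
    return f"{name}: ok"

async def sequential():
    return [await call_tool("search"), await call_tool("db"), await call_tool("cache")]

async def concurrent():
    return await asyncio.gather(call_tool("search"), call_tool("db"), call_tool("cache"))

start = time.perf_counter()
asyncio.run(sequential())
seq = time.perf_counter() - start

start = time.perf_counter()
asyncio.run(concurrent())
conc = time.perf_counter() - start

print(f"sequential ~{seq:.2f}s, concurrent ~{conc:.2f}s")
```

Only independent steps can be parallelized this way; calls whose inputs depend on earlier outputs still have to run in sequence, which is why deep agent chains are hard to make real-time.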
Plan for future innovations
Innovations are constant in GenAI, and you want tools with ecosystem support. To adapt to the new tools, frameworks, and models constantly coming out, build your agent architecture so components can be swapped: upgrade to the latest models, plug in new data sources, or add tooling like logging and observability.
Build for scale
You may want to build a prototype to establish proof of concept. As you move that prototype to production, make sure your agentic system can handle messy production data, serve many users at the same time without slowdowns, and meet security requirements like authorization protocols and protection against DDoS attacks.
These are individual actors that complete a specific task using GenAI. They can decide which steps to take, call other tools, and evaluate intermediary steps before sending back their output.
These are built from multiple components or agents working together. They’re different from doing an entire end-to-end workflow by calling a single AI model once. By combining several agents or components, you get the benefits of GenAI with more control and better output.
Agents get instructions from a user or program to complete a task. They can then choose between multiple tools to decide the next step and evaluate the result before returning a high-quality response. They draw on multiple data sources to get the information required to complete the request.
Agents allow tasks to be broken down into their component pieces. Large, complex models have been impressive across a range of tests, but they still get important things wrong and sometimes hallucinate. Some tasks are easier to improve by designing a better system with multiple components, which is why companies are turning to AI agents.
See the architecture and components LlamaIndex used to build an agentic RAG system for customer support. Code snippets and notebook included.
Speak to a Redis expert and learn more about enterprise-grade Redis today.