Getting Started

RedisVL is a Python library with an integrated CLI for building AI applications with Redis. This guide covers the core workflow:

  1. Defining an IndexSchema
  2. Preparing a sample dataset
  3. Creating a SearchIndex
  4. Using the rvl CLI
  5. Loading data into Redis
  6. Fetching and managing records
  7. Executing vector searches
  8. Updating an index

Prerequisites

Before you begin, ensure you have:

What You'll Learn

By the end of this guide, you will be able to:

  • Create index schemas using Python dictionaries or YAML files
  • Build and manage SearchIndex objects
  • Use the rvl CLI for index management
  • Load data and execute vector similarity searches
  • Fetch individual records and list all keys in an index
  • Delete specific records by key or document ID
  • Update index schemas as your application evolves

Define an IndexSchema

The IndexSchema holds the index configuration and field definitions that enable search with Redis. For ease of use, the schema can be constructed from a Python dictionary or a YAML file.

Example Schema Creation

Consider a dataset with user information, including job, age, credit_score, and a 3-dimensional user_embedding vector.

You must also decide on a Redis index name and key prefix to use for this dataset. Below are example schema definitions in both YAML and Python dictionary form.

YAML Definition:

version: '0.1.0'

index:
  name: user_simple
  prefix: user_simple_docs

fields:
    - name: user
      type: tag
    - name: credit_score
      type: tag
    - name: job
      type: text
    - name: age
      type: numeric
    - name: user_embedding
      type: vector
      attrs:
        algorithm: flat
        dims: 3
        distance_metric: cosine
        datatype: float32

Store this in a local file, such as schema.yaml, for RedisVL usage.

Python Dictionary:

schema = {
    "index": {
        "name": "user_simple",
        "prefix": "user_simple_docs",
    },
    "fields": [
        {"name": "user", "type": "tag"},
        {"name": "credit_score", "type": "tag"},
        {"name": "job", "type": "text"},
        {"name": "age", "type": "numeric"},
        {
            "name": "user_embedding",
            "type": "vector",
            "attrs": {
                "dims": 3,
                "distance_metric": "cosine",
                "algorithm": "flat",
                "datatype": "float32"
            }
        }
    ]
}

Sample Dataset Preparation

Below, create a mock dataset with user, job, age, credit_score, and user_embedding fields. The user_embedding vectors are synthetic examples for demonstration purposes.

For more information on creating real-world embeddings, refer to this article.

import numpy as np


data = [
    {
        'user': 'john',
        'age': 1,
        'job': 'engineer',
        'credit_score': 'high',
        'user_embedding': np.array([0.1, 0.1, 0.5], dtype=np.float32).tobytes()
    },
    {
        'user': 'mary',
        'age': 2,
        'job': 'doctor',
        'credit_score': 'low',
        'user_embedding': np.array([0.1, 0.1, 0.5], dtype=np.float32).tobytes()
    },
    {
        'user': 'joe',
        'age': 3,
        'job': 'dentist',
        'credit_score': 'medium',
        'user_embedding': np.array([0.9, 0.9, 0.1], dtype=np.float32).tobytes()
    }
]

The user_embedding vectors are converted to bytes using NumPy's .tobytes() method.
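To make the byte encoding concrete, here is a minimal sketch that uses only the standard library. It assumes a little-endian platform, where NumPy's native float32 layout matches the "<f" struct format; the helper names encode_vector and decode_vector are hypothetical and not part of RedisVL.

```python
import struct

DIMS = 3                # matches "dims" in the schema above
BYTES_PER_FLOAT32 = 4   # a float32 value occupies 4 bytes


def encode_vector(values):
    """Pack a sequence of floats as little-endian float32 bytes."""
    return struct.pack(f"<{len(values)}f", *values)


def decode_vector(raw):
    """Unpack float32 bytes back into a list of Python floats."""
    count = len(raw) // BYTES_PER_FLOAT32
    return list(struct.unpack(f"<{count}f", raw))


raw = encode_vector([0.1, 0.1, 0.5])
decoded = decode_vector(raw)

# A 3-dimensional float32 vector should occupy dims * 4 bytes.
print(len(raw), decoded)
```

A byte string whose length is not dims × 4 is a quick sign that the embedding was built with a different dtype (for example float64) than the schema declares.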

Create a SearchIndex

With the schema and sample dataset ready, create a SearchIndex.

Bring your own Redis connection instance

This is ideal in scenarios where you have custom settings on the connection instance or if your application will share a connection pool:

from redisvl.index import SearchIndex
from redis import Redis

client = Redis.from_url("redis://localhost:6379")
index = SearchIndex.from_dict(schema, redis_client=client, validate_on_load=True)

Let the index manage the connection instance

This is ideal for simple cases:

index = SearchIndex.from_dict(schema, redis_url="redis://localhost:6379", validate_on_load=True)

# If you don't specify a client or Redis URL, the index will attempt to
# connect to Redis at the default address "redis://localhost:6379".

Create the index

Now that we are connected to Redis, we need to run the create command.

index.create(overwrite=True)

Note that at this point, the index has no entries. Data loading follows.

Inspect with the rvl CLI

Use the rvl CLI to inspect the created index and its fields:

!rvl index info -i user_simple
Index Information:
╭──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────╮
│ Index Name           │ Storage Type         │ Prefixes             │ Index Options        │ Indexing             │
├──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┤
│ user_simple          │ HASH                 │ ['user_simple_docs'] │ []                   │ 0                    │
╰──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────╯
Index Fields:
╭─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────╮
│ Name            │ Attribute       │ Type            │ Field Option    │ Option Value    │ Field Option    │ Option Value    │ Field Option    │ Option Value    │ Field Option    │ Option Value    │
├─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ user            │ user            │ TAG             │ SEPARATOR       │ ,               │                 │                 │                 │                 │                 │                 │
│ credit_score    │ credit_score    │ TAG             │ SEPARATOR       │ ,               │                 │                 │                 │                 │                 │                 │
│ job             │ job             │ TEXT            │ WEIGHT          │ 1               │                 │                 │                 │                 │                 │                 │
│ age             │ age             │ NUMERIC         │                 │                 │                 │                 │                 │                 │                 │                 │
│ user_embedding  │ user_embedding  │ VECTOR          │ algorithm       │ FLAT            │ data_type       │ FLOAT32         │ dim             │ 3               │ distance_metric │ COSINE          │
╰─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────╯

Load Data to SearchIndex

Load the sample dataset to Redis.

Validate data entries on load

RedisVL uses Pydantic validation under the hood to ensure loaded data is valid and conforms to your schema. Validation is optional and is controlled by the validate_on_load option passed when constructing the SearchIndex.

keys = index.load(data)

print(keys)
['user_simple_docs:01KHKHQYX95EDQN18FG8FRMRQ5', 'user_simple_docs:01KHKHQYXC97WY4ACG1V01GEPC', 'user_simple_docs:01KHKHQYXC97WY4ACG1V01GEPD']

By default, load will create a unique Redis key as a combination of the index key prefix and a random ULID. You can also customize the key by providing direct keys or pointing to a specified id_field on load.
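The key layout can be sketched in plain Python. This is an illustration only: the ":" separator is inferred from the keys printed above, and make_key and doc_id_from_key are hypothetical helpers, not RedisVL functions.

```python
def make_key(prefix, doc_id, separator=":"):
    """Join an index key prefix and a document ID into a full Redis key."""
    return f"{prefix}{separator}{doc_id}"


def doc_id_from_key(key, prefix, separator=":"):
    """Strip the prefix and separator from a full Redis key."""
    head = f"{prefix}{separator}"
    if not key.startswith(head):
        raise ValueError(f"{key!r} does not start with {head!r}")
    return key[len(head):]


# With an auto-generated ULID, the ID part is random; with an id_field,
# it is the value of that field in the record (e.g. the user name).
full_key = make_key("user_simple_docs", "john")
print(full_key)
print(doc_id_from_key(full_key, "user_simple_docs"))
```

Keeping the prefix and the ID conceptually separate helps later, because some index methods take the full key and others take only the ID.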

Load INVALID data

Loading the record below raises a SchemaValidationError if validate_on_load is set to True on the SearchIndex.

try:
    keys = index.load([{"user_embedding": True}])
except Exception as e:
    print(str(e))
Schema validation failed for object at index 0. Field 'user_embedding' expects bytes (vector data), but got boolean value 'True'. If this should be a vector field, provide a list of numbers or bytes. If this should be a different field type, check your schema definition.
Object data: {
  "user_embedding": true
}
Hint: Check that your data types match the schema field definitions. Use index.schema.fields to view expected field types.

Upsert the index with new data

Upsert data by using the load method again:

# Add more data
new_data = [{
    'user': 'tyler',
    'age': 9,
    'job': 'engineer',
    'credit_score': 'high',
    'user_embedding': np.array([0.1, 0.3, 0.5], dtype=np.float32).tobytes()
}]
keys = index.load(new_data)

print(keys)
['user_simple_docs:01KHKHR37CD6143DNQ41G3ADNA']

Fetch and Manage Records

RedisVL provides several methods to retrieve and manage individual records in your index.

Fetch a record by ID

Use fetch() to retrieve a single record when you know its ID. The ID is the unique identifier you provided during load (via id_field) or the auto-generated ULID.

# Fetch a record by its ID (e.g., the user field value if used as id_field)
# First, let's reload data with a specific id_field
index.load(data, id_field="user")

# Now fetch by the user ID
record = index.fetch("john")
print(record)

You can also construct the full Redis key from an ID using the key() method:

# Get the full Redis key for a given ID
full_key = index.key("john")
print(f"Full Redis key: {full_key}")

List all keys in the index

To enumerate all keys in your index, use paginate() with a FilterQuery. This is useful for batch processing or auditing your data.

from redisvl.query import FilterQuery
from redisvl.query.filter import FilterExpression

# Create a query that matches all documents
query = FilterQuery(
    filter_expression=FilterExpression("*"),
    return_fields=["user", "age", "job"]
)

# Paginate through all results
for batch in index.paginate(query, page_size=10):
    for doc in batch:
        print(f"Key: {doc['id']}, User: {doc['user']}")

Delete specific records

Use drop_keys() to remove specific records by their full Redis key, or drop_documents() to remove by document ID.

# Delete by full Redis key
full_key = index.key("john")
deleted_count = index.drop_keys(full_key)
print(f"Deleted {deleted_count} record(s) by key")

# Delete multiple keys at once
# index.drop_keys(["key1", "key2", "key3"])

# Delete by document ID (without the prefix)
deleted_count = index.drop_documents("mary")
print(f"Deleted {deleted_count} record(s) by document ID")

# Delete multiple documents at once
# index.drop_documents(["id1", "id2", "id3"])

Note: drop_keys() expects the full Redis key (including prefix), while drop_documents() expects just the document ID.

Creating VectorQuery Objects

Next, create a vector query object for the newly populated index. This example uses a simple vector to demonstrate how vector similarity works. Vectors in production will likely be much larger than 3 floats and are usually produced by machine learning models (e.g., Hugging Face sentence transformers) or an embeddings API (Cohere, OpenAI). RedisVL provides a set of Vectorizers to assist in vector creation.

from redisvl.query import VectorQuery
# result_print is a display helper local to the docs notebooks, not part of redisvl
from jupyterutils import result_print

query = VectorQuery(
    vector=[0.1, 0.1, 0.5],
    vector_field_name="user_embedding",
    return_fields=["user", "age", "job", "credit_score", "vector_distance"],
    num_results=3
)
vector_distance   user    age   job        credit_score
0                 john    1     engineer   high
0                 mary    2     doctor     low
0.0566298961639   tyler   9     engineer   high
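The vector_distance values in these results can be reproduced by hand. The sketch below assumes that the cosine distance metric is computed as 1 minus the cosine similarity, and it uses the embeddings from the sample dataset plus the upserted tyler record; cosine_distance is a hypothetical helper written for illustration.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


query_vector = [0.1, 0.1, 0.5]
stored = {
    "john": [0.1, 0.1, 0.5],
    "mary": [0.1, 0.1, 0.5],
    "joe": [0.9, 0.9, 0.1],
    "tyler": [0.1, 0.3, 0.5],
}

distances = {user: cosine_distance(query_vector, vec) for user, vec in stored.items()}

# Smaller distance means more similar; the query keeps the 3 closest records.
ranked = sorted(distances, key=distances.get)
for user in ranked:
    print(user, round(distances[user], 6))
```

john and mary share the query vector exactly, so their distance is effectively 0. tyler comes out at roughly 0.0566, in line with the results above, and joe is the farthest, so it is the record excluded by num_results=3.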

Note: For HNSW and SVS-VAMANA indexes, you can tune search performance using runtime parameters:

# Example with HNSW runtime parameters
query = VectorQuery(
    vector=[0.1, 0.1, 0.5],
    vector_field_name="user_embedding",
    return_fields=["user", "age", "job"],
    num_results=3,
    ef_runtime=50  # Higher for better recall (HNSW only)
)

See the SVS-VAMANA guide and Advanced Queries guide for more details on runtime parameters.

Executing queries

With our VectorQuery object defined above, we can execute the query over the SearchIndex using the query method.

results = index.query(query)
result_print(results)

Using an Asynchronous Redis Client

The AsyncSearchIndex class, together with an async Redis Python client, lets you run queries, create indexes, and load data asynchronously. This is the recommended approach for working with RedisVL in production-like settings.

from redisvl.index import AsyncSearchIndex
from redis.asyncio import Redis

client = Redis.from_url("redis://localhost:6379")
index = AsyncSearchIndex.from_dict(schema, redis_client=client)
# execute the vector query async
results = await index.query(query)
result_print(results)

Updating a schema

In some scenarios, it makes sense to update the index schema. With Redis and RedisVL, this is straightforward because Redis can keep the underlying data in place while you change the index configuration.

Suppose we want to reindex this data in two ways:

  • by using a Tag type for job field instead of Text
  • by using an hnsw vector index for the user_embedding field instead of a flat vector index

# Modify this schema to have what we want
index.schema.remove_field("job")
index.schema.remove_field("user_embedding")
index.schema.add_fields([
    {"name": "job", "type": "tag"},
    {
        "name": "user_embedding",
        "type": "vector",
        "attrs": {
            "dims": 3,
            "distance_metric": "cosine",
            "algorithm": "hnsw",
            "datatype": "float32"
        }
    }
])
# Run the index update but keep underlying data in place
await index.create(overwrite=True, drop=False)
# Execute the vector query async
results = await index.query(query)
result_print(results)
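The same two changes can also be expressed on the schema dictionary itself, which may be convenient when the schema lives in configuration rather than on a live index object. The sketch below is a plain-Python illustration; with_updated_fields is a hypothetical helper, and the resulting dictionary is assumed to be passed to from_dict in the same way as the original schema.

```python
import copy

schema = {
    "index": {"name": "user_simple", "prefix": "user_simple_docs"},
    "fields": [
        {"name": "user", "type": "tag"},
        {"name": "credit_score", "type": "tag"},
        {"name": "job", "type": "text"},
        {"name": "age", "type": "numeric"},
        {
            "name": "user_embedding",
            "type": "vector",
            "attrs": {
                "dims": 3,
                "distance_metric": "cosine",
                "algorithm": "flat",
                "datatype": "float32",
            },
        },
    ],
}


def with_updated_fields(schema, replacements):
    """Return a copy of the schema with the named fields swapped out."""
    updated = copy.deepcopy(schema)
    updated["fields"] = [
        replacements.get(field["name"], field) for field in updated["fields"]
    ]
    return updated


# Build the replacement vector field from the original, changing only the algorithm.
hnsw_embedding = copy.deepcopy(schema["fields"][4])
hnsw_embedding["attrs"]["algorithm"] = "hnsw"

new_schema = with_updated_fields(
    schema,
    {"job": {"name": "job", "type": "tag"}, "user_embedding": hnsw_embedding},
)
print([(f["name"], f["type"]) for f in new_schema["fields"]])
```

Because the helper works on a deep copy, the original schema dictionary stays untouched and can be kept for comparison or rollback.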

Check Index Stats

Use the rvl CLI to check the stats for the index:

!rvl stats -i user_simple

Next Steps

Now that you understand the basics of RedisVL, explore these related guides:

Cleanup

Use .clear() to flush all data from Redis associated with the index while leaving the index in place for future insertions.

Use .delete() to remove both the index and the underlying data.

# Clear all data from Redis associated with the index
await index.clear()
# But the index is still in place
await index.exists()
# Remove / delete the index in its entirety
await index.delete()