Scalability

Scale Redis vector sets to handle larger data sets and workloads

Multi-instance scalability

Vector sets can scale horizontally by sharding your data across multiple Redis instances. This is done by partitioning the dataset manually across keys and nodes.

Example strategy

You can shard data by hashing each element name into a fixed number of keys (three in this example). Note that this simple modulo scheme is not a consistent hash: changing the shard count remaps most elements, so choose the number of shards up front.

from zlib import crc32

key_index = crc32(item.encode()) % 3
key = f"vset:{key_index}"

Then add elements into different keys:

VADD vset:0 VALUES 3 0.1 0.2 0.3 item1
VADD vset:1 VALUES 3 0.4 0.5 0.6 item2
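Putting the two steps together, a small helper can route an element to its shard and build the matching VADD command. This is an illustrative sketch (the vadd_args name and argument layout are our own, not part of any library); the returned list is shaped for a generic command call such as redis-py's execute_command:

```python
from zlib import crc32

NUM_SHARDS = 3  # must match the modulus used when the data was sharded

def vadd_args(item: str, vector: list[float], num_shards: int = NUM_SHARDS) -> list[str]:
    """Route `item` to its shard and build the VADD command arguments,
    e.g. ["VADD", "vset:<i>", "VALUES", "3", "0.1", "0.2", "0.3", "item1"]."""
    key = f"vset:{crc32(item.encode()) % num_shards}"
    return ["VADD", key, "VALUES", str(len(vector)),
            *[str(v) for v in vector], item]
```

With one redis-py client per shard, you would then call something like client.execute_command(*vadd_args("item1", [0.1, 0.2, 0.3])) against the instance that owns that key.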

To run a similarity search across all shards, send VSIM commands to each key and then merge the results client-side:

VSIM vset:0 VALUES ... WITHSCORES
VSIM vset:1 VALUES ... WITHSCORES
VSIM vset:2 VALUES ... WITHSCORES

Then combine and sort the results by score.
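The combine-and-sort step can be sketched as follows (merge_results is an illustrative name; it assumes each shard's reply has already been parsed into a list of (element, score) pairs, and that VSIM's WITHSCORES similarity scores are higher for closer matches):

```python
def merge_results(shard_results, k=10):
    """Combine per-shard (element, score) lists into one top-k list.
    Higher similarity scores mean closer matches, so sort descending."""
    merged = [pair for shard in shard_results for pair in shard]
    merged.sort(key=lambda pair: pair[1], reverse=True)
    return merged[:k]
```

For example, merging [("item1", 0.98)] from one shard with [("item2", 0.75)] from another and keeping k=1 returns only the item1 pair.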

Key properties

  • Write operations (VADD, VREM) scale linearly—you can insert in parallel across instances.
  • Read operations (VSIM) do not scale linearly—you must query all shards for a full result set.
  • Smaller vector sets yield faster queries, so distributing them helps reduce query time per node.
  • Merging results client-side keeps logic simple and doesn't add server-side overhead.

Availability benefits

This sharding model also improves fault tolerance:

  • If one instance is down, you can still retrieve partial results from others.
  • Use timeouts and partial fallbacks to increase resilience.

Latency considerations

To avoid additive latency across N instances:

  • Send queries to all shards in parallel.
  • Wait for the slowest response.

This makes total latency close to the worst-case shard time, not the sum of all times.
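A minimal sketch of this fan-out pattern, using a thread pool plus a timeout so that slow or failed shards produce partial results instead of errors (query_all_shards and the zero-argument query callables are illustrative names; in practice each callable would wrap a VSIM call against one shard):

```python
import concurrent.futures

def query_all_shards(query_fns, timeout=0.25):
    """Fan a similarity query out to every shard in parallel and collect
    whatever answers arrive before the timeout. Shards that raise or time
    out are skipped, so the caller gets partial results, not an error."""
    results = []
    with concurrent.futures.ThreadPoolExecutor(
            max_workers=max(len(query_fns), 1)) as pool:
        futures = [pool.submit(fn) for fn in query_fns]
        done, _ = concurrent.futures.wait(futures, timeout=timeout)
        for fut in done:
            if fut.exception() is None:
                results.extend(fut.result())
    return results
```

The caller would then pass the collected pairs through the client-side merge step described earlier.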

Summary

Goal                  Approach
Scale inserts         Split data across keys and instances
Scale reads           Query all shards and merge results
High availability     Accept partial results when some shards fail
Maintain performance  Use smaller shards for faster per-node traversal
