At Redis, we’re fast. To show how fast we are, we benchmarked our new Redis Query Engine, now GA for Redis Software, against the top vector database providers in the market. This feature enhances our engine by enabling concurrent access to the index, which improves throughput for Redis queries, search, and vector database workloads. This blog post shows our benchmark results, explains the challenges of increasing query throughput, and describes how we overcame them with our new Redis Query Engine. This blog has three main parts:
Let’s start with what matters most, how fast Redis is.
In addition to ingestion and index creation time, we benchmarked two key metrics, throughput and latency, across seven vector database players (the metrics and principles are detailed below). Throughput indicates a system’s capability to process numerous queries or large datasets in a short amount of time, while latency measures how fast individual similarity searches return results.
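To make the two metrics concrete, here is a minimal Python sketch of how throughput (QPS), average latency, and P95 latency could be derived from the per-query latencies of a single run. The helper `summarize_run`, the `RunSummary` type, and the nearest-rank percentile method are illustrative assumptions, not the benchmark tool's actual implementation.

```python
import statistics
from dataclasses import dataclass


@dataclass
class RunSummary:
    qps: float             # queries completed per second of wall-clock time
    avg_latency_ms: float  # mean per-query latency
    p95_latency_ms: float  # 95th-percentile per-query latency


def summarize_run(latencies_ms, wall_clock_s):
    """Summarize one run from per-query latencies (hypothetical helper)."""
    if not latencies_ms or wall_clock_s <= 0:
        raise ValueError("need at least one sample and a positive duration")
    ordered = sorted(latencies_ms)
    # Nearest-rank P95: the smallest sample with at least 95% of all
    # samples at or below it. Integer math avoids float rounding issues.
    rank = (95 * len(ordered) + 99) // 100
    return RunSummary(
        qps=len(ordered) / wall_clock_s,
        avg_latency_ms=statistics.fmean(ordered),
        p95_latency_ms=ordered[rank - 1],
    )
```

Note that throughput is computed from wall-clock time rather than from the sum of latencies, because with several concurrent clients many queries are in flight at once.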
To ensure we cover both, we ran two benchmarks: a multi-client benchmark focusing on throughput, and a benchmark focusing on latency, run both single-client and under load (multi-client). All the results can be filtered in the graphs below, and the details of how we conducted the tests are covered later in the blog. For Redis, prioritizing throughput and latency aligns with our core philosophy of delivering exceptional speed and reliability.
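The difference between the single-client and multi-client setups can be sketched as follows. This is a hypothetical harness, not the tooling used for these benchmarks: `run_clients` and `query_fn` are made-up names, and a pool of threads stands in for independent clients.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_clients(query_fn, queries, num_clients):
    """Send `queries` through `query_fn` from `num_clients` worker threads.

    Returns (per_query_latencies_ms, wall_clock_s).
    """
    def timed(query):
        start = time.perf_counter()
        query_fn(query)  # assumed to block until the result is returned
        return (time.perf_counter() - start) * 1000.0

    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_clients) as pool:
        latencies_ms = list(pool.map(timed, queries))
    wall_clock_s = time.perf_counter() - started
    return latencies_ms, wall_clock_s
```

With `num_clients=1` this measures unloaded, single-client latency. Raising `num_clients` measures latency under load, and the number of queries divided by the wall-clock time gives throughput.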
Our tests show that Redis is faster for vector database workloads than any other vector database we tested, at recall >= 0.98. Redis has 62% more throughput than the second-ranked database for lower-dimensional datasets (deep-image-96-angular) and 21% more throughput for high-dimensional datasets (dbpedia-openai-1M-angular).
Caveat: MongoDB tests only provided results at a lower recall level for the gist-960-euclidean dataset. The results for this dataset use the median of the results for recall between 0.82 and 0.98. For all other datasets, we consider recall >= 0.98.
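As a rough illustration of what comparing "at recall >= 0.98" involves, the sketch below computes recall for one query and then selects results by recall threshold or recall band. The function names, the rule of picking the highest QPS among eligible runs, and the inclusive band bounds are assumptions made for illustration; the post does not spell out these details here.

```python
import statistics


def recall_at_k(retrieved_ids, true_ids):
    """Fraction of the true nearest neighbors that the search returned."""
    truth = set(true_ids)
    if not truth:
        raise ValueError("ground truth must not be empty")
    return len(truth & set(retrieved_ids)) / len(truth)


def best_qps_at_recall(runs, min_recall=0.98):
    """Highest QPS among (recall, qps) runs meeting the recall floor, or None."""
    eligible = [qps for recall, qps in runs if recall >= min_recall]
    return max(eligible) if eligible else None


def median_qps_in_band(runs, low, high):
    """Median QPS among (recall, qps) runs with low <= recall <= high, or None."""
    in_band = [qps for recall, qps in runs if low <= recall <= high]
    return statistics.median(in_band) if in_band else None
```

Fixing the recall level before comparing speed matters because most approximate indexes can trade accuracy for throughput, so a faster result at lower recall is not a like-for-like win.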
Redis outperformed other pure vector database providers in querying throughput and latency.
Querying: Redis achieved up to 3.4 times higher queries per second (QPS) than Qdrant, 3.3 times higher QPS than Milvus, and 1.7 times higher QPS than Weaviate for the same recall levels. On latency, considered here as the average response time under load (multi-client), Redis achieved up to 4 times lower latency than Qdrant, 4.67 times lower than Milvus, and 1.71 times lower than Weaviate for the same recall levels.
Ingestion and indexing: Qdrant is the fastest due to its multi-segment index design, but Redis excels in fast querying. Redis showed up to 2.8 times lower indexing time than Milvus and up to 3.2 times lower indexing time than Weaviate.
This section provides a comprehensive comparison between Redis 7.4 and other industry providers that offer vector capabilities exclusively, such as Milvus 2.4.1, Qdrant 1.7.4, and Weaviate 1.25.1.
In the graph below you can analyze all the results across RPS (requests per second), latency (both single-client and multi-client, under load), P95 latency, and index time. All are measured across the different selected datasets.
In some single-client benchmarks, Redis and the competitors performed at the same level. Weaviate and Milvus showed operational problems on the cloud setups; the findings are fully described in the appendix.
In the benchmarks for querying performance in general-purpose databases with vector similarity support, Redis significantly outperformed competitors.
Querying: Redis achieved up to 9.5 times higher queries per second (QPS) and up to 9.7 times lower latencies than Amazon Aurora PostgreSQL v16.1 with pgvector 0.5.1 for the same recall. Against MongoDB Atlas v7.0.8 with Atlas Search, Redis demonstrated up to 11 times higher QPS and up to 14.2 times lower latencies. Against Amazon OpenSearch, Redis demonstrated up to 53 times higher QPS and up to 53 times lower latencies.
Ingestion and indexing: Redis showed a substantial advantage over Amazon Aurora PostgreSQL v16.1 with pgvector 0.5.1, with indexing times ranging from 5.5 to 19 times lower.
This section is a comprehensive comparison between Redis 7.4 and Amazon Aurora PostgreSQL v16.1 with pgvector 0.5.1, as well as MongoDB Atlas v7.0.8 with Atlas Search, and Amazon OpenSearch 2.11, offering valuable insights into the performance of vector similarity searches in general-purpose DB cloud environments.
In the graph below you can analyze all the results across RPS (requests per second), latency (both single-client and multi-client, under load), P95 latency, and index time. All are measured across the different selected datasets.
Apart from the performance advantages showcased above, some general-purpose databases presented vector search limitations related to lack of precision and limited options for index configuration, fully described in the appendix.
Compared to other Redis imitators, such as Amazon MemoryDB and Google Cloud MemoryStore for Redis, Redis demonstrates a significant performance advantage. This indicates that Redis and its enterprise implementations are optimized for performance, outpacing other providers that copied Redis.
Querying: Against Amazon MemoryDB, Redis achieved up to 3.9 times higher queries per second (QPS) and up to 4.1 times lower latencies for the same recall. Compared to GCP MemoryStore for Redis v7.2, Redis demonstrated up to 2.5 times higher QPS and up to 4.8 times lower latencies.
Ingestion and indexing: Redis had an advantage over Amazon MemoryDB with indexing times ranging from 1.39 to 3.89 times lower. Against GCP MemoryStore for Redis v7.2, Redis showed an even greater indexing advantage, with times ranging from 4.9 to 10.34 times lower.
In the graph below you can analyze all the results across RPS (requests per second), latency (both single-client and multi-client, under load), P95 latency, and index time. All are measured across the different selected datasets.