SVS-VAMANA Vector Search
In this notebook, we will explore SVS-VAMANA (Scalable Vector Search with the Vamana graph algorithm), a graph-based vector search algorithm designed to work with vector compression to reduce memory usage. It combines the Vamana graph algorithm with the LVQ and LeanVec compression techniques and is optimized for Intel hardware.
How it works
Vamana builds a single-layer proximity graph and prunes edges during construction based on tunable parameters, similar to HNSW but with a simpler structure. The compression methods apply per-vector normalization and scalar quantization, learning parameters directly from the data to enable fast, on-the-fly distance computations with a SIMD-optimized layout. See "Vector quantization and compression" for details.
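To make "per-vector normalization and scalar quantization" concrete, here is a minimal sketch that scales one vector by its own minimum and maximum and stores small integer codes. It is a simplified illustration, not the LVQ implementation; the function names `quantize` and `dequantize` are hypothetical.

```python
def quantize(vector, bits=8):
    """Quantize one vector to `bits`-bit integer codes using its own min/max."""
    lo, hi = min(vector), max(vector)
    levels = (1 << bits) - 1
    # Guard against a constant vector, where hi == lo.
    step = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((x - lo) / step) for x in vector]
    return codes, lo, step


def dequantize(codes, lo, step):
    """Reconstruct approximate float values from the integer codes."""
    return [lo + c * step for c in codes]


vector = [0.12, -0.40, 0.33, 0.05, -0.07, 0.29]
codes, lo, step = quantize(vector, bits=4)
approx = dequantize(codes, lo, step)
max_error = max(abs(a - b) for a, b in zip(vector, approx))
print(codes)
print(f"max reconstruction error: {max_error:.4f} (step = {step:.4f})")
```

Each value lands within half a quantization step of the original, so fewer bits mean a coarser step and a larger error. That is the memory versus accuracy trade-off the compression settings control.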
SVS-VAMANA offers:
- Fast approximate nearest neighbor search using graph-based algorithms
- Vector compression (LVQ, LeanVec) with up to 87.5% memory savings
- Dimensionality reduction (optional, with LeanVec)
- Automatic performance optimization through CompressionAdvisor
Use SVS-VAMANA when:
- You have large datasets and memory is expensive
- You deploy to the cloud with memory-based pricing
- 90-95% recall is acceptable
- You have high-dimensional vectors (≥1024 dims) that can use LeanVec compression
Table of Contents
- Prerequisites
- Quick Start with CompressionAdvisor
- Creating an SVS-VAMANA Index
- Loading Sample Data
- Performing Vector Searches
- Understanding Compression Types
- Hybrid Queries with SVS-VAMANA
- Performance Monitoring
- Manual Configuration (Advanced)
- Best Practices and Tips
- Cleanup
Prerequisites
Before running this notebook, ensure you have:
- Installed redisvl and have that environment active for this notebook
- A running Redis Stack instance with:
  - Redis >= 8.2.0
  - RediSearch >= 2.8.10
For example, you can run Redis Stack locally with Docker:
docker run -d -p 6379:6379 -p 8001:8001 redis/redis-stack:latest
Note: SVS-VAMANA only supports FLOAT16 and FLOAT32 datatypes.
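As a rough illustration of what the datatype choice means for the raw bytes of a vector, the sketch below packs a list of floats with the standard library. The notebook itself uses `array_to_buffer`; the helper `to_bytes` here is hypothetical, and little-endian byte order is an assumption.

```python
import struct


def to_bytes(vector, datatype="float32"):
    """Pack a list of floats into little-endian bytes of the given datatype."""
    formats = {"float16": "e", "float32": "f"}
    if datatype not in formats:
        raise ValueError(f"unsupported datatype for SVS-VAMANA: {datatype}")
    return struct.pack(f"<{len(vector)}{formats[datatype]}", *vector)


vec = [0.25, -1.5, 3.0, 0.0]
print(len(to_bytes(vec, "float32")))  # 4 bytes per value
print(len(to_bytes(vec, "float16")))  # 2 bytes per value
```

The byte length of the stored vector has to agree with the datatype and dimensions declared in the index schema.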
# Import necessary modules
import numpy as np
from redisvl.index import SearchIndex
from redisvl.query import VectorQuery
from redisvl.utils import CompressionAdvisor
from redisvl.redis.utils import array_to_buffer
# Set random seed for reproducible results
np.random.seed(42)
# Redis connection
REDIS_URL = "redis://localhost:6379"
Quick Start with CompressionAdvisor
The easiest way to get started with SVS-VAMANA is using the CompressionAdvisor utility, which automatically recommends optimal configuration based on your vector dimensions and performance priorities.
# Get recommended configuration for common embedding dimensions
dims = 1024 # Common embedding dimensions (works reliably with SVS-VAMANA)
config = CompressionAdvisor.recommend(
    dims=dims,
    priority="balanced"  # Options: "memory", "speed", "balanced"
)
print("Recommended Configuration:")
for key, value in config.items():
    print(f" {key}: {value}")
# Estimate memory savings
savings = CompressionAdvisor.estimate_memory_savings(
    config["compression"],
    dims,
    config.get("reduce")
)
print(f"\nEstimated Memory Savings: {savings}%")
Recommended Configuration:
algorithm: svs-vamana
datatype: float16
graph_max_degree: 64
construction_window_size: 300
compression: LeanVec4x8
reduce: 512
search_window_size: 30
Estimated Memory Savings: 81.2%
Creating an SVS-VAMANA Index
Let's create an index using the recommended configuration. We'll use a simple schema with text content and vector embeddings.
# Create index schema with recommended SVS-VAMANA configuration
schema = {
    "index": {
        "name": "svs_demo",
        "prefix": "doc",
    },
    "fields": [
        {"name": "content", "type": "text"},
        {"name": "category", "type": "tag"},
        {
            "name": "embedding",
            "type": "vector",
            "attrs": {
                "dims": dims,
                **config,  # Use the recommended configuration
                "distance_metric": "cosine"
            }
        }
    ]
}
# Create the index
index = SearchIndex.from_dict(schema, redis_url=REDIS_URL)
index.create(overwrite=True)
print(f"✅ Created SVS-VAMANA index: {index.name}")
print(f" Algorithm: {config['algorithm']}")
print(f" Compression: {config['compression']}")
print(f" Dimensions: {dims}")
if 'reduce' in config:
    print(f" Reduced to: {config['reduce']} dimensions")
✅ Created SVS-VAMANA index: svs_demo
Algorithm: svs-vamana
Compression: LeanVec4x8
Dimensions: 1024
Reduced to: 512 dimensions
Loading Sample Data
Let's create some sample documents with embeddings to demonstrate SVS-VAMANA search capabilities.
# Generate sample data
sample_documents = [
    {"content": "Machine learning algorithms for data analysis", "category": "technology"},
    {"content": "Natural language processing and text understanding", "category": "technology"},
    {"content": "Computer vision and image recognition systems", "category": "technology"},
    {"content": "Delicious pasta recipes from Italy", "category": "food"},
    {"content": "Traditional French cooking techniques", "category": "food"},
    {"content": "Healthy meal planning and nutrition", "category": "food"},
    {"content": "Travel guide to European destinations", "category": "travel"},
    {"content": "Adventure hiking in mountain regions", "category": "travel"},
    {"content": "Cultural experiences in Asian cities", "category": "travel"},
    {"content": "Financial planning for retirement", "category": "finance"},
]
# Generate random embeddings for demonstration
# In practice, you would use a real embedding model
data_to_load = []
# Use reduced dimensions if LeanVec compression is applied
vector_dims = config.get("reduce", dims)
print(f"Creating vectors with {vector_dims} dimensions (reduced from {dims} if applicable)")
import zlib  # stable checksum; the built-in hash() of a str varies between Python runs
for i, doc in enumerate(sample_documents):
    # Create a random vector with some category-based clustering
    base_vector = np.random.random(vector_dims).astype(np.float32)
    # Add some category-based similarity (optional, for demo purposes)
    category_offset = zlib.crc32(doc["category"].encode()) % 100 / 1000.0
    base_vector[0] += category_offset
    # Convert to the datatype specified in config
    if config["datatype"] == "float16":
        base_vector = base_vector.astype(np.float16)
    data_to_load.append({
        "content": doc["content"],
        "category": doc["category"],
        "embedding": array_to_buffer(base_vector, dtype=config["datatype"])
    })
# Load data into the index
index.load(data_to_load)
print(f"✅ Loaded {len(data_to_load)} documents into the index")
# Wait a moment for indexing to complete
import time
time.sleep(2)
# Verify the data was loaded
info = index.info()
print(f" Index now contains {info.get('num_docs', 0)} documents")
Creating vectors with 512 dimensions (reduced from 1024 if applicable)
✅ Loaded 10 documents into the index
Index now contains 0 documents
Performing Vector Searches
Now let's perform some vector similarity searches using our SVS-VAMANA index.
# Create a query vector (in practice, this would be an embedding of your query text)
# Important: Query vector must match the index datatype and dimensions
vector_dims = config.get("reduce", dims)
if config["datatype"] == "float16":
    query_vector = np.random.random(vector_dims).astype(np.float16)
else:
    query_vector = np.random.random(vector_dims).astype(np.float32)
# Perform a vector similarity search
query = VectorQuery(
    vector=query_vector.tolist(),
    vector_field_name="embedding",
    return_fields=["content", "category"],
    num_results=5
)
results = index.query(query)
print("🔍 Vector Search Results:")
print("=" * 50)
for i, result in enumerate(results, 1):
    distance = result.get('vector_distance', 'N/A')
    print(f"{i}. [{result['category']}] {result['content']}")
    print(f" Distance: {distance:.4f}" if isinstance(distance, (int, float)) else f" Distance: {distance}")
    print()
🔍 Vector Search Results:
==================================================
Understanding Compression Types
SVS-VAMANA supports different compression algorithms that trade off between memory usage and search quality. Let's explore the available options.
# Compare different compression priorities
print("Compression Recommendations for Different Priorities:")
print("=" * 60)
priorities = ["memory", "speed", "balanced"]
for priority in priorities:
    config = CompressionAdvisor.recommend(dims=dims, priority=priority)
    savings = CompressionAdvisor.estimate_memory_savings(
        config["compression"],
        dims,
        config.get("reduce")
    )
    print(f"\n{priority.upper()} Priority:")
    print(f" Compression: {config['compression']}")
    print(f" Datatype: {config['datatype']}")
    if "reduce" in config:
        print(f" Dimensionality reduction: {dims} → {config['reduce']}")
    print(f" Search window size: {config['search_window_size']}")
    print(f" Memory savings: {savings}%")
Compression Recommendations for Different Priorities:
============================================================
MEMORY Priority:
Compression: LeanVec4x8
Datatype: float16
Dimensionality reduction: 1024 → 512
Search window size: 20
Memory savings: 81.2%
SPEED Priority:
Compression: LeanVec4x8
Datatype: float16
Dimensionality reduction: 1024 → 256
Search window size: 40
Memory savings: 90.6%
BALANCED Priority:
Compression: LeanVec4x8
Datatype: float16
Dimensionality reduction: 1024 → 512
Search window size: 30
Memory savings: 81.2%
Compression Types Explained
SVS-VAMANA offers several compression algorithms:
LVQ (Locally-adaptive Vector Quantization), savings relative to float32
- LVQ4: 4 bits per dimension (87.5% memory savings)
- LVQ4x4: 8 bits per dimension (75% memory savings)
- LVQ4x8: 12 bits per dimension (62.5% memory savings)
- LVQ8: 8 bits per dimension (75% memory savings)
LeanVec (Compression + Dimensionality Reduction)
- LeanVec4x8: 12 bits per dimension + dimensionality reduction
- LeanVec8x8: 16 bits per dimension + dimensionality reduction
The CompressionAdvisor automatically chooses the best compression type based on your vector dimensions and priority.
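The savings percentages in this notebook are consistent with simple arithmetic: compare the compressed bits per vector with a 32-bit float per original dimension. The sketch below reproduces that arithmetic. It is an assumption about how the figures arise, not the `CompressionAdvisor` source, and the name `estimated_savings` is hypothetical.

```python
# Bits per stored dimension, taken from the list above.
BITS_PER_DIM = {
    "LVQ4": 4, "LVQ4x4": 8, "LVQ4x8": 12, "LVQ8": 8,
    "LeanVec4x8": 12, "LeanVec8x8": 16,
}


def estimated_savings(compression, dims, reduce=None, baseline_bits=32):
    """Percent memory saved versus an uncompressed float32 vector (assumed baseline)."""
    stored_dims = reduce if reduce is not None else dims
    compressed_bits = BITS_PER_DIM[compression] * stored_dims
    return 100.0 * (1 - compressed_bits / (baseline_bits * dims))


print(estimated_savings("LVQ4x4", 768))            # 75.0
print(estimated_savings("LeanVec4x8", 1024, 512))  # 81.25
print(estimated_savings("LeanVec4x8", 1024, 256))  # 90.625
```

For LeanVec, halving the stored dimensions halves the compressed size again, which is why 12 bits per dimension can still save more than LVQ4x4 at 8 bits.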
# Demonstrate compression savings for different vector dimensions
test_dimensions = [384, 768, 1024, 1536, 3072]
print("Memory Savings by Vector Dimension:")
print("=" * 50)
print(f"{'Dims':<6} {'Compression':<12} {'Savings':<8} {'Strategy'}")
print("-" * 50)
for dims in test_dimensions:
    config = CompressionAdvisor.recommend(dims=dims, priority="balanced")
    savings = CompressionAdvisor.estimate_memory_savings(
        config["compression"],
        dims,
        config.get("reduce")
    )
    strategy = "LeanVec" if dims >= 1024 else "LVQ"
    print(f"{dims:<6} {config['compression']:<12} {savings:>6.1f}% {strategy}")
Memory Savings by Vector Dimension:
==================================================
Dims Compression Savings Strategy
--------------------------------------------------
384 LVQ4x4 75.0% LVQ
768 LVQ4x4 75.0% LVQ
1024 LeanVec4x8 81.2% LeanVec
1536 LeanVec4x8 81.2% LeanVec
3072 LeanVec4x8 81.2% LeanVec
Hybrid Queries with SVS-VAMANA
SVS-VAMANA can be combined with other Redis search capabilities for powerful hybrid queries that filter by metadata while performing vector similarity search.
# Perform a hybrid search: vector similarity + category filter
hybrid_query = VectorQuery(
    vector=query_vector.tolist(),
    vector_field_name="embedding",
    return_fields=["content", "category"],
    num_results=3
)
# Add a filter to only search within "technology" category
hybrid_query.set_filter("@category:{technology}")
filtered_results = index.query(hybrid_query)
print("🔍 Hybrid Search Results (Technology category only):")
print("=" * 55)
for i, result in enumerate(filtered_results, 1):
    distance = result.get('vector_distance', 'N/A')
    print(f"{i}. [{result['category']}] {result['content']}")
    print(f" Distance: {distance:.4f}" if isinstance(distance, (int, float)) else f" Distance: {distance}")
    print()
🔍 Hybrid Search Results (Technology category only):
=======================================================
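Conceptually, a hybrid query restricts the candidate set by metadata and ranks what remains by vector distance. The brute-force sketch below shows those semantics on a few hand-made 2-D vectors. It is only an illustration: the data and the helper `filtered_knn` are hypothetical, and the way Redis actually executes filtered vector queries may differ.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 0 for identical directions, larger for dissimilar ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def filtered_knn(docs, query, category, k):
    """Keep only docs in `category`, then return the k closest by cosine distance."""
    candidates = [d for d in docs if d["category"] == category]
    return sorted(candidates, key=lambda d: cosine_distance(d["embedding"], query))[:k]


docs = [
    {"content": "ML algorithms", "category": "technology", "embedding": [0.9, 0.1]},
    {"content": "Pasta recipes", "category": "food", "embedding": [1.0, 0.0]},
    {"content": "Computer vision", "category": "technology", "embedding": [0.2, 0.8]},
]
hits = filtered_knn(docs, query=[1.0, 0.0], category="technology", k=2)
print([h["content"] for h in hits])
```

The "food" document is the closest vector overall, yet it never appears in the results because the filter removes it before ranking.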
Performance Monitoring
Let's examine the index statistics to understand the performance characteristics of our SVS-VAMANA index.
# Get detailed index information
info = index.info()
print("📊 Index Statistics:")
print("=" * 30)
print(f"Documents: {info.get('num_docs', 0)}")
# Handle vector_index_sz_mb which might be a string
vector_size = info.get('vector_index_sz_mb', 0)
if isinstance(vector_size, str):
    try:
        vector_size = float(vector_size)
    except ValueError:
        vector_size = 0.0
print(f"Vector index size: {vector_size:.2f} MB")
# Handle total_indexing_time (reported in milliseconds), which might also be a string
indexing_time = info.get('total_indexing_time', 0)
if isinstance(indexing_time, str):
    try:
        indexing_time = float(indexing_time)
    except ValueError:
        indexing_time = 0.0
print(f"Total indexing time: {indexing_time:.2f} ms")
# Calculate memory efficiency
if info.get('num_docs', 0) > 0 and vector_size > 0:
    mb_per_doc = vector_size / info.get('num_docs', 1)
    print(f"Memory per document: {mb_per_doc:.4f} MB")
    # Estimate for larger datasets
    for scale in [1000, 10000, 100000]:
        estimated_mb = mb_per_doc * scale
        print(f"Estimated size for {scale:,} docs: {estimated_mb:.1f} MB")
else:
    print("Memory efficiency calculation requires documents and vector index size > 0")
📊 Index Statistics:
==============================
Documents: 0
Vector index size: 0.00 MB
Total indexing time: 1.58 ms
Memory efficiency calculation requires documents and vector index size > 0
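The repeated string-to-float handling above can be folded into one small helper. The sketch below also reads `hash_indexing_failures`, on the assumption that the info dictionary carries that field; the dictionary used here is a stand-in, not output from a real index, and `to_float` and `summarize` are hypothetical names.

```python
def to_float(value, default=0.0):
    """Coerce an index-info value (number or numeric string) to float."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return default


def summarize(info):
    """Pick a few numeric fields out of an index-info dictionary."""
    return {
        "num_docs": to_float(info.get("num_docs")),
        "vector_index_sz_mb": to_float(info.get("vector_index_sz_mb")),
        "hash_indexing_failures": to_float(info.get("hash_indexing_failures")),
    }


# Stand-in dictionary for illustration; real values would come from index.info().
fake_info = {"num_docs": "0", "vector_index_sz_mb": "0.0", "hash_indexing_failures": "10"}
summary = summarize(fake_info)
if summary["num_docs"] == 0 and summary["hash_indexing_failures"] > 0:
    print("Documents were written but failed to index; check vector size and datatype.")
```

A document count of zero right after a load is worth investigating with a check like this, since a mismatch between the stored vectors and the schema is one possible cause.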
Manual Configuration (Advanced)
For advanced users who want full control over SVS-VAMANA parameters, you can manually configure the algorithm instead of using CompressionAdvisor.
# Example of manual SVS-VAMANA configuration
manual_schema = {
    "index": {
        "name": "svs_manual",
        "prefix": "manual",
    },
    "fields": [
        {"name": "content", "type": "text"},
        {
            "name": "embedding",
            "type": "vector",
            "attrs": {
                "dims": 768,
                "algorithm": "svs-vamana",
                "datatype": "float32",
                "distance_metric": "cosine",
                # Graph construction parameters
                "graph_max_degree": 64,  # Higher = better recall, more memory
                "construction_window_size": 300,  # Higher = better quality, slower build
                # Search parameters
                "search_window_size": 40,  # Higher = better recall, slower search
                # Compression settings
                "compression": "LVQ4x4",  # Choose compression type
                "training_threshold": 10000,  # Min vectors before compression training
            }
        }
    ]
}
print("Manual SVS-VAMANA Configuration:")
print("=" * 40)
vector_attrs = manual_schema["fields"][1]["attrs"]
for key, value in vector_attrs.items():
    if key != "dims":  # Skip dims as it's obvious
        print(f" {key}: {value}")
# Calculate memory savings for this configuration
manual_savings = CompressionAdvisor.estimate_memory_savings(
    "LVQ4x4", 768, None
)
print(f"\nEstimated memory savings: {manual_savings}%")
Manual SVS-VAMANA Configuration:
========================================
algorithm: svs-vamana
datatype: float32
distance_metric: cosine
graph_max_degree: 64
construction_window_size: 300
search_window_size: 40
compression: LVQ4x4
training_threshold: 10000
Estimated memory savings: 75.0%
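When writing the attributes by hand, a quick sanity check can catch mistakes before the index is created. The sketch below encodes the constraints mentioned in this notebook (FLOAT16/FLOAT32 only, the listed compression names) plus two assumptions of ours: that `reduce` is only meaningful with LeanVec and that it should be smaller than `dims`. The helper `check_svs_attrs` is hypothetical and not part of redisvl.

```python
SUPPORTED_DATATYPES = {"float16", "float32"}
LVQ_TYPES = {"LVQ4", "LVQ4x4", "LVQ4x8", "LVQ8"}
LEANVEC_TYPES = {"LeanVec4x8", "LeanVec8x8"}


def check_svs_attrs(attrs):
    """Return a list of problems found in a vector field's attrs (empty if none)."""
    problems = []
    if attrs.get("datatype") not in SUPPORTED_DATATYPES:
        problems.append("datatype must be float16 or float32")
    compression = attrs.get("compression")
    if compression is not None and compression not in LVQ_TYPES | LEANVEC_TYPES:
        problems.append(f"unknown compression: {compression}")
    reduce = attrs.get("reduce")
    if reduce is not None:
        # Assumption: dimensionality reduction only applies to LeanVec.
        if compression not in LEANVEC_TYPES:
            problems.append("reduce only applies to LeanVec compression")
        # Assumption: the reduced dimension must be below the original one.
        if reduce >= attrs["dims"]:
            problems.append("reduce should be smaller than dims")
    return problems


good = {"dims": 768, "datatype": "float32", "compression": "LVQ4x4"}
bad = {"dims": 768, "datatype": "float64", "compression": "LVQ4x4", "reduce": 768}
print(check_svs_attrs(good))
print(check_svs_attrs(bad))
```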
Best Practices and Tips
When to Use SVS-VAMANA
- Large datasets (>10K vectors) where memory efficiency matters
- High-dimensional vectors (>512 dimensions) that benefit from compression
- Applications that can tolerate slight recall trade-offs for speed and memory savings
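To judge whether a recall trade-off is tolerable, it helps to measure it. A common approach is to compare the approximate results with exact (brute-force) results for the same query and compute recall@k. The sketch below shows the calculation; the document ids are made up for illustration and `recall_at_k` is a hypothetical helper.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k neighbors that the approximate search returned."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k


exact = ["doc:1", "doc:7", "doc:3", "doc:9", "doc:4"]   # ground truth, hypothetical
approx = ["doc:1", "doc:3", "doc:7", "doc:4", "doc:8"]  # index results, hypothetical
print(recall_at_k(approx, exact, k=5))  # 0.8
```

Averaging this value over a representative set of queries gives a recall figure that can be compared across compression and search window settings.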
Parameter Tuning Guidelines
- Start with CompressionAdvisor recommendations
- Increase search_window_size if you need higher recall
- Use LeanVec for high-dimensional vectors (≥1024 dims)
- Use LVQ for lower-dimensional vectors (<1024 dims)
Performance Considerations
- Index build time increases with higher construction_window_size
- Search latency increases with higher search_window_size
- Memory usage decreases with more aggressive compression
- Recall quality may decrease with more aggressive compression
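The link between the search window and latency can be seen in a toy best-first graph search that keeps at most `window_size` candidates. This is a simplified sketch of the general technique used by graph indexes, not the SVS implementation; the chain-shaped graph and all names are hypothetical.

```python
def greedy_search(graph, points, query, entry, window_size):
    """Best-first graph search that keeps at most `window_size` candidates.

    Returns (candidate nodes sorted by distance, number of distance evaluations).
    """
    def dist(node):
        return sum((a - b) ** 2 for a, b in zip(points[node], query))

    visited = {entry}
    expanded = set()
    evaluations = 1
    window = [(dist(entry), entry)]
    while True:
        # Pick the closest candidate in the window that has not been expanded yet.
        pending = [item for item in window if item[1] not in expanded]
        if not pending:
            break
        _, node = pending[0]
        expanded.add(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                evaluations += 1
                window.append((dist(neighbor), neighbor))
        # Keep only the best `window_size` candidates.
        window = sorted(window)[:window_size]
    return [node for _, node in window], evaluations


# Ten points on a line, each linked to its immediate neighbors.
points = {i: (float(i),) for i in range(10)}
graph = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}

narrow, narrow_evals = greedy_search(graph, points, (7.2,), entry=0, window_size=1)
wide, wide_evals = greedy_search(graph, points, (7.2,), entry=0, window_size=3)
print(narrow, narrow_evals)
print(wide, wide_evals)
```

A wider window keeps more candidates alive, so the search evaluates at least as many distances and can return more neighbors. That is the recall versus latency trade-off described for `search_window_size`.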
Cleanup
Clean up the index created in this demo.
# Clean up the demo index
try:
    index.delete()
    print("Cleaned up svs_demo index")
except Exception:
    print("- svs_demo index was already deleted or doesn't exist")
Cleaned up svs_demo index