SVS-VAMANA Vector Search

In this notebook, we will explore SVS-VAMANA (Scalable Vector Search with VAMANA graph algorithm), a graph-based vector search algorithm that is optimized to work with compression methods to reduce memory usage. It combines the Vamana graph algorithm with advanced compression techniques (LVQ and LeanVec) and is optimized for Intel hardware.

How it works

Vamana builds a single-layer proximity graph and prunes edges during construction based on tunable parameters, similar to HNSW but with a simpler, flatter structure. The compression methods apply per-vector normalization and scalar quantization, learning parameters directly from the data to enable fast, on-the-fly distance computations with a SIMD-optimized memory layout.
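The per-vector scalar-quantization idea can be sketched in a few lines of NumPy. This is a simplified illustration of the concept, not Intel's actual LVQ implementation (which also learns parameters across the dataset):

```python
import numpy as np

def quantize_4bit(vec):
    """Per-vector scalar quantization to 4 bits (16 levels)."""
    lo, hi = vec.min(), vec.max()
    scale = (hi - lo) / 15  # 15 intervals between 16 levels
    codes = np.round((vec - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Reconstruct an approximation of the original vector."""
    return codes.astype(np.float32) * scale + lo

rng = np.random.default_rng(0)
v = rng.random(1024).astype(np.float32)
codes, lo, scale = quantize_4bit(v)
v_hat = dequantize(codes, lo, scale)

# 4-bit codes use 1/8 the bits of float32 (ignoring the two per-vector scalars),
# and the reconstruction error per dimension is bounded by half a quantization step
print(codes.max() <= 15, np.max(np.abs(v - v_hat)) <= scale / 2 + 1e-6)
```

Distance computations can then run directly on the compact codes, which is where the SIMD-friendly layout pays off.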

SVS-VAMANA offers:

  • Fast approximate nearest neighbor search using graph-based algorithms
  • Vector compression (LVQ, LeanVec) with up to 87.5% memory savings
  • Dimensionality reduction (optional, with LeanVec)
  • Automatic performance optimization through CompressionAdvisor

Use SVS-VAMANA for:

  • Large datasets where memory is expensive
  • Cloud deployments with memory-based pricing
  • Workloads where 90-95% recall is acceptable
  • High-dimensional vectors (>1024 dims) paired with LeanVec compression

Table of Contents

  1. Prerequisites
  2. Quick Start with CompressionAdvisor
  3. Creating an SVS-VAMANA Index
  4. Loading Sample Data
  5. Performing Vector Searches
  6. Understanding Compression Types
  7. Hybrid Queries with SVS-VAMANA
  8. Performance Monitoring
  9. Manual Configuration (Advanced)
  10. Best Practices and Tips
  11. Cleanup

Prerequisites

Before running this notebook, ensure you have:

  1. Installed redisvl and activated that environment for this notebook
  2. A running Redis Stack instance with:
    • Redis >= 8.2.0
    • RediSearch >= 2.8.10

For example, you can run Redis Stack locally with Docker:

docker run -d -p 6379:6379 -p 8001:8001 redis/redis-stack:latest

Note: SVS-VAMANA only supports FLOAT16 and FLOAT32 datatypes.
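The datatype choice directly affects raw vector storage: FLOAT16 halves the bytes per vector relative to FLOAT32, before any graph or compression overhead. A quick check with NumPy:

```python
import numpy as np

dims = 1024
v32 = np.random.random(dims).astype(np.float32)
v16 = v32.astype(np.float16)

print(v32.nbytes)  # 4096 bytes (4 bytes per dimension)
print(v16.nbytes)  # 2048 bytes (2 bytes per dimension)
```

The trade-off is reduced precision, which is usually acceptable for embedding vectors.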

# Import necessary modules
import numpy as np
from redisvl.index import SearchIndex
from redisvl.query import VectorQuery
from redisvl.utils import CompressionAdvisor
from redisvl.redis.utils import array_to_buffer

# Set random seed for reproducible results
np.random.seed(42)
# Redis connection
REDIS_URL = "redis://localhost:6379"

Quick Start with CompressionAdvisor

The easiest way to get started with SVS-VAMANA is using the CompressionAdvisor utility, which automatically recommends optimal configuration based on your vector dimensions and performance priorities.

# Get recommended configuration for common embedding dimensions
dims = 1024  # Common embedding dimensions (works reliably with SVS-VAMANA)

config = CompressionAdvisor.recommend(
    dims=dims,
    priority="balanced"  # Options: "memory", "speed", "balanced"
)

print("Recommended Configuration:")
for key, value in config.items():
    print(f"  {key}: {value}")

# Estimate memory savings
savings = CompressionAdvisor.estimate_memory_savings(
    config["compression"],
    dims,
    config.get("reduce")
)
print(f"\nEstimated Memory Savings: {savings}%")
Recommended Configuration:
  algorithm: svs-vamana
  datatype: float16
  graph_max_degree: 64
  construction_window_size: 300
  compression: LeanVec4x8
  reduce: 512
  search_window_size: 30

Estimated Memory Savings: 81.2%
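The 81.2% figure can be reproduced by hand if we assume the estimator compares the compressed representation (12 bits per LeanVec4x8 dimension, applied to the reduced dimensionality) against an uncompressed float32 baseline (32 bits per original dimension). That is an assumption about the estimator's arithmetic, not documented behavior:

```python
dims, reduced, bits = 1024, 512, 12   # LeanVec4x8 ~ 12 bits/dim on reduced vectors
baseline_bits = dims * 32             # float32 at original dimensionality
compressed_bits = reduced * bits
savings = (1 - compressed_bits / baseline_bits) * 100
print(round(savings, 1))  # 81.2
```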

Creating an SVS-VAMANA Index

Let's create an index using the recommended configuration. We'll use a simple schema with text content and vector embeddings.

# Create index schema with recommended SVS-VAMANA configuration
schema = {
    "index": {
        "name": "svs_demo",
        "prefix": "doc",
    },
    "fields": [
        {"name": "content", "type": "text"},
        {"name": "category", "type": "tag"},
        {
            "name": "embedding",
            "type": "vector",
            "attrs": {
                "dims": dims,
                **config,  # Use the recommended configuration
                "distance_metric": "cosine"
            }
        }
    ]
}

# Create the index
index = SearchIndex.from_dict(schema, redis_url=REDIS_URL)
index.create(overwrite=True)

print(f"✅ Created SVS-VAMANA index: {index.name}")
print(f"   Algorithm: {config['algorithm']}")
print(f"   Compression: {config['compression']}")
print(f"   Dimensions: {dims}")
if 'reduce' in config:
    print(f"   Reduced to: {config['reduce']} dimensions")
✅ Created SVS-VAMANA index: svs_demo
   Algorithm: svs-vamana
   Compression: LeanVec4x8
   Dimensions: 1024
   Reduced to: 512 dimensions

Loading Sample Data

Let's create some sample documents with embeddings to demonstrate SVS-VAMANA search capabilities.

# Generate sample data
sample_documents = [
    {"content": "Machine learning algorithms for data analysis", "category": "technology"},
    {"content": "Natural language processing and text understanding", "category": "technology"},
    {"content": "Computer vision and image recognition systems", "category": "technology"},
    {"content": "Delicious pasta recipes from Italy", "category": "food"},
    {"content": "Traditional French cooking techniques", "category": "food"},
    {"content": "Healthy meal planning and nutrition", "category": "food"},
    {"content": "Travel guide to European destinations", "category": "travel"},
    {"content": "Adventure hiking in mountain regions", "category": "travel"},
    {"content": "Cultural experiences in Asian cities", "category": "travel"},
    {"content": "Financial planning for retirement", "category": "finance"},
]

# Generate random embeddings for demonstration
# In practice, you would use a real embedding model
data_to_load = []

# Vectors must match the declared index dims (1024 here). With LeanVec,
# the dimensionality reduction to `reduce` happens inside the index,
# so clients always supply full-dimension vectors.
vector_dims = dims
print(f"Creating vectors with {vector_dims} dimensions")

for i, doc in enumerate(sample_documents):
    # Create a random vector with some category-based clustering
    base_vector = np.random.random(vector_dims).astype(np.float32)
    
    # Add some category-based similarity (optional, for demo purposes)
    # Note: hash() is salted per process unless PYTHONHASHSEED is set
    category_offset = hash(doc["category"]) % 100 / 1000.0
    base_vector[0] += category_offset
    
    # Convert to the datatype specified in config
    if config["datatype"] == "float16":
        base_vector = base_vector.astype(np.float16)
    
    data_to_load.append({
        "content": doc["content"],
        "category": doc["category"],
        "embedding": array_to_buffer(base_vector, dtype=config["datatype"])
    })

# Load data into the index
index.load(data_to_load)
print(f"✅ Loaded {len(data_to_load)} documents into the index")

# Wait a moment for indexing to complete
import time
time.sleep(2)

# Verify the data was loaded
info = index.info()
print(f"   Index now contains {info.get('num_docs', 0)} documents")
Creating vectors with 1024 dimensions
✅ Loaded 10 documents into the index
   Index now contains 10 documents

Performing Vector Searches

Now let's perform some vector similarity searches using our SVS-VAMANA index.

# Create a query vector (in practice, this would be an embedding of your query text)
# Important: the query vector must match the index datatype and its declared dims;
# with LeanVec, reduction is internal, so always query with full-dimension vectors
query_dtype = np.float16 if config["datatype"] == "float16" else np.float32
query_vector = np.random.random(dims).astype(query_dtype)

# Perform a vector similarity search
query = VectorQuery(
    vector=query_vector.tolist(),
    vector_field_name="embedding",
    return_fields=["content", "category"],
    num_results=5
)

results = index.query(query)

print("🔍 Vector Search Results:")
print("=" * 50)
for i, result in enumerate(results, 1):
    distance = result.get('vector_distance', 'N/A')
    print(f"{i}. [{result['category']}] {result['content']}")
    print(f"   Distance: {distance:.4f}" if isinstance(distance, (int, float)) else f"   Distance: {distance}")
    print()
🔍 Vector Search Results:
==================================================

Understanding Compression Types

SVS-VAMANA supports different compression algorithms that trade off between memory usage and search quality. Let's explore the available options.

# Compare different compression priorities
print("Compression Recommendations for Different Priorities:")
print("=" * 60)

priorities = ["memory", "speed", "balanced"]
for priority in priorities:
    config = CompressionAdvisor.recommend(dims=dims, priority=priority)
    savings = CompressionAdvisor.estimate_memory_savings(
        config["compression"],
        dims,
        config.get("reduce")
    )
    
    print(f"\n{priority.upper()} Priority:")
    print(f"  Compression: {config['compression']}")
    print(f"  Datatype: {config['datatype']}")
    if "reduce" in config:
        print(f"  Dimensionality reduction: {dims} → {config['reduce']}")
    print(f"  Search window size: {config['search_window_size']}")
    print(f"  Memory savings: {savings}%")
Compression Recommendations for Different Priorities:
============================================================

MEMORY Priority:
  Compression: LeanVec4x8
  Datatype: float16
  Dimensionality reduction: 1024 → 512
  Search window size: 20
  Memory savings: 81.2%

SPEED Priority:
  Compression: LeanVec4x8
  Datatype: float16
  Dimensionality reduction: 1024 → 256
  Search window size: 40
  Memory savings: 90.6%

BALANCED Priority:
  Compression: LeanVec4x8
  Datatype: float16
  Dimensionality reduction: 1024 → 512
  Search window size: 30
  Memory savings: 81.2%

Compression Types Explained

SVS-VAMANA offers several compression algorithms:

LVQ (Learned Vector Quantization)

  • LVQ4: 4 bits per dimension (87.5% memory savings)
  • LVQ4x4: 8 bits per dimension (75% memory savings)
  • LVQ4x8: 12 bits per dimension (62.5% memory savings)
  • LVQ8: 8 bits per dimension (75% memory savings)

LeanVec (Compression + Dimensionality Reduction)

  • LeanVec4x8: 12 bits per dimension + dimensionality reduction
  • LeanVec8x8: 16 bits per dimension + dimensionality reduction

The CompressionAdvisor automatically chooses the best compression type based on your vector dimensions and priority.
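The LVQ savings figures above follow directly from bits per dimension versus a 32-bit float32 baseline. A back-of-the-envelope check (ignoring the small per-vector scale/offset overhead):

```python
# Bits per dimension for each LVQ variant, per the list above
lvq_bits = {"LVQ4": 4, "LVQ4x4": 8, "LVQ4x8": 12, "LVQ8": 8}

for name, bits in lvq_bits.items():
    savings = (1 - bits / 32) * 100  # relative to 32-bit float32
    print(f"{name}: {savings:.1f}% savings")
```

LeanVec variants improve on this further by also shrinking the number of dimensions, which is why the advisor prefers them at high dimensionality.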

# Demonstrate compression savings for different vector dimensions
test_dimensions = [384, 768, 1024, 1536, 3072]

print("Memory Savings by Vector Dimension:")
print("=" * 50)
print(f"{'Dims':<6} {'Compression':<12} {'Savings':<8} {'Strategy'}")
print("-" * 50)

for d in test_dimensions:
    # Use local names so we don't clobber the notebook-level dims/config
    config_d = CompressionAdvisor.recommend(dims=d, priority="balanced")
    savings = CompressionAdvisor.estimate_memory_savings(
        config_d["compression"],
        d,
        config_d.get("reduce")
    )
    
    strategy = "LeanVec" if d >= 1024 else "LVQ"
    print(f"{d:<6} {config_d['compression']:<12} {savings:>6.1f}% {strategy}")
Memory Savings by Vector Dimension:
==================================================
Dims   Compression  Savings  Strategy
--------------------------------------------------
384    LVQ4x4         75.0% LVQ
768    LVQ4x4         75.0% LVQ
1024   LeanVec4x8     81.2% LeanVec
1536   LeanVec4x8     81.2% LeanVec
3072   LeanVec4x8     81.2% LeanVec

Hybrid Queries with SVS-VAMANA

SVS-VAMANA can be combined with other Redis search capabilities for powerful hybrid queries that filter by metadata while performing vector similarity search.

# Perform a hybrid search: vector similarity + category filter
hybrid_query = VectorQuery(
    vector=query_vector.tolist(),
    vector_field_name="embedding",
    return_fields=["content", "category"],
    num_results=3
)

# Add a filter to only search within "technology" category
hybrid_query.set_filter("@category:{technology}")

filtered_results = index.query(hybrid_query)

print("🔍 Hybrid Search Results (Technology category only):")
print("=" * 55)
for i, result in enumerate(filtered_results, 1):
    distance = result.get('vector_distance', 'N/A')
    print(f"{i}. [{result['category']}] {result['content']}")
    print(f"   Distance: {distance:.4f}" if isinstance(distance, (int, float)) else f"   Distance: {distance}")
    print()
🔍 Hybrid Search Results (Technology category only):
=======================================================

Performance Monitoring

Let's examine the index statistics to understand the performance characteristics of our SVS-VAMANA index.

# Get detailed index information
info = index.info()

print("📊 Index Statistics:")
print("=" * 30)
print(f"Documents: {info.get('num_docs', 0)}")

# Handle vector_index_sz_mb which might be a string
vector_size = info.get('vector_index_sz_mb', 0)
if isinstance(vector_size, str):
    try:
        vector_size = float(vector_size)
    except ValueError:
        vector_size = 0.0
print(f"Vector index size: {vector_size:.2f} MB")

# Handle total_indexing_time which might also be a string
indexing_time = info.get('total_indexing_time', 0)
if isinstance(indexing_time, str):
    try:
        indexing_time = float(indexing_time)
    except ValueError:
        indexing_time = 0.0
print(f"Total indexing time: {indexing_time:.2f} seconds")

# Calculate memory efficiency
if info.get('num_docs', 0) > 0 and vector_size > 0:
    mb_per_doc = vector_size / info.get('num_docs', 1)
    print(f"Memory per document: {mb_per_doc:.4f} MB")
    
    # Estimate for larger datasets
    for scale in [1000, 10000, 100000]:
        estimated_mb = mb_per_doc * scale
        print(f"Estimated size for {scale:,} docs: {estimated_mb:.1f} MB")
else:
    print("Memory efficiency calculation requires documents and vector index size > 0")
📊 Index Statistics:
==============================
Documents: 0
Vector index size: 0.00 MB
Total indexing time: 1.58 seconds
Memory efficiency calculation requires documents and vector index size > 0

Manual Configuration (Advanced)

For advanced users who want full control over SVS-VAMANA parameters, you can manually configure the algorithm instead of using CompressionAdvisor.

# Example of manual SVS-VAMANA configuration
manual_schema = {
    "index": {
        "name": "svs_manual",
        "prefix": "manual",
    },
    "fields": [
        {"name": "content", "type": "text"},
        {
            "name": "embedding",
            "type": "vector",
            "attrs": {
                "dims": 768,
                "algorithm": "svs-vamana",
                "datatype": "float32",
                "distance_metric": "cosine",
                
                # Graph construction parameters
                "graph_max_degree": 64,           # Higher = better recall, more memory
                "construction_window_size": 300,  # Higher = better quality, slower build
                
                # Search parameters
                "search_window_size": 40,         # Higher = better recall, slower search
                
                # Compression settings
                "compression": "LVQ4x4",          # Choose compression type
                "training_threshold": 10000,      # Min vectors before compression training
            }
        }
    ]
}

print("Manual SVS-VAMANA Configuration:")
print("=" * 40)
vector_attrs = manual_schema["fields"][1]["attrs"]
for key, value in vector_attrs.items():
    if key != "dims":  # Skip dims as it's obvious
        print(f"  {key}: {value}")

# Calculate memory savings for this configuration
manual_savings = CompressionAdvisor.estimate_memory_savings(
    "LVQ4x4", 768, None
)
print(f"\nEstimated memory savings: {manual_savings}%")
Manual SVS-VAMANA Configuration:
========================================
  algorithm: svs-vamana
  datatype: float32
  distance_metric: cosine
  graph_max_degree: 64
  construction_window_size: 300
  search_window_size: 40
  compression: LVQ4x4
  training_threshold: 10000

Estimated memory savings: 75.0%

Best Practices and Tips

When to Use SVS-VAMANA

  • Large datasets (>10K vectors) where memory efficiency matters
  • High-dimensional vectors (>512 dimensions) that benefit from compression
  • Applications that can tolerate slight recall trade-offs for speed and memory savings

Parameter Tuning Guidelines

  • Start with CompressionAdvisor recommendations
  • Increase search_window_size if you need higher recall
  • Use LeanVec for high-dimensional vectors (≥1024 dims)
  • Use LVQ for lower-dimensional vectors (<1024 dims)

Performance Considerations

  • Index build time increases with higher construction_window_size
  • Search latency increases with higher search_window_size
  • Memory usage decreases with more aggressive compression
  • Recall quality may decrease with more aggressive compression
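Recall trade-offs like these are easy to quantify offline: compare exact brute-force neighbors against neighbors computed on lossily compressed vectors. A self-contained sketch reusing the 4-bit quantization idea from earlier (illustrative only; real measurements should query the actual index with different compression settings):

```python
import numpy as np

rng = np.random.default_rng(7)
data = rng.random((500, 64)).astype(np.float32)
query = rng.random(64).astype(np.float32)
k = 10

def top_k(vectors, q, k):
    # Exact nearest neighbors by L2 distance
    dists = np.linalg.norm(vectors - q, axis=1)
    return set(np.argsort(dists)[:k])

# Simulate lossy compression: quantize each vector to 4 bits and reconstruct
lo = data.min(axis=1, keepdims=True)
scale = (data.max(axis=1, keepdims=True) - lo) / 15
approx = np.round((data - lo) / scale) * scale + lo

exact = top_k(data, query, k)
compressed = top_k(approx, query, k)
recall = len(exact & compressed) / k
print(f"recall@{k}: {recall:.2f}")
```

Sweeping the number of quantization levels in a harness like this gives an intuition for how aggressively you can compress before recall drops below your target.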

Cleanup

Clean up the indices created in this demo.

# Clean up demo indices
try:
    index.delete()
    print("Cleaned up svs_demo index")
except Exception:
    print("- svs_demo index was already deleted or doesn't exist")
Cleaned up svs_demo index