{
  "id": "migrate-indexes",
  "title": "Migrate an Index",
  "url": "https://redis.io/docs/latest/develop/ai/redisvl/0.20.0/user_guide/how_to_guides/migrate-indexes/",
  "summary": "",
  "content": "\n\n\nThe index migrator is an **experimental** feature. APIs, CLI commands, and on-disk formats (plans, checkpoints, backups) may change in future releases. Review migration plans carefully before applying to production indexes.\n\n\nThis guide shows how to safely change your index schema using the RedisVL migrator.\n\n## Quick Start\n\nAdd a field to your index in 4 commands:\n\n```bash\n# 1. See what indexes exist\nrvl index listall --url redis://localhost:6379\n\n# 2. Use the wizard to build a migration plan\nrvl migrate wizard --index myindex --url redis://localhost:6379\n\n# 3. Apply the migration\nrvl migrate apply --plan migration_plan.yaml --backup-dir ./migration_backups --url redis://localhost:6379\n\n# 4. Verify the result\nrvl migrate validate --plan migration_plan.yaml --url redis://localhost:6379\n```\n\n## Prerequisites\n\n- Redis with the Search module (Redis Stack, Redis Cloud, or Redis Software)\n- An existing index to migrate\n- `redisvl` installed (`pip install redisvl`)\n\n```bash\n# Local development with Redis 8.0+ (recommended for full feature support)\ndocker run -d --name redis -p 6379:6379 redis:8.0\n```\n\n**Note:** Redis 8.0+ is required for INT8/UINT8 vector datatypes. 
SVS-VAMANA algorithm requires Redis 8.2+ and Intel AVX-512 hardware.\n\n## How It Works\n\nEvery migration follows the same three-phase flow: **describe what changed** (the patch),\n**generate a plan** (diffing the patch against the live schema), and **execute the plan**.\n\n### Single-Index Flow: wizard/plan then apply\n\n```default\nwizard (interactive)                   plan (non-interactive)\n        |                                    |\n        v                                    v\n  SchemaPatch YAML  \u003c----or----\u003e  SchemaPatch YAML\n        |                                    |\n        +------ planner.create_plan() -------+\n                       |\n                       v\n              MigrationPlan YAML\n                       |\n                       v\n              executor.apply()\n                       |\n                       v\n              MigrationReport YAML\n```\n\n**Phase 1: Build a SchemaPatch.**\nA patch is a small YAML file that declares *what you want to change*, not the full target schema.\nYou can build it interactively with `rvl migrate wizard`, or write it by hand. The patch has\nfive sections, each optional:\n\n| Patch Section   | What it does                                                                               |\n|-----------------|--------------------------------------------------------------------------------------------|\n| `add_fields`    | Adds new field definitions to the index                                                    |\n| `remove_fields` | Removes fields from the index (document data is kept, just no longer indexed)              |\n| `rename_fields` | Renames fields in both the index schema and all documents (HGET old, HSET new, HDEL old)   |\n| `update_fields` | Modifies field attributes: algorithm, datatype, distance metric, sortable, separator, etc. 
|\n| `index`         | Changes the index name or key prefix                                                       |\n\n**Phase 2: Generate a MigrationPlan.**\nThe planner connects to Redis, snapshots the live index schema and stats,\nthen merges the patch into the source schema to produce a `merged_target_schema`.\nIt classifies every change as supported or blocked and extracts rename operations.\n\nThe plan YAML contains:\n\n- `source`: frozen snapshot of the live index at planning time (schema, stats, key sample, prefixes)\n- `requested_changes`: the patch that was applied\n- `merged_target_schema`: source + patch = what the index will look like after migration\n- `diff_classification`: whether the migration is supported and any blocked reasons\n- `rename_operations`: extracted index renames, prefix changes, and field renames\n- `warnings`: any important notes (downtime required, lossy quantization, etc.)\n\nThe same patch produces different plans per index because each index has a different source schema.\n\n**Phase 3: Apply.**\nThe executor reads the plan and runs the migration steps:\n\n1. Enumerate keys (FT.AGGREGATE WITHCURSOR, falling back to SCAN)\n2. Field renames (pipelined HGET/HSET/HDEL)\n3. Prepare vector backups, if hash vector bytes will be quantized\n4. Drop index (FT.DROPINDEX, documents are preserved)\n5. Key prefix renames (RENAME or DUMP/RESTORE for cluster)\n6. Quantize vectors from backup (pipelined read/convert/write)\n7. Create index (FT.CREATE with merged target schema)\n8. Wait for re-indexing to complete\n9. Validate (doc count, schema match, key sample)\n\n`--backup-dir` / `backup_dir` is required before any apply starts. For\nquantization, the directory stores original vector bytes for resume and\nrollback. 
For index-only migrations, the directory is still validated and\nrecorded in the report, but no vector backup files are written.\n\n\nHash vector quantization is supported only when the Redis keys being\nquantized are not also indexed by another live RediSearch index that\nexpects the old vector datatype. Quantization rewrites vector bytes in\nthe document itself; any other index that covers the same key sees those\nnew bytes and may silently drop the document or fail to index it. If the\nsame documents are intentionally shared across multiple indexes, do not\nuse the migrator for that quantization change. Use an application-level\nmigration that creates new keys or fields and coordinates every affected\nindex schema.\n\n\n### Batch Flow: wizard/plan then batch-plan then batch-apply\n\nFor applying the same change across multiple indexes:\n\n```default\nSchemaPatch YAML  (shared, written once)\n        |\n        v\nbatch_planner.create_batch_plan()\n  for each index:\n    snapshot live schema\n    merge patch into source\n    if applicable: write per-index MigrationPlan\n    if not: mark skip_reason\n        |\n        v\nBatchPlan YAML\n  shared_patch: { ... }\n  indexes:\n    - name: idx_a, applicable: true, plan_path: plans/idx_a.yaml\n    - name: idx_b, applicable: true, plan_path: plans/idx_b.yaml\n    - name: idx_c, applicable: false, skip_reason: \"field not found\"\n        |\n        v\nbatch_executor.apply()\n  for each applicable index (sequentially):\n    executor.apply(per_index_plan)\n```\n\nThe batch planner takes a **single shared patch** and tests it against every target index.\nIndexes where the patch doesn’t apply (e.g., it references a field that doesn’t exist in that\nindex, or the change is blocked) are marked `applicable: false` with a `skip_reason` and skipped\nduring apply. 
Each applicable index gets its own full `MigrationPlan` written to disk.\n\nThis means you can review each per-index plan individually before running `batch-apply`.\n\n## Step 1: Discover Available Indexes\n\n```bash\nrvl index listall --url redis://localhost:6379\n```\n\n**Example output:**\n\n```default\nIndices:\n  1. products_idx\n  2. users_idx\n  3. orders_idx\n```\n\n## Step 2: Build Your Schema Change\n\nChoose one of these approaches:\n\n### Option A: Use the Wizard (Recommended)\n\nThe wizard guides you through building a migration interactively. Run:\n\n```bash\nrvl migrate wizard --index myindex --url redis://localhost:6379\n```\n\n**Example wizard session (adding a field):**\n\n```text\nBuilding a migration plan for index 'myindex'\nCurrent schema:\n- Index name: myindex\n- Storage type: hash\n  - title (text)\n  - embedding (vector)\n\nChoose an action:\n1. Add field        (text, tag, numeric, geo)\n2. Update field     (sortable, weight, separator, vector config)\n3. Remove field\n4. Rename field     (rename field in all documents)\n5. Rename index     (change index name)\n6. Change prefix    (rename all keys)\n7. Preview patch    (show pending changes as YAML)\n8. Finish\nEnter a number: 1\n\nField name: category\nField type options: text, tag, numeric, geo\nField type: tag\n  Sortable: enables sorting and aggregation on this field\nSortable [y/n]: n\n  Separator: character that splits multiple values (default: comma)\nSeparator [leave blank to keep existing/default]: |\n\nChoose an action:\n1. Add field        (text, tag, numeric, geo)\n2. Update field     (sortable, weight, separator, vector config)\n3. Remove field\n4. Rename field     (rename field in all documents)\n5. Rename index     (change index name)\n6. Change prefix    (rename all keys)\n7. Preview patch    (show pending changes as YAML)\n8. 
Finish\nEnter a number: 8\n\nMigration plan written to /path/to/migration_plan.yaml\nMode: drop_recreate\nSupported: True\nWarnings:\n- Index downtime is required\n```\n\n**Example wizard session (quantizing vectors):**\n\n```text\nChoose an action:\n1. Add field        (text, tag, numeric, geo)\n2. Update field     (sortable, weight, separator, vector config)\n3. Remove field\n4. Rename field     (rename field in all documents)\n5. Rename index     (change index name)\n6. Change prefix    (rename all keys)\n7. Preview patch    (show pending changes as YAML)\n8. Finish\nEnter a number: 2\n\nUpdatable fields:\n1. title (text)\n2. embedding (vector)\nSelect a field to update by number or name: 2\n\nCurrent vector config for 'embedding':\n  algorithm: HNSW\n  datatype: float32\n  distance_metric: cosine\n  dims: 384 (cannot be changed)\n  m: 16\n  ef_construction: 200\n\nLeave blank to keep current value.\n  Algorithm: vector search method (FLAT=brute force, HNSW=graph, SVS-VAMANA=compressed graph)\nAlgorithm [current: HNSW]:\n  Datatype: float16, float32, bfloat16, float64, int8, uint8\n            (float16 reduces memory ~50%, int8/uint8 reduce ~75%)\nDatatype [current: float32]: float16\n  Distance metric: how similarity is measured (cosine, l2, ip)\nDistance metric [current: cosine]:\n  M: number of connections per node (higher=better recall, more memory)\nM [current: 16]:\n  EF_CONSTRUCTION: build-time search depth (higher=better recall, slower build)\nEF_CONSTRUCTION [current: 200]:\n\nChoose an action:\n...\n8. 
Finish\nEnter a number: 8\n\nMigration plan written to /path/to/migration_plan.yaml\nMode: drop_recreate\nSupported: True\n```\n\n### Option B: Write a Schema Patch (YAML)\n\nCreate `schema_patch.yaml` manually:\n\n```yaml\nversion: 1\nchanges:\n  add_fields:\n    - name: category\n      type: tag\n      path: $.category\n      attrs:\n        separator: \"|\"\n  remove_fields:\n    - legacy_field\n  update_fields:\n    - name: title\n      attrs:\n        sortable: true\n    - name: embedding\n      attrs:\n        datatype: float16        # quantize vectors\n        algorithm: HNSW\n        distance_metric: cosine\n```\n\nThen generate the plan:\n\n```bash\nrvl migrate plan \\\n  --index myindex \\\n  --schema-patch schema_patch.yaml \\\n  --url redis://localhost:6379 \\\n  --plan-out migration_plan.yaml\n```\n\n### Option C: Provide a Target Schema\n\nIf you have the complete target schema, use it directly:\n\n```bash\nrvl migrate plan \\\n  --index myindex \\\n  --target-schema target_schema.yaml \\\n  --url redis://localhost:6379 \\\n  --plan-out migration_plan.yaml\n```\n\n## Step 3: Review the Migration Plan\n\nBefore applying, review `migration_plan.yaml`:\n\n```yaml\n# migration_plan.yaml (example)\nversion: 1\nmode: drop_recreate\n\nsource:\n  schema_snapshot:\n    index:\n      name: myindex\n      prefix: \"doc:\"\n      storage_type: json\n    fields:\n      - name: title\n        type: text\n      - name: embedding\n        type: vector\n        attrs:\n          dims: 384\n          algorithm: hnsw\n          datatype: float32\n  stats_snapshot:\n    num_docs: 10000\n  keyspace:\n    prefixes: [\"doc:\"]\n    key_sample: [\"doc:1\", \"doc:2\", \"doc:3\"]\n\nrequested_changes:\n  add_fields:\n    - name: category\n      type: tag\n\ndiff_classification:\n  supported: true\n  blocked_reasons: []\n\nrename_operations:\n  rename_index: null\n  change_prefix: null\n  rename_fields: []\n\nmerged_target_schema:\n  index:\n    name: myindex\n    prefix: 
\"doc:\"\n    storage_type: json\n  fields:\n    - name: title\n      type: text\n    - name: category\n      type: tag\n    - name: embedding\n      type: vector\n      attrs:\n        dims: 384\n        algorithm: hnsw\n        datatype: float32\n\nwarnings:\n  - \"Index downtime is required\"\n```\n\n**Key fields to check:**\n\n- `diff_classification.supported` - Must be `true` to proceed\n- `diff_classification.blocked_reasons` - Must be empty\n- `warnings` - Top-level warnings about the migration\n- `merged_target_schema` - The final schema after migration\n\n## Understanding Downtime Requirements\n\n**CRITICAL**: During a `drop_recreate` migration, your application must:\n\n| Requirement      | Description                                              |\n|------------------|----------------------------------------------------------|\n| **Pause reads**  | Index is unavailable during migration                    |\n| **Pause writes** | Writes during migration may be missed or cause conflicts |\n\n### Why Both Reads AND Writes Must Be Paused\n\n- **Reads**: The index definition is dropped and recreated. Any queries during this window will fail.\n- **Writes**: Redis updates indexes synchronously on every write. If your app writes documents while the index is dropped, those writes are not indexed. Additionally, if you’re quantizing vectors (float32 → float16), concurrent writes may conflict with the migration’s re-encoding process.\n\n### What \"Downtime\" Means\n\n| Downtime Type              | Reads   | Writes     | Safe?   
|\n|----------------------------|---------|------------|---------|\n| Full quiesce (recommended) | Stopped | Stopped    | **YES** |\n| Read-only pause            | Stopped | Continuing | **NO**  |\n| Active                     | Active  | Active     | **NO**  |\n\n### Recovery from Interrupted Migration\n\n| Interruption Point                | Documents           | Index      | Recovery                                                  |\n|-----------------------------------|---------------------|------------|-----------------------------------------------------------|\n| After drop, before quantize       | Unchanged           | **None**   | Re-run apply with the same `--backup-dir`                 |\n| During quantization               | Partially quantized | **None**   | Re-run with same `--backup-dir` to resume from last batch |\n| After quantization, before create | Quantized           | **None**   | Re-run apply (will recreate index)                        |\n| After create                      | Correct             | Rebuilding | Wait for index ready                                      |\n\nThe underlying documents are **never deleted** by `drop_recreate` mode. `--backup-dir` is required for apply and enables crash-safe recovery for vector quantization. See \"Crash-safe resume for quantization\" below.\n\n## Step 4: Apply the Migration\n\nThe `apply` command executes the migration. 
The index will be temporarily unavailable during the drop-recreate process.\n\n```bash\nrvl migrate apply \\\n  --plan migration_plan.yaml \\\n  --url redis://localhost:6379 \\\n  --backup-dir ./migration_backups \\\n  --report-out migration_report.yaml \\\n  --benchmark-out benchmark_report.yaml\n```\n\n### What `apply` does\n\nThe migration executor follows this sequence:\n\n**STEP 1: Enumerate keys** (before any modifications)\n\n- Discovers all document keys belonging to the source index\n- Uses `FT.AGGREGATE WITHCURSOR` for efficient enumeration\n- Falls back to `SCAN` if the index has indexing failures\n- Keys are stored in memory for quantization or rename operations\n\n**STEP 2: Field renames** (if renaming fields)\n\n- Renames document fields before the source index is dropped\n- Uses pipelined `HGET`/`HSET`/`HDEL` for Hash storage or JSON path updates for JSON storage\n- Skipped if the plan has no field rename operations\n\n**STEP 3: Back up original vectors** (if hash vector bytes will be quantized)\n\n- Single-worker hash quantization writes original vector bytes to `\u003cbackup-dir\u003e` before the index is dropped\n- Multi-worker hash quantization writes per-worker backup shards during the quantization phase after the drop\n- JSON datatype changes and index-only migrations validate and record `--backup-dir` but do not write vector backup files\n\n**STEP 4: Drop source index**\n\n- Issues `FT.DROPINDEX` to remove the index structure\n- **The underlying documents remain in Redis** - only the index metadata is deleted\n- After this point, the index is unavailable until the target index is recreated and ready\n\n**STEP 5: Key renames** (if changing key prefix)\n\n- If the migration changes the key prefix, renames each key from old prefix to new prefix\n- Skipped if no prefix change\n\n**STEP 6: Quantize vectors** (if changing hash vector datatype)\n\n- For each document in the enumerated key list:\n  - Reads the document (including the old vector)\n  - 
Converts the vector to the new datatype (e.g., float32 → float16)\n  - Writes back the converted vector to the same document\n- Processes documents in batches of 500 using Redis pipelines\n- Skipped for JSON storage (vectors are re-indexed automatically on recreate)\n- **Backup support**: `--backup-dir` is required and enables crash-safe recovery and rollback for vector quantization\n- **Shared-key limitation**: unsupported if the same Redis keys are also\n  indexed by another live index that expects the old vector datatype\n\n**STEP 7: Create target index**\n\n- Issues `FT.CREATE` with the merged target schema\n- Redis begins background indexing of existing documents\n\n**STEP 8: Wait for re-indexing**\n\n- Polls `FT.INFO` until indexing completes\n- The index becomes available for queries when this completes\n\n**Summary**: The migration preserves all documents, drops only the index structure, performs any document-level transformations (quantization, renames), then recreates the index with the new schema.\n\n### Async execution for large migrations\n\nFor large migrations (especially those involving vector quantization), use the `--async` flag:\n\n```bash\nrvl migrate apply \\\n  --plan migration_plan.yaml \\\n  --async \\\n  --backup-dir ./migration_backups \\\n  --url redis://localhost:6379\n```\n\n**What becomes async:**\n\n- Document enumeration during quantization (uses `FT.AGGREGATE WITHCURSOR` for index-specific enumeration, falling back to SCAN only if indexing failures exist)\n- Vector read/write operations (sequential async HGET, batched HSET via pipeline)\n- Index readiness polling (uses `asyncio.sleep()` instead of blocking)\n- Validation checks\n\n**What stays sync:**\n\n- CLI prompts and user interaction\n- YAML file reading/writing\n- Progress display\n\n**When to use async:**\n\n- Quantizing millions of vectors (float32 to float16)\n- Integrating into an async application\n\nFor most migrations (index-only changes, small datasets), sync mode is 
sufficient and simpler.\n\nSee [Index Migrations](https://redis.io/docs/latest/../../concepts/index-migrations) for detailed async vs sync guidance.\n\n### Crash-safe resume for quantization\n\nWhen migrating large datasets with vector quantization (e.g. float32 to float16), the re-encoding step can take minutes or hours. If the process is interrupted (crash, network drop, OOM kill), you don’t want to start over. The `--backup-dir` flag enables crash-safe recovery.\n\n#### How it works\n\nFor hash vector datatype changes, the migrator saves original vector bytes to disk before mutating them. Single-worker migrations create two files:\n\n```default\n\u003cbackup-dir\u003e/\n  migration_backup_\u003cindex_name\u003e.header   # JSON: phase, progress counters, field metadata\n  migration_backup_\u003cindex_name\u003e.data     # Binary: length-prefixed batches of original vectors\n```\n\nMulti-worker migrations also create a `.manifest` file at the canonical\nbackup path. The manifest records worker shard paths and key slices so a\nretry can resume even if the source index was already dropped.\n\nThe **header file** is a small JSON file that tracks progress through a state machine:\n\n```default\ndump → ready → index_dropped → active → completed → target_created → validated\n```\n\n- **dump**: original vectors are being read from Redis and written to the data file, one batch at a time\n- **ready**: all original vectors have been backed up; the source index may still be live\n- **index_dropped**: the source index definition has been dropped, but vectors have not all been rewritten\n- **active**: quantization is in progress; the header tracks which batches have been written back to Redis\n- **completed**: all batches have been quantized; target index creation may still be pending\n- **target_created**: the target index was recreated and Redis is re-indexing or ready for validation\n- **validated**: post-migration validation passed\n\nThe header is atomically updated (temp 
file + rename) after every batch, so a crash never corrupts it.\n\nThe **data file** is append-only binary. Each batch is stored as a 4-byte big-endian length prefix followed by a pickled blob containing the batch’s keys and their original vector bytes.\n\nOn resume, the executor loads the header, sees how many batches were already quantized (`quantize_completed_batches`), and skips ahead in the data file to continue from the next unfinished batch.\n\n**Disk usage:** approximately `num_docs × dims × bytes_per_element`. For example, 1M docs with 768-dim float32 vectors ≈ 2.9 GB.\n\n#### Step-by-step: using crash-safe resume\n\n**1. Estimate disk space (dry-run, no mutations):**\n\n```bash\nrvl migrate estimate --plan migration_plan.yaml\n```\n\nExample output:\n\n```text\nPre-migration disk space estimate:\n  Index: products_idx (1,000,000 documents)\n  Vector field 'embedding': 768 dims, float32 -\u003e float16\n\n  RDB snapshot (BGSAVE):        ~2.87 GB\n  AOF growth:                  not estimated (pass aof_enabled=True if AOF is on)\n  Total new disk required:      ~2.87 GB\n\n  Post-migration memory savings: ~1.43 GB (50% reduction)\n```\n\nIf AOF is enabled:\n\n```bash\nrvl migrate estimate --plan migration_plan.yaml --aof-enabled\n```\n\n**2. Apply with backup enabled:**\n\n```bash\nrvl migrate apply \\\n  --plan migration_plan.yaml \\\n  --backup-dir /tmp/migration_backups \\\n  --url redis://localhost:6379 \\\n  --report-out migration_report.yaml\n```\n\nThe `--backup-dir` flag takes a directory path. If no backup exists there, a new one is created. If one already exists (from a previous interrupted run), the migrator resumes from where it left off. A `completed` backup is treated as a no-op resume only when the live index already matches the target schema; after rollback, the live index matches the source schema, so the old completed backup is treated as stale and a fresh backup is written.\n\n**3. 
If the process crashes or is interrupted:**\n\nThe header file will contain the progress:\n\n```json\n{\n  \"index_name\": \"products_idx\",\n  \"fields\": {\"embedding\": {\"source\": \"float32\", \"target\": \"float16\", \"dims\": 768}},\n  \"batch_size\": 500,\n  \"phase\": \"active\",\n  \"dump_completed_batches\": 2000,\n  \"quantize_completed_batches\": 900\n}\n```\n\nThis tells you: all 2000 batches of original vectors were backed up, and 900 of them have been quantized so far.\n\n**4. Resume the migration:**\n\nRe-run the exact same command:\n\n```bash\nrvl migrate apply \\\n  --plan migration_plan.yaml \\\n  --backup-dir /tmp/migration_backups \\\n  --url redis://localhost:6379 \\\n  --report-out migration_report.yaml\n```\n\nThe migrator will:\n\n- Detect the existing backup and skip already-quantized batches\n- Continue quantizing from batch 901 onward\n- Print progress like `Quantize vectors: 450,000/1,000,000 docs`\n\n**5. On successful completion:**\n\nThe backup phase is set to `completed`. Backup files are **always retained** on disk for post-migration auditing and rollback. Delete them manually from `--backup-dir` once you have verified the migrated data and no longer need a recovery path.\n\n#### Limitations\n\n- **Same-width conversions** (float16 to bfloat16, or int8 to uint8) are **not supported** for resume. These conversions cannot be detected by byte-width inspection, so idempotent skip is impossible.\n- **Shared keys across indexes** are **not supported** for hash vector\n  quantization. The migrator mutates vector bytes in the Redis document\n  key; if another index also covers that key and still expects the old\n  datatype, the document may be dropped from that index or fail to\n  re-index.\n- **JSON storage** does not need vector re-encoding (Redis re-indexes JSON vectors on `FT.CREATE`). The backup directory is still required, validated, and recorded, but no vector backup files are written.\n- The backup must match the migration plan. 
If you change the plan, delete the old backup directory and start fresh.\n\n## Step 5: Validate the Result\n\nValidation happens automatically during `apply`, but you can run it separately:\n\n```bash\nrvl migrate validate \\\n  --plan migration_plan.yaml \\\n  --url redis://localhost:6379 \\\n  --report-out migration_report.yaml\n```\n\n**Validation checks:**\n\n- Live schema matches `merged_target_schema`\n- Document count matches the source snapshot\n- Sampled keys still exist\n- No increase in indexing failures\n\n## What’s Supported\n\n| Change                                                   | Supported   | Notes                                                                                                             |\n|----------------------------------------------------------|-------------|-------------------------------------------------------------------------------------------------------------------|\n| Add text/tag/numeric/geo field                           | ✅          |                                                                                                                   |\n| Remove a field                                           | ✅          |                                                                                                                   |\n| Rename a field                                           | ✅          | Renames field in all documents                                                                                    |\n| Change key prefix                                        | ✅          | Renames keys via RENAME command                                                                                   |\n| Rename the index                                         | ✅          | Index-only                                                                                                        |\n| Make a field sortable                                    | ✅          |                                         
                                                                          |\n| Change field options (separator, stemming)               | ✅          |                                                                                                                   |\n| Change vector algorithm (FLAT ↔ HNSW ↔ SVS-VAMANA)       | ✅          | Index-only                                                                                                        |\n| Change distance metric (COSINE ↔ L2 ↔ IP)                | ✅          | Index-only                                                                                                        |\n| Tune HNSW parameters (M, EF_CONSTRUCTION)                | ✅          | Index-only                                                                                                        |\n| Quantize vectors (float32 → float16/bfloat16/int8/uint8) | ✅          | Auto re-encode; unsupported when the same Redis keys are indexed by another live index expecting the old datatype |\n\n## What’s Blocked\n\n| Change                            | Why                           | Workaround                           |\n|-----------------------------------|-------------------------------|--------------------------------------|\n| Change vector dimensions          | Requires re-embedding         | Re-embed with new model, reload data |\n| Change storage type (hash ↔ JSON) | Different data format         | Export, transform, reload            |\n| Add a new vector field            | Requires vectors for all docs | Add vectors first, then migrate      |\n\n## CLI Reference\n\n### Single-Index Commands\n\n| Command                | Description                                   |\n|------------------------|-----------------------------------------------|\n| `rvl migrate wizard`   | Build a migration interactively               |\n| `rvl migrate plan`     | Generate a migration plan                     |\n| `rvl migrate apply`    | Execute a 
migration                           |\n| `rvl migrate estimate` | Estimate disk space for a migration (dry-run) |\n| `rvl migrate validate` | Verify a migration result                     |\n\n### Batch Commands\n\n| Command                    | Description                   |\n|----------------------------|-------------------------------|\n| `rvl migrate batch-plan`   | Create a batch migration plan |\n| `rvl migrate batch-apply`  | Execute a batch migration     |\n| `rvl migrate batch-resume` | Resume an interrupted batch   |\n| `rvl migrate batch-status` | Check batch progress          |\n\n**Common flags:**\n\n- `--url` : Redis connection URL\n- `--index` : Index name to migrate\n- `--plan` / `--plan-out` : Path to migration plan\n- `--async` : Use async executor for large migrations (apply only)\n- `--report-out` : Path for validation report\n- `--benchmark-out` : Path for performance metrics\n\n**Apply flags (quantization \u0026 reliability):**\n\n- `--backup-dir \u003cdir\u003e` : Required migration backup directory. Hash vector datatype changes write vector backup files there for resume and rollback; index-only and JSON migrations validate and record the directory without writing vector backup files.\n- `--batch-size \u003cN\u003e` : Keys per pipeline batch (default 500). Values 200 to 1000 are typical.\n- `--workers \u003cN\u003e` : Parallel quantization workers (default 1). Each worker opens its own Redis connection. 
See the Performance section for guidance.\n\n**Batch-specific flags:**\n\n- `--pattern` : Glob pattern to match index names (e.g., `*_idx`)\n- `--indexes` : Explicit list of index names\n- `--indexes-file` : File containing index names (one per line)\n- `--schema-patch` : Path to shared schema patch YAML\n- `--state` : Path to batch state file for resume\n- `--failure-policy` : `fail_fast` or `continue_on_error`\n- `--accept-data-loss` : Required for quantization (lossy changes)\n- `--retry-failed` : Retry previously failed indexes on resume\n\n## Troubleshooting\n\n### Migration blocked: \"unsupported change\"\n\nThe planner detected a change that requires data transformation. Check `diff_classification.blocked_reasons` in the plan for details.\n\n### Apply failed: \"source schema mismatch\"\n\nThe live index schema changed since the plan was generated. Re-run `rvl migrate plan` to create a fresh plan.\n\n### Apply failed: \"timeout waiting for index ready\"\n\nThe index is taking longer to rebuild than expected. This can happen with large datasets. Check Redis logs and consider increasing the timeout or running during lower traffic periods.\n\n### Validation failed: \"document count mismatch\"\n\nDocuments were added or removed between plan and apply. This is expected if your application is actively writing. Re-run `plan` and `apply` during a quieter period when the document count is stable, or verify the mismatch is due only to normal application traffic.\n\n### Quantized documents disappeared from another index\n\nThis topology is unsupported. Hash vector quantization rewrites vector\nbytes in the Redis document key. If another live RediSearch index also\ncovers that key and still expects the old vector datatype, Redis may drop\nthat document from the other index or report indexing failures for it.\n\nRecover by rolling back the vector bytes from the migration backup, then\nrecreate any affected index schemas. 
To perform the change safely, use an\napplication-level migration that writes new physical keys or new vector\nfields and coordinates all affected indexes before switching traffic.\n\n### batch-plan failed: \"overlapping indexes detected\"\n\n`batch-plan` refuses to write a plan when two or more applicable indexes\nshare a key prefix (one prefix is a literal string-prefix of the other,\nmatching `FT.CREATE PREFIX` semantics). Running such a batch would\ndouble-quantize the shared keys and corrupt vector data. The error lists\neach conflicting index pair under a `Conflicts:` section:\n\n```default\nError: Refusing to create batch plan: overlapping indexes detected.\n\nMultiple indexes in the batch share Redis key prefixes. Running a\nbatch migration over overlapping indexes can mutate the same keys\nmore than once (e.g., double-quantization of vectors), corrupting\nthe underlying data.\n\nConflicts:\n  - products_main \u003c-\u003e products_premium: 'product:' \u003c-\u003e 'product:premium:'\n\nResolve by migrating overlapping indexes one at a time, or by\nnarrowing the batch to a set of indexes with disjoint prefixes.\n```\n\nSplit the selected indexes into prefix-disjoint groups (for example,\n`prod_*` separately from `staging_*`) and run `batch-plan` once per group.\nIndexes that are skipped for other reasons (e.g. `applicable: false`\nbecause a field is missing) do not participate in this check.\n\n### How to recover from a failed migration\n\nIf `apply` fails mid-migration:\n\n1. **Check if the index exists:** `rvl index info --index myindex`\n2. **If the index exists but is wrong:** Re-run `apply` with the same plan\n3. **If the index was dropped:** Recreate it from the plan’s `merged_target_schema`\n\nThe underlying documents are never deleted by `drop_recreate`.\n\n## Backup, Resume \u0026 Rollback\n\n### How Backups Work\n\n`--backup-dir` / `backup_dir` is required for all migrations. 
If it is omitted\nor empty, the executor raises `ValueError` before any migration starts.\nMigration reports include the resolved backup directory and backup file\nprefixes. Batch checkpoint state also stores the backup directory used by the\nrun, and resume refuses a different directory for the same checkpoint.\n\nFor hash vector datatype changes, the migration executor saves **original\nvector bytes** to disk before mutating them. This enables two key capabilities:\n\n1. **Crash-safe resume**: if the process dies mid-migration, re-running the\n   same command with the same `--backup-dir` automatically detects partial\n   progress and resumes after the last completed batch.\n2. **Manual rollback**: the backup files contain the original (pre-quantization)\n   vector values, which can be restored to undo a migration.\n\nFor index-only migrations and JSON datatype changes, the directory is still\nvalidated and recorded, but no `.header` or `.data` vector backup files are\nwritten.\n\nBackup files are written to the specified directory with this layout:\n\n```default\n\u003cbackup-dir\u003e/\n  migration_backup_\u003cindex_name\u003e.header   # JSON: phase, progress counters, field metadata\n  migration_backup_\u003cindex_name\u003e.data     # Binary: length-prefixed batches of original vectors\n  migration_backup_\u003cindex_name\u003e.manifest # JSON: multi-worker shard resume metadata, when workers \u003e 1\n```\n\n**Disk usage:** approximately `num_docs × dims × bytes_per_element`.\nFor example, 1M docs with 768-dim float32 vectors (4 bytes per element) need\n1,000,000 × 768 × 4 ≈ 3.07 × 10^9 bytes, or about 2.9 GiB.\n\nBackup files are **always retained** on disk after a successful migration\nso they remain available for post-migration auditing and rollback. 
Delete\nthe files manually from the backup directory once you no longer need a\nrecovery path.\n\n### Crash-Safe Resume\n\nIf a migration is interrupted (crash, network error, Ctrl+C), simply re-run\nthe exact same command:\n\n```bash\n# Original command that was interrupted\nrvl migrate apply --plan plan.yaml --url redis://localhost:6379 \\\n  --backup-dir /tmp/backups --workers 4\n\n# Just re-run it. Progress is resumed automatically\nrvl migrate apply --plan plan.yaml --url redis://localhost:6379 \\\n  --backup-dir /tmp/backups --workers 4\n```\n\nThe executor detects the existing backup header, reads how many batches were\ncompleted, and resumes from the next unfinished batch. No data is duplicated\nor lost. If a retained completed backup is found after rollback, the executor\ndoes not skip the migration unless the live index already matches the target\nschema; it treats the completed backup as stale and starts a fresh backup.\n\n\n**Single-worker vs multi-worker resume:** In single-worker mode, the full\nbackup is written *before* the index is dropped, so a crash at any point\nleaves a complete backup on disk. In multi-worker mode, dump and quantize\nare fused (each worker reads, backs up, and converts its shard in one pass\n*after* the index drop). A crash during this fused phase may leave partial\nbackup shards. Re-running detects and resumes from partial state.\n\n\n### Rollback\n\nIf you need to undo a quantization migration and restore original vectors,\nuse the `rollback` command:\n\n```bash\nrvl migrate rollback --backup-dir /tmp/backups --url redis://localhost:6379\n```\n\nThis reads every batch from the backup files and pipeline-HSETs the original\n(pre-quantization) vector bytes back into Redis. 
After rollback completes:\n\n- Your vector data is restored to its original datatype\n- You will need to **manually recreate the original index schema** if the\n  index was changed during migration (the rollback command restores data\n  only, not the index definition)\n\n```bash\n# After rollback, recreate the original index if needed:\nrvl index create --schema original_schema.yaml --url redis://localhost:6379\n```\n\n\nRollback requires that the backup directory still contains the original\nbackup files. Backups are retained automatically after migration; do not\ndelete the directory until you are certain rollback is no longer needed.\n\n\n### Python API for Rollback\n\n```python\nfrom redisvl.migration.backup import VectorBackup\nimport redis\n\nr = redis.from_url(\"redis://localhost:6379\")\nbackup = VectorBackup.load(\"/tmp/backups/migration_backup_myindex\")\n\nfor keys, originals in backup.iter_batches():\n    pipe = r.pipeline(transaction=False)\n    for key in keys:\n        if key in originals:\n            for field_name, original_bytes in originals[key].items():\n                pipe.hset(key, field_name, original_bytes)\n    pipe.execute()\n\nprint(\"Rollback complete\")\n```\n\n## Python API\n\nFor programmatic migrations, use the migration classes directly:\n\n### Sync API\n\n```python\nfrom redisvl.migration import MigrationPlanner, MigrationExecutor\n\nplanner = MigrationPlanner()\nplan = planner.create_plan(\n    \"myindex\",\n    redis_url=\"redis://localhost:6379\",\n    schema_patch_path=\"schema_patch.yaml\",\n)\n\nexecutor = MigrationExecutor()\nreport = executor.apply(\n    plan,\n    redis_url=\"redis://localhost:6379\",\n    backup_dir=\"/tmp/migration_backups\",\n)\nprint(f\"Migration result: {report.result}\")\n```\n\nWith backup and multi-worker quantization:\n\n```python\nreport = executor.apply(\n    plan,\n    redis_url=\"redis://localhost:6379\",\n    backup_dir=\"/tmp/migration_backups\",   # enables crash-safe resume\n    
batch_size=500,                        # keys per pipeline batch\n    num_workers=4,                         # parallel quantization workers\n)\nprint(f\"Quantized in {report.timings.quantize_duration_seconds}s\")\n```\n\n### Async API\n\n```python\nimport asyncio\nfrom redisvl.migration import AsyncMigrationPlanner, AsyncMigrationExecutor\n\nasync def migrate():\n    planner = AsyncMigrationPlanner()\n    plan = await planner.create_plan(\n        \"myindex\",\n        redis_url=\"redis://localhost:6379\",\n        schema_patch_path=\"schema_patch.yaml\",\n    )\n\n    executor = AsyncMigrationExecutor()\n    report = await executor.apply(\n        plan,\n        redis_url=\"redis://localhost:6379\",\n        backup_dir=\"/tmp/migration_backups\",\n        num_workers=4,\n    )\n    print(f\"Migration result: {report.result}\")\n\nasyncio.run(migrate())\n```\n\n## Batch Migration\n\nWhen you need to apply the same schema change to multiple indexes, use batch migration. This is common for:\n\n- Quantizing all indexes from float32 → float16\n- Standardizing vector algorithms across indexes\n- Coordinated migrations during maintenance windows\n\n### Quick Start: Batch Migration\n\n```bash\n# 1. Create a shared patch (applies to any index with an 'embedding' field)\ncat \u003e quantize_patch.yaml \u003c\u003c 'EOF'\nversion: 1\nchanges:\n  update_fields:\n    - name: embedding\n      attrs:\n        datatype: float16\nEOF\n\n# 2. Create a batch plan for all indexes matching a pattern\nrvl migrate batch-plan \\\n  --pattern \"*_idx\" \\\n  --schema-patch quantize_patch.yaml \\\n  --plan-out batch_plan.yaml \\\n  --url redis://localhost:6379\n\n# 3. Apply the batch plan\nrvl migrate batch-apply \\\n  --plan batch_plan.yaml \\\n  --backup-dir ./migration_backups \\\n  --accept-data-loss \\\n  --url redis://localhost:6379\n\n# 4. 
Check status\nrvl migrate batch-status --state batch_state.yaml\n```\n\n### Batch Plan Options\n\n**Select indexes by pattern:**\n\n```bash\nrvl migrate batch-plan \\\n  --pattern \"*_idx\" \\\n  --schema-patch quantize_patch.yaml \\\n  --plan-out batch_plan.yaml \\\n  --url redis://localhost:6379\n```\n\n**Select indexes by explicit list:**\n\n```bash\nrvl migrate batch-plan \\\n  --indexes \"products_idx,users_idx,orders_idx\" \\\n  --schema-patch quantize_patch.yaml \\\n  --plan-out batch_plan.yaml \\\n  --url redis://localhost:6379\n```\n\n**Select indexes from a file (for 100+ indexes):**\n\n```bash\n# Create index list file\necho -e \"products_idx\\nusers_idx\\norders_idx\" \u003e indexes.txt\n\nrvl migrate batch-plan \\\n  --indexes-file indexes.txt \\\n  --schema-patch quantize_patch.yaml \\\n  --plan-out batch_plan.yaml \\\n  --url redis://localhost:6379\n```\n\n### Batch Plan Review\n\nThe generated `batch_plan.yaml` shows which indexes will be migrated:\n\n```yaml\nversion: 1\nbatch_id: \"batch_20260320_100000\"\nmode: drop_recreate\nfailure_policy: fail_fast\nrequires_quantization: true\n\nshared_patch:\n  version: 1\n  changes:\n    update_fields:\n      - name: embedding\n        attrs:\n          datatype: float16\n\nindexes:\n  - name: products_idx\n    applicable: true\n    skip_reason: null\n  - name: users_idx\n    applicable: true\n    skip_reason: null\n  - name: legacy_idx\n    applicable: false\n    skip_reason: \"Field 'embedding' not found\"\n\ncreated_at: \"2026-03-20T10:00:00Z\"\n```\n\n**Key fields:**\n\n- `applicable: true` means the patch applies to this index\n- `skip_reason` explains why an index will be skipped\n\n**Overlap check.** `batch-plan` refuses to write a plan when two applicable\nindexes have key prefixes that overlap — i.e. one prefix is a literal\nstring-prefix of the other, matching `FT.CREATE PREFIX` semantics. 
Migrating\noverlapping indexes in a single batch can corrupt vector data because every\nindex after the first reads bytes that an earlier index has already\nquantized. Split the indexes into prefix-disjoint groups and create a batch\nplan per group. See the \"overlapping indexes detected\" entry under\nTroubleshooting for the exact error message.\n\n### Applying a Batch Plan\n\n```bash\n# Apply with fail-fast (default: stop on first error)\nrvl migrate batch-apply \\\n  --plan batch_plan.yaml \\\n  --backup-dir ./migration_backups \\\n  --accept-data-loss \\\n  --url redis://localhost:6379\n\n# Apply with continue-on-error (set at batch-plan time)\n# Note: failure_policy is set during batch-plan, not batch-apply\nrvl migrate batch-plan \\\n  --pattern \"*_idx\" \\\n  --schema-patch quantize_patch.yaml \\\n  --failure-policy continue_on_error \\\n  --plan-out batch_plan.yaml \\\n  --url redis://localhost:6379\n\nrvl migrate batch-apply \\\n  --plan batch_plan.yaml \\\n  --backup-dir ./migration_backups \\\n  --accept-data-loss \\\n  --url redis://localhost:6379\n```\n\n**Flags for batch-apply:**\n\n- `--accept-data-loss` : Required when quantizing vectors (float32 → float16 is lossy)\n- `--backup-dir` : Required directory for per-index backup metadata and vector backup files when hash vector bytes are mutated\n- `--state` : Path to batch state file (default: `batch_state.yaml`)\n- `--report-dir` : Directory for per-index reports (default: `./reports/`)\n\n**Note:** `--failure-policy` is set during `batch-plan`, not `batch-apply`. The policy is stored in the batch plan file.\n\n### Resume After Failure\n\nBatch migration automatically tracks progress in the state file. 
If interrupted:\n\n```bash\n# Resume from where it left off\nrvl migrate batch-resume \\\n  --state batch_state.yaml \\\n  --accept-data-loss \\\n  --url redis://localhost:6379\n\n# Retry previously failed indexes\nrvl migrate batch-resume \\\n  --state batch_state.yaml \\\n  --retry-failed \\\n  --accept-data-loss \\\n  --url redis://localhost:6379\n```\n\n`batch-resume` uses the `backup_dir` stored in `batch_state.yaml` unless you\npass `--backup-dir` explicitly. If you pass a different directory for the same\ncheckpoint, resume is rejected.\n\n**Note:** If the batch plan involves quantization (e.g., `float32` → `float16`), you must pass `--accept-data-loss` to `batch-resume`, just as with `batch-apply`. Omit `--accept-data-loss` if the batch plan does not involve quantization.\n\n### Checking Batch Status\n\n```bash\nrvl migrate batch-status --state batch_state.yaml\n```\n\n**Example output:**\n\n```default\nBatch Migration Status\n======================\nBatch ID: batch_20260320_100000\nStarted: 2026-03-20T10:00:00Z\nUpdated: 2026-03-20T10:25:00Z\n\nCompleted: 2\n  - products_idx: success (10:02:30)\n  - users_idx: failed - Redis connection timeout (10:05:45)\n\nIn Progress: inventory_idx\nRemaining: 1 (analytics_idx)\n```\n\n### Batch Report\n\nAfter completion, a `batch_report.yaml` is generated:\n\n```yaml\nversion: 1\nbatch_id: \"batch_20260320_100000\"\nstatus: completed  # or partial_failure, failed\nsummary:\n  total_indexes: 3\n  successful: 3\n  failed: 0\n  skipped: 0\n  total_duration_seconds: 127.5\nindexes:\n  - name: products_idx\n    status: success\n    report_path: ./reports/products_idx_report.yaml\n  - name: users_idx\n    status: success\n    report_path: ./reports/users_idx_report.yaml\n  - name: orders_idx\n    status: success\n    report_path: ./reports/orders_idx_report.yaml\ncompleted_at: \"2026-03-20T10:02:07Z\"\n```\n\n### Python API for Batch Migration\n\n```python\nfrom redisvl.migration import BatchMigrationPlanner, 
BatchMigrationExecutor\n\n# Create batch plan\nplanner = BatchMigrationPlanner()\nbatch_plan = planner.create_batch_plan(\n    redis_url=\"redis://localhost:6379\",\n    pattern=\"*_idx\",\n    schema_patch_path=\"quantize_patch.yaml\",\n)\n\n# Review applicability\nfor idx in batch_plan.indexes:\n    if idx.applicable:\n        print(f\"Will migrate: {idx.name}\")\n    else:\n        print(f\"Skipping {idx.name}: {idx.skip_reason}\")\n\n# Execute batch\nexecutor = BatchMigrationExecutor()\nreport = executor.apply(\n    batch_plan,\n    redis_url=\"redis://localhost:6379\",\n    state_path=\"batch_state.yaml\",\n    report_dir=\"./reports/\",\n    backup_dir=\"/tmp/migration_backups\",\n    progress_callback=lambda name, pos, total, status: print(f\"[{pos}/{total}] {name}: {status}\"),\n)\n\nprint(f\"Batch status: {report.status}\")\nprint(f\"Successful: {report.summary.successful}/{report.summary.total_indexes}\")\n```\n\n### Batch Migration Tips\n\n1. **Test on a single index first**: Run a single-index migration to verify the patch works before applying to a batch.\n2. **Use `continue_on_error` for large batches**: This ensures one failure doesn’t block all remaining indexes.\n3. **Schedule during low-traffic periods**: Each index has downtime during migration.\n4. **Review skipped indexes**: The `skip_reason` often indicates schema differences that need attention.\n5. **Keep state files**: The `batch_state.yaml` is essential for resume. Don’t delete it until the batch completes successfully.\n\n## Performance Tuning\n\n### Batch Size\n\nThe `--batch-size` flag controls how many keys are read/written per Redis\npipeline round-trip. The default of 500 is a good balance. Larger batches\n(1000+) reduce round-trips but increase per-batch memory and latency.\n\n### Backup Disk Space\n\nFor quantization migrations, original vectors are saved to `--backup-dir`\nbefore mutation. 
Approximate size: `num_docs × dims × bytes_per_element`.\n\n| Docs   |   Dims | Source dtype   | Backup size   |\n|--------|--------|----------------|---------------|\n| 100K   |    768 | float32        | ~293 MiB      |\n| 1M     |    768 | float32        | ~2.9 GiB      |\n| 1M     |   1536 | float32        | ~5.7 GiB      |\n\n### HNSW vs FLAT Index Capacity\n\n\nWhen migrating from **HNSW** to **FLAT**, the target index may report a\n*higher* document count than the source. This is not a bug; it reflects\na fundamental difference in how the two algorithms store vectors.\n\n\nHNSW maintains a navigable small-world graph with per-node neighbor lists.\nThis graph overhead limits how many vectors can fit in available memory.\nFLAT stores vectors as a simple array with no graph overhead.\n\nIf the source HNSW index was operating near its memory capacity, some\ndocuments may have been registered in Redis Search’s document table but\nnot fully indexed into the HNSW graph. After migration to FLAT, those\nsame documents become fully searchable because FLAT requires less memory\nper vector.\n\nThe migration validator compares the total key count\n(`num_docs + hash_indexing_failures`) between source and target, so this\nscenario is handled correctly in the general case.\n\n## Learn more\n\n- [Index Migrations](https://redis.io/docs/latest/../../concepts/index-migrations): How migrations work and which changes are supported\n",
  "tags": [],
  "last_updated": "2026-06-11T16:10:09-04:00"
}