{
  "id": "snowflake",
  "title": "Prepare Snowflake for RDI",
  "url": "https://redis.io/docs/latest/integrate/redis-data-integration/data-pipelines/prepare-dbs/snowflake/",
  "summary": "Prepare Snowflake databases to work with RDI",
  "tags": [
    "docs",
    "integrate",
    "rs",
    "rdi"
  ],
  "last_updated": "2026-05-12T09:07:59-04:00",
  "page_type": "content",
  "content_hash": "24884b2c458023686c3d4bea7c66d2af979e3b7d86143c0c6a2621f09b0a0010",
  "sections": [
    {
      "id": "overview",
      "title": "Overview",
      "role": "overview",
      "text": "This guide describes the steps required to prepare a Snowflake database as a source for Redis Data Integration (RDI) pipelines.\n\nDuring both the [snapshot](https://redis.io/docs/latest/integrate/redis-data-integration/data-pipelines#pipeline-lifecycle) and\n[change data capture (CDC)](https://redis.io/docs/latest/integrate/redis-data-integration/data-pipelines#pipeline-lifecycle)\nphases, RDI uses [Snowflake Streams](https://docs.snowflake.com/en/user-guide/streams) to read data from the monitored\ntables. For the initial snapshot, RDI creates the stream with `SHOW_INITIAL_ROWS = TRUE` so it can read the current\ntable contents before continuing with ongoing CDC. RDI automatically creates and manages the required streams."
    },
    {
      "id": "setup",
      "title": "Setup",
      "role": "setup",
      "text": "The following checklist shows the steps to prepare a Snowflake database for RDI,\nwith links to the sections that explain the steps in full detail.\nYou may find it helpful to track your progress with the checklist as you\ncomplete each step.\n\n\nSnowflake is only supported with RDI deployed on Kubernetes/Helm. RDI VM mode does not support Snowflake as a source database.\n\n\n[code example]"
    },
    {
      "id": "1-set-up-snowflake-permissions",
      "title": "1. Set up Snowflake permissions",
      "role": "content",
      "text": "The following are the minimum runtime permissions for the RDI role to read the source tables and create the Snowflake\nobjects RDI uses for CDC:\n\n- `USAGE`, `OPERATE` on the warehouse used for RDI reads\n- `USAGE` on the source database and source schema\n- `SELECT` on the source tables\n- `USAGE` on the CDC schema used by RDI\n- `CREATE STREAM`, `CREATE TABLE` on the CDC schema used by RDI\n\nIf you configure `cdcDatabase` and `cdcSchema`, grant the CDC permissions there. Otherwise, grant them in the source\nschema. If your Snowflake setup requires it, also grant any additional cross-database privileges needed for the CDC\nschema to reference the source tables.\n\n\nRDI manages the Snowflake streams it uses for snapshot and CDC. The collector creates the stream in the configured CDC\nschema and later issues `CREATE OR REPLACE STREAM` statements to keep the stream aligned with the expected offset, so\nthe RDI role must be able to create and own those stream objects in the CDC schema.\n\nThere is one stricter bootstrap requirement for the first stream created on a source table: if Snowflake change\ntracking is not already enabled on that table, only the table owner can create that initial stream. If the source\ntables are not owned by the RDI role, ask a Snowflake administrator or table owner to enable change tracking first:\n\n[code example]\n\n\nGrant the required permissions to your RDI user:\n\n[code example]\n\nIf you use centralized grant management, you can also add future grants in the CDC schema so newly created tables and\nstreams automatically receive the desired privileges. These grants are optional and are not part of the minimum runtime\npermissions:\n\n[code example]"
    },
    {
      "id": "2-configure-authentication",
      "title": "2. Configure authentication",
      "role": "content",
      "text": "RDI supports two authentication methods for Snowflake: password authentication and private key (key-pair) authentication. You must configure one of them."
    },
    {
      "id": "password-authentication",
      "title": "Password authentication",
      "role": "content",
      "text": "Use standard username and password credentials. Store them securely using Kubernetes secrets (see [step 3](#3-set-up-secrets-for-kubernetes-deployment)).\n\n\nMany Snowflake accounts require MFA for password-based sign-ins. If you want to use password authentication for RDI,\nconfigure the Snowflake user as a service user that is allowed to authenticate non-interactively. Otherwise, use\nprivate key authentication instead. For more information, see the Snowflake\n[MFA rollout documentation](https://docs.snowflake.com/en/user-guide/security-mfa-rollout)."
    },
    {
      "id": "private-key-authentication",
      "title": "Private key authentication",
      "role": "content",
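      "text": "For enhanced security, use key-pair authentication:\n\n1. Generate an unencrypted private key in PKCS#8 format:\n\n    [code example]\n\n1. Generate the matching public key from the private key:\n\n    [code example]\n\n1. Register the public key with your Snowflake user, omitting the `-----BEGIN PUBLIC KEY-----` and `-----END PUBLIC KEY-----` delimiter lines from the key content:\n\n    [code example]"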
    },
    {
      "id": "3-set-up-secrets-for-kubernetes-deployment",
      "title": "3. Set up secrets for Kubernetes deployment",
      "role": "content",
      "text": "Before deploying the RDI pipeline, configure the necessary secrets."
    },
    {
      "id": "password-authentication",
      "title": "Password authentication",
      "role": "content",
      "text": "[code example]"
    },
    {
      "id": "private-key-authentication",
      "title": "Private key authentication",
      "role": "content",
      "text": "Create a secret with the private key file:\n\n[code example]\n\nAlso create the source-db secret with the username:\n\n[code example]"
    },
    {
      "id": "4-configure-rdi-for-snowflake",
      "title": "4. Configure RDI for Snowflake",
      "role": "content",
      "text": "Use the following example configuration in your `config.yaml` file:\n\n[code example]\n\n\nSnowflake uses one configured `database` and one or more source-level `schemas`. In the `tables` section, specify each\ntable as `SCHEMA.table`. Even when you configure only one schema, explicit `SCHEMA.table` names are recommended for\nclarity."
    },
    {
      "id": "snowflake-connection-properties",
      "title": "Snowflake connection properties",
      "role": "content",
      "text": "| Property      | Type   | Required | Description                                                    |\n|---------------|--------|----------|----------------------------------------------------------------|\n| `type`        | string | Yes      | Must be `\"snowflake\"`                                          |\n| `url`         | string | Yes      | JDBC URL: `jdbc:snowflake://<account>.snowflakecomputing.com/` |\n| `user`        | string | Yes      | Snowflake user                                                 |\n| `password`    | string | No*      | Snowflake password                                             |\n| `database`    | string | Yes      | Snowflake database name                                        |\n| `warehouse`   | string | Yes      | Snowflake warehouse name                                       |\n| `role`        | string | No       | Snowflake role name                                            |\n| `cdcDatabase` | string | No       | Database for CDC streams (if different from source)            |\n| `cdcSchema`   | string | No       | Schema for CDC streams (if different from source)              |\n\n* Either `password` or private key authentication is required. See [Configure authentication](#2-configure-authentication) for details."
    },
    {
      "id": "snowflake-source-properties",
      "title": "Snowflake source properties",
      "role": "content",
      "text": "| Property   | Type   | Required | Description                                                      |\n|------------|--------|----------|------------------------------------------------------------------|\n| `schemas`  | array  | Yes      | Schema names to capture from                                     |\n| `tables`   | object | Yes      | Tables to capture, keyed as `SCHEMA.table`                       |"
    },
    {
      "id": "advanced-configuration-options",
      "title": "Advanced configuration options",
      "role": "content",
      "text": "Configure under `sources.<name>.advanced.riotx`:\n\n| Property       | Type    | Default     | Description                                  |\n|----------------|---------|-------------|----------------------------------------------|\n| `poll`         | string  | `\"30s\"`     | Polling interval for stream changes          |\n| `snapshot`     | string  | `\"INITIAL\"` | Snapshot mode: `INITIAL` or `NEVER`          |\n| `streamPrefix` | string  | `\"data:\"`   | Prefix for the Redis stream written by RDI   |\n| `streamLimit`  | integer | -           | Maximum stream length (XTRIM MAXLEN)         |\n| `keyColumns`   | array   | -           | Stable source columns to use as message keys |\n| `clearOffset`  | boolean | `false`     | Clear existing offset on start               |\n| `count`        | integer | `0`         | Limit records per poll (0 = unlimited)       |\n\nFor reliable update and delete handling, define `keyColumns` with a stable business key or surrogate key when possible."
    },
    {
      "id": "troubleshooting",
      "title": "Troubleshooting",
      "role": "errors",
      "text": ""
    },
    {
      "id": "connection-issues",
      "title": "Connection issues",
      "role": "content",
      "text": "**Error: \"Failed to connect to Snowflake\"**\n\n- Verify the account URL is correct (format: `<account>.snowflakecomputing.com`)\n- Check network connectivity to Snowflake\n- Verify the warehouse is running and accessible\n- Check that firewall rules allow outbound HTTPS (port 443)\n\n**Error: \"Authentication failed\"**\n\n- For password auth: verify username and password are correct\n- For key-pair auth: verify the private key matches the public key registered in Snowflake\n- Ensure the user has appropriate permissions\n\n**Error: \"Warehouse not found\"**\n\n- Verify the warehouse name is correct\n- Ensure the RDI role has `USAGE` permission on the warehouse"
    },
    {
      "id": "cdc-issues",
      "title": "CDC issues",
      "role": "content",
      "text": "**No data appearing in Redis**\n\n1. Verify Snowflake Streams exist in the CDC schema:\n\n    [code example]\n\n1. Check the polling interval (`poll`) configuration\n1. Verify the Redis connection is working\n1. Check the collector logs:\n\n    [code example]\n\n**Stale or missing changes**\n\n- Snowflake Streams depend on Snowflake change tracking and retention settings\n- If the collector was offline longer than the available retention window, changes may be lost\n- Consider using `clearOffset: true` to restart from the current state"
    },
    {
      "id": "performance-tuning",
      "title": "Performance tuning",
      "role": "performance",
      "text": "**High Snowflake warehouse usage**\n\n- Increase the `poll` interval (for example, `\"60s\"` or `\"120s\"`)\n- Use a dedicated warehouse for CDC operations\n- Each poll first calls Snowflake's `SYSTEM$STREAM_HAS_DATA` function to check whether the stream has new data. This\n  check does not start the warehouse; warehouse compute starts only when RDI reads rows from the stream.\n\n**Redis memory concerns**\n\n- Set `streamLimit` to cap stream length\n- Use `count` to limit records per poll batch\n\n**Initial snapshot too slow**\n\n- Use `snapshot: \"NEVER\"` to skip the initial snapshot\n- Pre-load data using other methods if needed"
    },
    {
      "id": "enable-debug-logging",
      "title": "Enable debug logging",
      "role": "content",
      "text": "Enable debug logging in the source configuration:\n\n[code example]\n\nView collector logs:\n\n[code example]"
    },
    {
      "id": "5-configuration-is-complete",
      "title": "5. Configuration is complete",
      "role": "content",
      "text": "Once you have followed the steps above, your Snowflake database is ready for RDI to use."
    },
    {
      "id": "see-also",
      "title": "See also",
      "role": "related",
      "text": "- [Snowflake Streams Documentation](https://docs.snowflake.com/en/user-guide/streams)\n- [Snowflake Key Pair Authentication](https://docs.snowflake.com/en/user-guide/key-pair-auth)\n- [Snowflake MFA rollout documentation](https://docs.snowflake.com/en/user-guide/security-mfa-rollout)\n- [RDI Deployment Guide](https://redis.io/docs/latest/integrate/redis-data-integration/data-pipelines/deploy)"
    }
  ],
  "examples": [
    {
      "id": "setup-ex0",
      "language": "checklist {id=\"snowflakelist\"}",
      "code": "- [ ] [Set up Snowflake permissions](#1-set-up-snowflake-permissions)\n- [ ] [Configure authentication](#2-configure-authentication)\n- [ ] [Set up secrets for Kubernetes deployment](#3-set-up-secrets-for-kubernetes-deployment)\n- [ ] [Configure RDI for Snowflake](#4-configure-rdi-for-snowflake)",
      "section_id": "setup"
    },
    {
      "id": "1-set-up-snowflake-permissions-ex0",
      "language": "sql",
      "code": "-- Run as the table owner or a Snowflake administrator\nALTER TABLE MYDB.PUBLIC.customers SET CHANGE_TRACKING = TRUE;\nALTER TABLE MYDB.PUBLIC.orders SET CHANGE_TRACKING = TRUE;",
      "section_id": "1-set-up-snowflake-permissions"
    },
    {
      "id": "1-set-up-snowflake-permissions-ex1",
      "language": "sql",
      "code": "-- Grant usage and operate privileges on the warehouse\nGRANT USAGE, OPERATE ON WAREHOUSE COMPUTE_WH TO ROLE rdi_role;\n\n-- Grant usage on the source database and schema\nGRANT USAGE ON DATABASE MYDB TO ROLE rdi_role;\nGRANT USAGE ON SCHEMA MYDB.PUBLIC TO ROLE rdi_role;\n\n-- Grant SELECT on tables to capture\nGRANT SELECT ON TABLE MYDB.PUBLIC.customers TO ROLE rdi_role;\nGRANT SELECT ON TABLE MYDB.PUBLIC.orders TO ROLE rdi_role;\n\n-- Grant permissions on the schema RDI uses for CDC objects\nGRANT USAGE ON SCHEMA MYDB.RDI_CDC TO ROLE rdi_role;\nGRANT CREATE STREAM, CREATE TABLE ON SCHEMA MYDB.RDI_CDC TO ROLE rdi_role;\n\n-- Assign the role to your RDI user\nGRANT ROLE rdi_role TO USER rdi_user;",
      "section_id": "1-set-up-snowflake-permissions"
    },
    {
      "id": "1-set-up-snowflake-permissions-ex2",
      "language": "sql",
      "code": "GRANT SELECT ON FUTURE TABLES IN SCHEMA MYDB.RDI_CDC TO ROLE rdi_role;\nGRANT SELECT ON FUTURE STREAMS IN SCHEMA MYDB.RDI_CDC TO ROLE rdi_role;",
      "section_id": "1-set-up-snowflake-permissions"
    },
    {
      "id": "private-key-authentication-ex0",
      "language": "bash",
      "code": "openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out rsa_key.p8 -nocrypt",
      "section_id": "private-key-authentication"
    },
    {
      "id": "private-key-authentication-ex1",
      "language": "bash",
      "code": "openssl rsa -in rsa_key.p8 -pubout -out rsa_key.pub",
      "section_id": "private-key-authentication"
    },
    {
      "id": "private-key-authentication-ex2",
      "language": "sql",
      "code": "ALTER USER rdi_user SET RSA_PUBLIC_KEY='<public_key_content>';",
      "section_id": "private-key-authentication"
    },
    {
      "id": "password-authentication-ex0",
      "language": "bash",
      "code": "kubectl create secret generic source-db \\\n  --namespace=rdi \\\n  --from-literal=SOURCE_DB_USERNAME=your_username \\\n  --from-literal=SOURCE_DB_PASSWORD=your_password",
      "section_id": "password-authentication"
    },
    {
      "id": "private-key-authentication-ex0",
      "language": "bash",
      "code": "kubectl create secret generic source-db-ssl \\\n  --namespace=rdi \\\n  --from-file=client.key=/path/to/rsa_key.p8",
      "section_id": "private-key-authentication"
    },
    {
      "id": "private-key-authentication-ex1",
      "language": "bash",
      "code": "kubectl create secret generic source-db \\\n  --namespace=rdi \\\n  --from-literal=SOURCE_DB_USERNAME=your_username",
      "section_id": "private-key-authentication"
    },
    {
      "id": "4-configure-rdi-for-snowflake-ex0",
      "language": "yaml",
      "code": "sources:\n  snowflake:\n    type: riotx\n    connection:\n      type: snowflake\n      url: \"jdbc:snowflake://myaccount.snowflakecomputing.com/\"\n      user: \"${SOURCE_DB_USERNAME}\"\n      password: \"${SOURCE_DB_PASSWORD}\"  # Omit for key-pair auth\n      database: \"MYDB\"\n      warehouse: \"COMPUTE_WH\"\n      # role: \"RDI_ROLE\"                 # Optional: Snowflake role\n      # cdcDatabase: \"CDC_DB\"            # Optional: Separate database for CDC streams\n      # cdcSchema: \"CDC_SCHEMA\"          # Optional: Separate schema for CDC streams\n    schemas:\n      - PUBLIC\n    tables:\n      PUBLIC.customers: {}\n      PUBLIC.orders: {}\n    advanced:\n      riotx:\n        poll: \"30s\"\n        snapshot: \"INITIAL\"              # Or \"NEVER\" to skip initial snapshot\n        # streamPrefix: \"data:\"          # Optional: Redis stream prefix\n        # streamLimit: 100000            # Optional: Max stream length\n        # keyColumns:                    # Recommended: stable key columns\n        #   - \"id\"\n        # clearOffset: false             # Optional: Clear offset on start\n\ntargets:\n  target:\n    connection:\n      type: redis\n      host: ${TARGET_DB_HOST}\n      port: ${TARGET_DB_PORT}\n      user: ${TARGET_DB_USERNAME}\n      password: ${TARGET_DB_PASSWORD}\n\nprocessors:\n  target_data_type: json",
      "section_id": "4-configure-rdi-for-snowflake"
    },
    {
      "id": "cdc-issues-ex0",
      "language": "sql",
      "code": "SHOW STREAMS IN SCHEMA my_cdc_database.my_cdc_schema;",
      "section_id": "cdc-issues"
    },
    {
      "id": "cdc-issues-ex1",
      "language": "bash",
      "code": "kubectl get deployments -n rdi | grep riotx-collector\nkubectl logs -n rdi deployment/<riotx-collector-deployment>",
      "section_id": "cdc-issues"
    },
    {
      "id": "enable-debug-logging-ex0",
      "language": "yaml",
      "code": "sources:\n  snowflake:\n    type: riotx\n    logging:\n      level: debug\n    # ... rest of configuration",
      "section_id": "enable-debug-logging"
    },
    {
      "id": "enable-debug-logging-ex1",
      "language": "bash",
      "code": "kubectl get deployments -n rdi | grep riotx-collector\nkubectl logs -n rdi deployment/<riotx-collector-deployment> -f",
      "section_id": "enable-debug-logging"
    }
  ]
}
