{
  "id": "redis-py",
  "title": "Redis feature store with redis-py",
  "url": "https://redis.io/docs/latest/develop/use-cases/feature-store/redis-py/",
  "summary": "Build a Redis-backed online feature store in Python with redis-py",
  "tags": [
    "docs",
    "develop",
    "stack",
    "oss",
    "rs",
    "rc"
  ],
  "last_updated": "2026-06-04T14:49:57+01:00",
  "children": [],
  "page_type": "content",
  "content_hash": "e52b87acb2383bf11752c435dc38a3cf3738eb9e1e19ac17312be493e3703eb3",
  "sections": [
    {
      "id": "overview",
      "title": "Overview",
      "role": "overview",
      "text": "This guide shows you how to build a small Redis-backed online feature store in\nPython with [`redis-py`](https://redis.io/docs/latest/develop/clients/redis-py). It\nincludes a local web server built with the Python standard library so you can\nbulk-load a batch of users with a key-level TTL, run a streaming worker that\noverwrites real-time features with per-field TTL, retrieve any subset of\nfeatures for one user under 1 ms, and pipeline `HMGET` across a hundred users\nfor batch scoring."
    },
    {
      "id": "overview",
      "title": "Overview",
      "role": "overview",
      "text": "Each entity (here, a user) is one Redis [Hash](https://redis.io/docs/latest/develop/data-types/hashes)\nat a deterministic key — `fs:user:{id}`. The hash holds every feature for that\nentity as one field per feature: batch-materialized aggregates (refreshed once\na day) alongside streaming-updated signals (refreshed every few seconds). One\n[`HMGET`](https://redis.io/docs/latest/commands/hmget) returns whichever subset the model\nneeds in one network round trip.\n\nTwo TTL layers solve the *mixed staleness* problem without an application-side\ncleaner:\n\n* A **key-level** [`EXPIRE`](https://redis.io/docs/latest/commands/expire) aligned with the\n  batch materialization cycle (24 hours in the demo). If the batch refresher\n  fails, the whole entity disappears at the next cycle and inference sees a\n  missing entity — which the model handler can detect and fall back on —\n  rather than silently outdated values.\n* A **per-field** [`HEXPIRE`](https://redis.io/docs/latest/commands/hexpire) (Redis 7.4+) on\n  each streaming feature gives that field its own shorter expiry, independent\n  of the rest of the hash. If the streaming pipeline stops updating a feature,\n  the field self-cleans while the batch fields stay populated.\n\nIn this example, the batch features describe a user's longer-term shape\n(`country_iso`, `risk_segment`, `account_age_days`, `tx_count_7d`,\n`avg_amount_30d`, `chargeback_count_180d`) and are bulk-loaded by\n`build_features.py` — the demo's stand-in for a nightly Spark / Feast\nmaterialization job. The streaming features describe what the user is doing\nright now (`last_login_ts`, `last_device_id`, `tx_count_5m`,\n`failed_logins_15m`, `session_country`) and are written by\n`streaming_worker.py` — the demo's stand-in for a Flink / Kafka Streams job.\nThe inference panel of the demo server reads any subset of those features\nthrough `feature_store.py`'s helper class.\n\nThat gives you:\n\n* A single round trip for retrieval — any subset of features for one entity\n  in one [`HMGET`](https://redis.io/docs/latest/commands/hmget).\n* Sub-millisecond hot path. The Redis-side work is microseconds; in practice\n  the bottleneck is the network round trip plus the model's own feature-prep.\n* Pipelined batch scoring — one round trip for `N` users at once.\n* Independent freshness per feature, expressed as a server-side TTL rather\n  than as application logic.\n* Self-cleanup on pipeline failure: a stalled batch refresher lets entities\n  expire on schedule, and a stalled streaming worker lets each affected field\n  expire on its own timer."
    },
    {
      "id": "how-it-works",
      "title": "How it works",
      "role": "content",
      "text": "There are three paths: a **batch path** that bulk-loads features once per\nmaterialization cycle, a **streaming path** that updates real-time features as\nevents arrive, and an **inference path** that reads features on the request\nside."
    },
    {
      "id": "batch-path-per-materialization-cycle",
      "title": "Batch path (per materialization cycle)",
      "role": "content",
      "text": "1. The batch job calls `synthesize_users(N)` (in production, the equivalent\n   computation lives in an offline pipeline against the warehouse). The result\n   is `{user_id: {field: value, ...}}` for every user in this cycle.\n2. `store.bulk_load(rows, ttl_seconds=86400)` pipelines one\n   [`HSET`](https://redis.io/docs/latest/commands/hset) plus one\n   [`EXPIRE`](https://redis.io/docs/latest/commands/expire) per user into a single\n   round trip. The `HSET` writes every batch field; the `EXPIRE` is what makes\n   the entity disappear if the next batch run fails, so inference reads a\n   missing entity rather than silently outdated values."
    },
    {
      "id": "streaming-path-per-event",
      "title": "Streaming path (per event)",
      "role": "content",
      "text": "When a user does something (login, transaction, page view) the streaming layer\ncomputes whatever real-time signals fall out of that event and calls\n`store.update_streaming(user_id, fields, ttl_seconds=300)`. That pipelines:\n\n1. An [`HSET`](https://redis.io/docs/latest/commands/hset) writing the new field values.\n   Redis is single-threaded per shard, so this is atomic against any\n   concurrent batch write on the same hash — no version columns, no locks.\n2. An [`HEXPIRE`](https://redis.io/docs/latest/commands/hexpire) over exactly the fields\n   that were written, with the streaming TTL. Each streaming field carries\n   its own per-field expiry independent of the rest of the hash. Stop the\n   worker and these fields drop out one by one as their TTLs elapse, while\n   the batch fields remain populated under the longer key-level TTL."
    },
    {
      "id": "inference-path-per-request",
      "title": "Inference path (per request)",
      "role": "content",
      "text": "1. The model server picks the feature subset it needs (the schema is owned by\n   the model, not the store).\n2. It calls `store.get_features(user_id, names)`, which is one\n   [`HMGET`](https://redis.io/docs/latest/commands/hmget). Redis returns the values in\n   the same order as the requested fields, with `None` for any field that\n   doesn't exist (or has expired).\n3. For batch inference, the model server calls\n   `store.batch_get_features(user_ids, names)`, which pipelines one\n   [`HMGET`](https://redis.io/docs/latest/commands/hmget) per user across all `N` users\n   in a single network round trip."
    },
    {
      "id": "the-feature-store-helper",
      "title": "The feature-store helper",
      "role": "content",
      "text": "The `RedisFeatureStore` class wraps the read/write paths\n([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/feature-store/redis-py/feature_store.py)):\n\n[code example]"
    },
    {
      "id": "data-model",
      "title": "Data model",
      "role": "content",
      "text": "Each user is one Redis Hash. Every value is stored as a string — Redis hash\nfields are bytes-on-the-wire, so the helper encodes booleans as `\"true\"` /\n`\"false\"` and numbers as their `str(...)` form. The model server is responsible\nfor parsing back to the right type, the same way it would when reading any\nserialized feature store.\n\n[code example]\n\nThe batch fields sit under the key-level `EXPIRE`. The streaming fields each\ncarry their own [`HEXPIRE`](https://redis.io/docs/latest/commands/hexpire). If the\nstreaming pipeline stops, the streaming fields drop one by one as their\nper-field TTLs elapse; the batch fields stay until the daily key-level\n`EXPIRE` fires (or the next batch cycle re-pins them)."
    },
    {
      "id": "bulk-loading-batch-features",
      "title": "Bulk-loading batch features",
      "role": "content",
      "text": "`bulk_load` pipelines one `HSET` and one `EXPIRE` per user into a single round\ntrip. With 500 users that's 1000 commands in one network call — Redis processes\nthem sequentially on the server side but the client only pays one RTT.\n\n[code example]\n\n`transaction=False` skips the `MULTI/EXEC` wrapper that\n[`pipeline`](https://redis.io/docs/latest/develop/clients/redis-py/transpipe) defaults to\n— commands still queue and ship in one round trip, but they execute as\nindependent commands rather than as one atomic block. That is the right\nchoice here: each user's `HSET` + `EXPIRE` pair is independent of every\nother user's, and an all-or-nothing transaction would block the server for\nthe duration of the batch. It does *not* make the `HSET` + `EXPIRE` pair\natomic — in the extremely unlikely event the server crashes between the two,\nthe entity exists without a key-level TTL until the next batch run re-pins\nit. For an ingestion script that runs end-to-end every cycle this is fine;\nif you need the pair to be inseparable, wrap each user in its own tiny\n`MULTI/EXEC` or a Lua script (see [`EVAL`](https://redis.io/docs/latest/commands/eval) /\n[Eval scripting](https://redis.io/docs/latest/develop/programmability/eval-intro)).\n\nIn production, the equivalent of this script runs as an offline pipeline (a\nSpark or Feast `materialize` job) that reads from the warehouse and writes\ninto Redis. The\n[Feast `RedisOnlineStore`](https://docs.feast.dev/reference/online-stores/redis)\nprovider does exactly this under the hood."
    },
    {
      "id": "streaming-writes-with-per-field-ttl",
      "title": "Streaming writes with per-field TTL",
      "role": "content",
      "text": "`update_streaming` is the linchpin of the mixed-staleness story:\n\n[code example]\n\n[`HEXPIRE`](https://redis.io/docs/latest/commands/hexpire) sets the TTL on *individual*\nhash fields, not on the whole key. The two commands are sent in one round trip\nand Redis executes them in pipeline order: the `HSET` runs first and creates\nor overwrites the fields, then `HEXPIRE` attaches a TTL to each of those same\nfields. `HEXPIRE` returns one status code per field — `1` if the TTL was set,\n`-2` if the field doesn't exist — so the helper raises if any code is anything\nother than `1`. That makes the \"every streaming write renews its TTL\"\ninvariant fail loudly rather than silently leaving a streaming field with no\nexpiry attached.\n\nIf a streaming pipeline stops, the streaming fields drop out one by one as\ntheir per-field TTLs elapse — there is no application-side cleaner involved.\n[`HTTL`](https://redis.io/docs/latest/commands/httl) lets the model side inspect the\nremaining TTL on any field, which is useful both for debugging (\"why is this\nfeature missing?\" → \"it expired three seconds ago\") and as a freshness signal\nin the model itself.\n\n> **HEXPIRE requires Redis 7.4 or later.** `HEXPIRE` and the field-level TTL\n> commands (`HTTL`, `HPERSIST`, `HEXPIREAT`, `HPEXPIRE`, `HPEXPIREAT`,\n> `HPTTL`, `HEXPIRETIME`, `HPEXPIRETIME`) were added in Redis 7.4. On older\n> Redis builds you would have to put streaming features on their own keys\n> (one key per feature, or one key per feature group) and set a key-level\n> `EXPIRE` instead — at the cost of giving up the single-`HMGET` retrieval."
    },
    {
      "id": "inference-reads-with-hmget",
      "title": "Inference reads with HMGET",
      "role": "content",
      "text": "`get_features` is one `HMGET`:\n\n[code example]\n\nThe model knows exactly which features it consumes, so the request path always\ntakes the `HMGET` branch with an explicit field list — that's the\nsub-millisecond path. `HGETALL` is the right call for debugging (which is what\nthe demo's \"Inspect\" panel does) but not for serving: it forces Redis to\nserialize every field, including ones the model doesn't need.\n\nFields that don't exist (because they were never written, or because they\nexpired) come back as `None`. The helper drops them from the result dict so\nthe caller sees only the features that are actually available. A real model\nserver would either treat missing values as a feature (\"this user has no\nstreaming signal yet\") or fall back to a default from the model's training\ndata."
    },
    {
      "id": "batch-scoring-with-pipelined-hmget",
      "title": "Batch scoring with pipelined HMGET",
      "role": "content",
      "text": "For batch inference, the same `HMGET` shape pipelines across users:\n\n[code example]\n\nOne round trip for the whole batch — the demo regularly returns 100 users in\n2-3 ms against a local Redis. On a real network the round trip dominates;\npipelining is what keeps batch scoring practical.\n\nFor very large batches on a clustered deployment, the same shape generalizes\nto one pipeline per shard: bucket the entity IDs by their hash slot\n(`cluster.keyslot(key)`), then issue one pipeline against each shard in\nparallel. `redis-py`'s\n[`RedisCluster` pipeline](https://redis-py.readthedocs.io/en/stable/clustering.html#redis-cluster-pipeline)\nhandles that automatically — the per-user `HMGET` calls are dispatched to the\nright shard transparently."
    },
    {
      "id": "the-streaming-worker",
      "title": "The streaming worker",
      "role": "content",
      "text": "`streaming_worker.py` is the demo's stand-in for whatever Flink, Kafka Streams,\nor bespoke service computes the real-time features\n([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/feature-store/redis-py/streaming_worker.py)).\nIt runs as a daemon thread next to the demo server so the UI can start, pause,\nand resume it; in production this code would live in the streaming layer.\n\nEvery tick the worker picks a few random users, generates a new value for each\nstreaming feature, and calls `store.update_streaming(user_id, fields)`. The\ndemo defaults to 5 users per tick at 1-second intervals — so a 200-user store\nsees roughly half its users refreshed in the first minute, and most after a\nfew minutes. Drop `--seed-users` or raise `--users-per-tick` if you'd rather\nhave every user touched quickly.\n\n[code example]\n\nPausing the worker is what shows off the mixed-staleness behavior: leave it\npaused for longer than `streaming_ttl_seconds` and the streaming fields\ndisappear from every user's hash one by one, while the batch fields remain\nunder the longer key-level `EXPIRE`. The demo's `Pause / resume` button lets\nyou see this happen in real time."
    },
    {
      "id": "the-batch-builder",
      "title": "The batch builder",
      "role": "content",
      "text": "`build_features.py` is the demo's nightly materializer\n([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/feature-store/redis-py/build_features.py)).\nIt generates synthetic feature rows and calls `store.bulk_load` once. The\nsynthesis itself is not the point — in a real deployment the equivalent code\nreads from the offline store (Snowflake, BigQuery, Iceberg) and writes the\nresulting hashes into Redis.\n\n[code example]\n\nYou can run the builder on its own (independently of the demo server) to\npopulate Redis from the command line:\n\n[code example]\n\nThat writes 500 users at `fs:user:*` with a one-hour key-level TTL, which is\nhow a typical operator would pre-seed a feature store from the command line\nwhen debugging."
    },
    {
      "id": "the-interactive-demo",
      "title": "The interactive demo",
      "role": "content",
      "text": "`demo_server.py` runs a `ThreadingHTTPServer` on port 8085. The HTML page lets\nyou:\n\n* **Bulk-load** any number of users (default 200) with a configurable\n  key-level TTL. Drop the TTL to 30 s and watch the entire store expire on\n  schedule — the same thing that happens if a daily refresher fails.\n* See the **store state** at a glance: user count, batch / streaming TTLs,\n  cumulative read/write counters.\n* See the **streaming worker** status (running / paused, ticks completed,\n  writes performed) and **pause or resume** it. Leave it paused for longer\n  than the streaming TTL to watch streaming fields drop out.\n* Run an **inference read** for any user with a chosen feature subset, and\n  see the value, the per-field TTL, and the read latency.\n* Run **batch scoring** with a pipelined `HMGET` across `N` users and see\n  the total elapsed time plus the per-user breakdown.\n* **Inspect** any user's full hash with field-level TTLs and the key-level\n  TTL — the right view for debugging \"why is this feature missing?\" at\n  read time.\n\nThe server holds one `RedisFeatureStore` instance and one `StreamingWorker`\nfor the lifetime of the process. Endpoints:\n\n| Endpoint                  | What it does                                                                        |\n|---------------------------|-------------------------------------------------------------------------------------|\n| `GET  /state`             | User count, TTL config, stats counters, worker status.                              |\n| `POST /bulk-load`         | Pipelined `HSET` + `EXPIRE` over N synthetic users with a chosen TTL.               |\n| `POST /worker/toggle`     | Pause / resume the streaming worker.                                                |\n| `POST /read`              | `HMGET` a chosen feature subset for one user; report latency and per-field TTLs.    |\n| `POST /batch-read`        | Pipeline `HMGET` across N users; report total latency and per-entity field counts.  |\n| `GET  /inspect`           | `HGETALL` + `HTTL` for one user; full hash view with per-field TTLs.                |\n| `POST /reset`             | Drop every user under the key prefix (used by the demo's reset button).             |"
    },
    {
      "id": "prerequisites",
      "title": "Prerequisites",
      "role": "content",
      "text": "* **Redis 7.4 or later.** [`HEXPIRE`](https://redis.io/docs/latest/commands/hexpire) and\n  [`HTTL`](https://redis.io/docs/latest/commands/httl) were added in Redis 7.4; the demo\n  relies on per-field TTL for the mixed-staleness story.\n* **Python 3.9 or later.**\n* The `redis-py` client. Install it with:\n\n  [code example]\n\n  Field-level TTL commands (`hexpire`, `httl`) were added to `redis-py` in 5.1.\n\nIf your Redis server is running elsewhere, start the demo with `--redis-host`\nand `--redis-port`."
    },
    {
      "id": "running-the-demo",
      "title": "Running the demo",
      "role": "content",
      "text": ""
    },
    {
      "id": "get-the-source-files",
      "title": "Get the source files",
      "role": "content",
      "text": "The demo consists of four Python files. Download them from the\n[`redis-py` source folder](https://github.com/redis/docs/tree/main/content/develop/use-cases/feature-store/redis-py)\non GitHub, or grab them with `curl`:\n\n[code example]"
    },
    {
      "id": "start-the-demo-server",
      "title": "Start the demo server",
      "role": "content",
      "text": "From that directory:\n\n[code example]\n\nYou should see:\n\n[code example]\n\nBy default the demo wipes the configured key prefix on startup so each run\nstarts from a clean state. Pass `--no-reset` to keep any existing data, or\n`--key-prefix <prefix>` to point the demo at a different prefix entirely.\n\nOpen [http://127.0.0.1:8085](http://127.0.0.1:8085) in a browser. Useful things\nto try:\n\n* Pick a user and click **Read features** with a mixed batch/streaming subset\n  — you'll see batch fields with no per-field TTL (covered by the key-level\n  TTL) and streaming fields with a positive per-field TTL.\n* Click **Pipeline HMGET** with `count=100` to see the latency of a 100-user\n  batch read.\n* Click **Pause / resume** on the streaming worker and leave it paused for\n  ~5 minutes (or restart the server with `--streaming-ttl-seconds 30` to\n  make it visible in seconds). Re-run **Read features** on any user and\n  watch the streaming fields disappear while the batch fields stay.\n* Click **Inspect** on a user to see the full hash with field-level TTLs.\n* Click **Bulk-load** with a short TTL (say 30 seconds) and watch the user\n  count fall to zero on the next minute — the same thing that happens if a\n  daily batch run fails to land.\n* Click **Reset** to drop every user and start over.\n\nThe server is read/write against your local Redis. The default key prefix is\n`fs:user:`. Pass `--no-reset` to keep existing data across restarts, or\n`--redis-host` / `--redis-port` to point at a different Redis."
    },
    {
      "id": "production-usage",
      "title": "Production usage",
      "role": "content",
      "text": "The guidance below focuses on the production concerns that are specific to\nrunning a feature store on Redis. For the generic redis-py production checklist\n— connection-pool sizing,\n[TLS and AUTH](https://redis.io/docs/latest/develop/clients/redis-py/connect#connect-to-your-production-redis-with-tls),\n[exception handling](https://redis.io/docs/latest/develop/clients/redis-py/produsage#exception-handling),\nand the rest — see the\n[redis-py production usage guide](https://redis.io/docs/latest/develop/clients/redis-py/produsage).\nThe feature-store demo runs against `localhost` with the defaults; a real\ndeployment should harden the client first."
    },
    {
      "id": "pick-the-batch-ttl-to-outlast-a-failed-refresher",
      "title": "Pick the batch TTL to outlast a failed refresher",
      "role": "content",
      "text": "The whole-entity `EXPIRE` is your safety net against silent staleness from a\nbroken batch pipeline. Set it longer than your worst-case batch outage so a\nsingle missed run doesn't take the feature store offline, but short enough\nthat a sustained outage causes loud failures (missing entities) rather than\nquiet ones (yesterday's features being scored as today's). The standard\nchoice is one cycle of \"expected refresh interval × 2\" — for a daily batch,\n48 hours; for a 6-hour batch, 12 hours.\n\nThe same logic applies to the per-field streaming TTL: a few times the\nexpected update interval so a slow-but-alive streaming worker doesn't churn\nfeatures needlessly, but short enough that a stalled worker causes visible\nfreshness failures."
    },
    {
      "id": "co-locate-the-online-store-with-serving-not-with-training",
      "title": "Co-locate the online store with serving, not with training",
      "role": "content",
      "text": "The online store's hash representation does *not* have to match the schema in\nyour offline store. The batch materialization step is your chance to flatten\njoins, encode categoricals, and project to whatever shape the model server\nwants — so the request path is exactly one `HMGET` and zero transforms.\n\nThe training pipeline reads from the offline store with its own schema; the\nserving pipeline reads from Redis with the flattened serving schema. Keeping\nthose two pipelines as the same code path is what prevents training-serving\nskew."
    },
    {
      "id": "pipeline-batch-reads-across-shards",
      "title": "Pipeline batch reads across shards",
      "role": "content",
      "text": "On a single Redis instance, pipelining `HMGET` across `N` users is one round\ntrip. On a Redis Cluster, the keys land on different shards — `redis-py`'s\n[`RedisCluster` client](https://redis-py.readthedocs.io/en/stable/clustering.html)\ndispatches each `HMGET` to the right shard transparently, but you still pay\none round trip per shard rather than one for the whole batch. For very\nlatency-sensitive batch inference, group users by shard slot\n(`cluster.keyslot(key)`) and issue one pipeline per shard in parallel.\n\nFor a small number of frequently-queried users (a top-N customer list, for\nexample), a hash tag like `fs:user:{vip}:u0001` forces the keys onto the same\nshard and lets one pipeline serve them all in one round trip."
    },
    {
      "id": "make-hexpire-part-of-every-streaming-write",
      "title": "Make HEXPIRE part of every streaming write",
      "role": "content",
      "text": "The single biggest correctness lever in this design is that the streaming\nwrite applies `HEXPIRE` *every time*. If a streaming worker writes a field\nwithout renewing its TTL, the field carries whatever expiry was there before\n— possibly none, possibly stale — and the mixed-staleness invariant breaks.\nKeep the `HSET` and `HEXPIRE` in the same pipeline (or, even safer, in the\nsame [Lua script](https://redis.io/docs/latest/develop/programmability/eval-intro) if\nyou don't trust the call site)."
    },
    {
      "id": "avoid-hgetall-on-the-request-path",
      "title": "Avoid HGETALL on the request path",
      "role": "content",
      "text": "`HGETALL` reads every field on the hash, including ones the model doesn't\nneed. With dozens of features per entity, that is wasted serialization work\non the server and wasted bandwidth on the wire. Always specify the field list\nexplicitly with `HMGET` in the model server.\n\nThe exception is debugging and feature-set discovery, where you genuinely\nwant the full hash. The demo's \"Inspect\" button uses `HGETALL` for exactly\nthis reason."
    },
    {
      "id": "inspect-the-store-directly-with-redis-cli",
      "title": "Inspect the store directly with redis-cli",
      "role": "content",
      "text": "When testing or troubleshooting, the cli tells you everything:\n\n[code example]\n\nA streaming field that returns `-2` from `HTTL` doesn't exist on the hash\n(either it was never written, or it expired); `-1` means the field has no\nTTL set (and is therefore covered only by the key-level `EXPIRE`); any\npositive value is the remaining TTL in seconds."
    },
    {
      "id": "learn-more",
      "title": "Learn more",
      "role": "related",
      "text": "This example uses the following Redis commands:\n\n* [`HSET`](https://redis.io/docs/latest/commands/hset) to write a feature or a whole\n  feature row in one call.\n* [`HMGET`](https://redis.io/docs/latest/commands/hmget) to retrieve any subset of\n  features for one entity in one round trip.\n* [`HGETALL`](https://redis.io/docs/latest/commands/hgetall) for debugging and\n  feature-set discovery.\n* [`HEXPIRE`](https://redis.io/docs/latest/commands/hexpire) and\n  [`HTTL`](https://redis.io/docs/latest/commands/httl) for per-field TTL on streaming\n  features (Redis 7.4+).\n* [`EXPIRE`](https://redis.io/docs/latest/commands/expire) and\n  [`TTL`](https://redis.io/docs/latest/commands/ttl) for the whole-entity TTL aligned\n  with the batch materialization cycle.\n* Pipelined `HMGET` across many entities for batch scoring with one network\n  round trip — see [Pipelining](https://redis.io/docs/latest/develop/using-commands/pipelining).\n\nSee the [`redis-py` documentation](https://redis.io/docs/latest/develop/clients/redis-py)\nfor the full client reference, and the\n[Hashes overview](https://redis.io/docs/latest/develop/data-types/hashes) for the deeper\nconceptual model — including the listpack encoding that makes small hashes\nparticularly compact in memory, which matters at feature-store scale."
    }
  ],
  "examples": [
    {
      "id": "the-feature-store-helper-ex0",
      "language": "python",
      "code": "import redis\nfrom feature_store import RedisFeatureStore\n\nr = redis.Redis(host=\"localhost\", port=6379, decode_responses=True)\nstore = RedisFeatureStore(\n    redis_client=r,\n    key_prefix=\"fs:user:\",\n    batch_ttl_seconds=24 * 60 * 60,    # whole-entity TTL aligned with the daily batch cycle\n    streaming_ttl_seconds=5 * 60,      # per-field TTL on each streaming feature\n)\n\n# Batch materialization: one HSET + EXPIRE per user, all pipelined.\nstore.bulk_load({\n    \"u0001\": {\"country_iso\": \"US\", \"risk_segment\": \"low\",\n              \"tx_count_7d\": 14, \"avg_amount_30d\": 92.40,\n              \"account_age_days\": 612, \"chargeback_count_180d\": 0},\n    \"u0002\": {\"country_iso\": \"GB\", \"risk_segment\": \"medium\",\n              \"tx_count_7d\": 47, \"avg_amount_30d\": 220.10,\n              \"account_age_days\": 1840, \"chargeback_count_180d\": 1},\n})\n\n# Streaming write: HSET + HEXPIRE on just the fields that changed.\nstore.update_streaming(\"u0001\", {\n    \"last_login_ts\": 1716998413541,\n    \"last_device_id\": \"ios-9f02\",\n    \"tx_count_5m\": 3,\n    \"failed_logins_15m\": 0,\n    \"session_country\": \"US\",\n})\n\n# Inference read: HMGET of whatever the model needs.\nfeatures = store.get_features(\"u0001\", [\n    \"risk_segment\", \"tx_count_7d\", \"avg_amount_30d\",\n    \"tx_count_5m\", \"failed_logins_15m\",\n])\n\n# Batch scoring: pipelined HMGET across many users.\nbatch = store.batch_get_features(\n    user_ids=[\"u0001\", \"u0002\", \"u0003\"],\n    field_names=[\"risk_segment\", \"tx_count_5m\", \"failed_logins_15m\"],\n)",
      "section_id": "the-feature-store-helper"
    },
    {
      "id": "data-model-ex0",
      "language": "text",
      "code": "fs:user:u0001                                   TTL = 86400 s (key-level)\n  country_iso=US                                <no field TTL>\n  risk_segment=low                              <no field TTL>\n  account_age_days=612                          <no field TTL>\n  tx_count_7d=14                                <no field TTL>\n  avg_amount_30d=92.40                          <no field TTL>\n  chargeback_count_180d=0                       <no field TTL>\n  last_login_ts=1716998413541                   TTL = 300 s (per field, HEXPIRE)\n  last_device_id=ios-9f02                       TTL = 300 s (per field, HEXPIRE)\n  tx_count_5m=3                                 TTL = 300 s (per field, HEXPIRE)\n  failed_logins_15m=0                           TTL = 300 s (per field, HEXPIRE)\n  session_country=US                            TTL = 300 s (per field, HEXPIRE)",
      "section_id": "data-model"
    },
    {
      "id": "bulk-loading-batch-features-ex0",
      "language": "python",
      "code": "def bulk_load(\n    self,\n    rows: Mapping[str, FeatureMap],\n    ttl_seconds: Optional[int] = None,\n) -> int:\n    ttl = self.batch_ttl_seconds if ttl_seconds is None else ttl_seconds\n    pipe = self.redis.pipeline(transaction=False)\n    for entity_id, fields in rows.items():\n        key = self.key_for(entity_id)\n        pipe.hset(key, mapping={k: _encode(v) for k, v in fields.items()})\n        pipe.expire(key, ttl)\n    pipe.execute()\n    ...",
      "section_id": "bulk-loading-batch-features"
    },
    {
      "id": "streaming-writes-with-per-field-ttl-ex0",
      "language": "python",
      "code": "def update_streaming(\n    self,\n    entity_id: str,\n    fields: FeatureMap,\n    ttl_seconds: Optional[int] = None,\n) -> None:\n    if not fields:\n        return\n    ttl = self.streaming_ttl_seconds if ttl_seconds is None else ttl_seconds\n    key = self.key_for(entity_id)\n    encoded = {name: _encode(value) for name, value in fields.items()}\n\n    pipe = self.redis.pipeline(transaction=False)\n    pipe.hset(key, mapping=encoded)\n    pipe.hexpire(key, ttl, *encoded.keys())\n    pipe.execute()",
      "section_id": "streaming-writes-with-per-field-ttl"
    },
    {
      "id": "inference-reads-with-hmget-ex0",
      "language": "python",
      "code": "def get_features(\n    self,\n    entity_id: str,\n    field_names: Optional[Iterable[str]] = None,\n) -> dict[str, str]:\n    key = self.key_for(entity_id)\n    if field_names is None:\n        return self.redis.hgetall(key)\n    names = list(field_names)\n    if not names:\n        return {}\n    values = self.redis.hmget(key, names)\n    return {n: v for n, v in zip(names, values) if v is not None}",
      "section_id": "inference-reads-with-hmget"
    },
    {
      "id": "batch-scoring-with-pipelined-hmget-ex0",
      "language": "python",
      "code": "def batch_get_features(\n    self,\n    entity_ids: Iterable[str],\n    field_names: Iterable[str],\n) -> dict[str, dict[str, str]]:\n    ids = list(entity_ids)\n    names = list(field_names)\n    if not ids or not names:\n        return {}\n\n    pipe = self.redis.pipeline(transaction=False)\n    for entity_id in ids:\n        pipe.hmget(self.key_for(entity_id), names)\n    rows = pipe.execute()\n\n    out: dict[str, dict[str, str]] = {}\n    for entity_id, values in zip(ids, rows):\n        out[entity_id] = {n: v for n, v in zip(names, values) if v is not None}\n    return out",
      "section_id": "batch-scoring-with-pipelined-hmget"
    },
    {
      "id": "the-streaming-worker-ex0",
      "language": "python",
      "code": "def _tick(self) -> None:\n    ids = self.store.list_entity_ids(limit=500)\n    if not ids:\n        return\n    chosen = self._rng.sample(ids, k=min(self.users_per_tick, len(ids)))\n    now_ms = int(time.time() * 1000)\n    for entity_id in chosen:\n        fields = {\n            \"last_login_ts\": now_ms,\n            \"last_device_id\": self._rng.choice(DEVICE_IDS),\n            \"tx_count_5m\": self._rng.randint(0, 12),\n            \"failed_logins_15m\": self._rng.choices(\n                (0, 1, 2, 5), weights=(70, 20, 8, 2), k=1,\n            )[0],\n            \"session_country\": self._rng.choice(SESSION_COUNTRIES),\n        }\n        self.store.update_streaming(entity_id, fields)",
      "section_id": "the-streaming-worker"
    },
    {
      "id": "the-batch-builder-ex0",
      "language": "python",
      "code": "def synthesize_users(count: int, seed: int = 42) -> dict[str, dict]:\n    rng = random.Random(seed)\n    users: dict[str, dict] = {}\n    for i in range(1, count + 1):\n        uid = f\"u{i:04d}\"\n        users[uid] = {\n            \"country_iso\": rng.choice(COUNTRY_CHOICES),\n            \"risk_segment\": rng.choices(\n                RISK_SEGMENTS, weights=(70, 25, 5), k=1,\n            )[0],\n            \"account_age_days\": rng.randint(7, 2400),\n            \"tx_count_7d\": rng.randint(0, 80),\n            \"avg_amount_30d\": round(rng.uniform(5, 350), 2),\n            \"chargeback_count_180d\": rng.choices(\n                (0, 1, 2, 3), weights=(85, 10, 4, 1), k=1,\n            )[0],\n        }\n    return users",
      "section_id": "the-batch-builder"
    },
    {
      "id": "the-batch-builder-ex1",
      "language": "bash",
      "code": "python3 build_features.py --count 500 --ttl-seconds 3600",
      "section_id": "the-batch-builder"
    },
    {
      "id": "prerequisites-ex0",
      "language": "bash",
      "code": "pip install \"redis>=5.1\"",
      "section_id": "prerequisites"
    },
    {
      "id": "get-the-source-files-ex0",
      "language": "bash",
      "code": "mkdir feature-store-demo && cd feature-store-demo\nBASE=https://raw.githubusercontent.com/redis/docs/main/content/develop/use-cases/feature-store/redis-py\ncurl -O $BASE/feature_store.py\ncurl -O $BASE/build_features.py\ncurl -O $BASE/streaming_worker.py\ncurl -O $BASE/demo_server.py",
      "section_id": "get-the-source-files"
    },
    {
      "id": "start-the-demo-server-ex0",
      "language": "bash",
      "code": "python3 demo_server.py",
      "section_id": "start-the-demo-server"
    },
    {
      "id": "start-the-demo-server-ex1",
      "language": "text",
      "code": "Dropping any existing users under 'fs:user:*' for a clean demo run (pass --no-reset to keep them).\nRedis feature-store demo server listening on http://127.0.0.1:8085\nUsing Redis at localhost:6379 with key prefix 'fs:user:' (batch TTL 86400s, streaming TTL 300s)\nMaterialized 200 user(s); streaming worker running.",
      "section_id": "start-the-demo-server"
    },
    {
      "id": "inspect-the-store-directly-with-redis-cli-ex0",
      "language": "bash",
      "code": "# How many users currently in the store\nredis-cli --scan --pattern 'fs:user:*' | wc -l\n\n# One user's full hash and key-level TTL\nredis-cli HGETALL fs:user:u0001\nredis-cli TTL    fs:user:u0001\n\n# Per-field TTL on the streaming fields\nredis-cli HTTL fs:user:u0001 FIELDS 5 \\\n  last_login_ts last_device_id tx_count_5m failed_logins_15m session_country\n\n# Sample HMGET as the model would issue it\nredis-cli HMGET fs:user:u0001 risk_segment tx_count_7d avg_amount_30d tx_count_5m",
      "section_id": "inspect-the-store-directly-with-redis-cli"
    }
  ]
}
