How Long Does It Take to Recover a Large Database From Persistence (RDB/AOF)?

Last updated 18, Apr 2024

Question

How long does it take to recover a large (multi-TB) cache from persistence (RDB/AOF) after a major crash, such as the loss of an entire region or data center (DC)?

Answer

If an entire DC is unavailable, recovering the Redis Enterprise deployment itself (installing and configuring the cluster) is likely a tiny part of the overall RTO. Setting that aside, Redis Enterprise is a distributed database, so the aggregate size of the data is not the only relevant factor. Each shard recovers from its own RDB and/or AOF file in parallel, so the size of the individual shards matters as well as the total size of the dataset. How long recovery takes depends on several factors:

  • Primarily, the data structures in use and the AOF rewrite settings that were configured.
  • Whether the cluster uses Auto Tiering. If it does, the shards might be twice the size, which heavily influences the recovery time.
  • Whether the shards are restored from AOF or RDB. In some cases, restoring from AOF can be significantly slower than restoring from RDB (on the order of hours instead of minutes).
  • The performance of the storage devices.
  • Whether the volume is locally attached or remote (if it is remote, the network is also a factor).
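
To make the effect of parallel shard recovery concrete, here is a rough back-of-the-envelope sketch in Python. It is not a Redis Enterprise tool or formula. The Shard class, the function names, and the load rates are all hypothetical, and the model assumes that shards do not compete for the same disk or network bandwidth. The effective load rate is a single number standing in for the factors listed above (data structures, AOF versus RDB, storage performance, local versus remote volumes) and would have to be measured in your own environment.

```python
from dataclasses import dataclass


@dataclass
class Shard:
    """Hypothetical description of one shard's persistence file."""
    name: str
    persisted_gb: float    # size of the RDB/AOF file to load
    load_rate_mb_s: float  # assumed effective load rate, measured by you


def shard_recovery_seconds(shard: Shard) -> float:
    """Time to load one shard: file size divided by effective load rate."""
    return shard.persisted_gb * 1024 / shard.load_rate_mb_s


def cluster_recovery_seconds(shards: list[Shard]) -> float:
    """Shards load in parallel, so the slowest shard sets the total.

    Assumes no contention between shards for disk or network bandwidth.
    """
    return max(shard_recovery_seconds(s) for s in shards)


# Illustrative numbers only: about 1 TB spread over 40 shards of 25 GB each.
rdb_shards = [Shard(f"shard-{i}", 25.0, 100.0) for i in range(40)]
# Same data, but with a much lower assumed load rate (for example, a slow AOF replay).
aof_shards = [Shard(f"shard-{i}", 25.0, 5.0) for i in range(40)]

print(f"RDB-like estimate: {cluster_recovery_seconds(rdb_shards) / 60:.1f} min")
print(f"AOF-like estimate: {cluster_recovery_seconds(aof_shards) / 60:.1f} min")
```

Under these assumptions the estimate depends on the largest shard and its load rate, not on the total dataset size, which is why the number and size of shards matter as much as the aggregate data volume.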

References