How does endpoint failover happen in case of node failure?

Last updated 22 Mar 2024

Question

How does endpoint failover happen in case of node failure?

Answer

Endpoint failover behaves the same way under the all-master-shards policy and the all-nodes policy. When a node dies completely, its master shards fail over to other nodes and the endpoint on the dead node is removed: it is no longer reachable and no longer published via DNS, so client DNS queries will not return that endpoint. What differs between a cloud profile and a local network profile is how quickly a failure is detected, as listed below.
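Because the dead node's endpoint disappears from DNS, a client benefits from resolving the database hostname again on every reconnect attempt instead of reusing a cached address. The following is a minimal sketch of that idea using only the Python standard library. The function name `connect_with_reresolve`, its parameters, and the retry values are illustrative assumptions and are not part of any Redis client API.

```python
import socket
import time


def connect_with_reresolve(host, port, attempts=5, delay=0.5,
                           resolve=socket.getaddrinfo,
                           connect=socket.create_connection):
    """Connect to host:port, looking the name up again before each attempt.

    Hypothetical helper: `resolve` and `connect` are injectable so the
    retry logic can be exercised without a real network.
    """
    last_error = None
    for _attempt in range(attempts):
        try:
            # Fresh DNS lookup, so an endpoint removed after a node failure
            # is not retried from a stale cached address.
            infos = resolve(host, port, type=socket.SOCK_STREAM)
        except OSError as exc:
            last_error = exc
            infos = []
        for info in infos:
            sockaddr = info[4]  # (ip, port) or (ip, port, flowinfo, scope_id)
            try:
                return connect(sockaddr[:2], timeout=1.0)
            except OSError as exc:
                last_error = exc
        time.sleep(delay)
    raise ConnectionError(f"no reachable endpoint for {host}:{port}") from last_error
```

In practice, many client libraries already offer reconnect and retry settings. The point here is only that the hostname lookup happens inside the retry loop rather than once at startup.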

Cloud

  • The proxy process is failed over after 3 failed checks (a check counts as failed if there is no response within 15 ms).
  • A master shard is failed over after 4 failed checks (a check counts as failed if there is no response within 15 ms).
  • A slave shard is declared dead after 4 failed checks (a check counts as failed if there is no response within 20 ms).

Local network

  • The proxy process is failed over after 2 failed checks (a check counts as failed if there is no response within 10 ms).
  • A master shard is failed over after 2 failed checks (a check counts as failed if there is no response within 5 ms).
  • A slave shard is declared dead after 2 failed checks (a check counts as failed if there is no response within 5 ms).
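To compare the two profiles, the thresholds above can be turned into a rough floor on detection time: failed checks multiplied by the per-check timeout. The sketch below uses the figures exactly as listed in this article. It assumes that checks run back to back and that each one waits out its full timeout. The interval between checks is not stated here, so real detection times will be longer. The names `Threshold`, `PROFILES`, and `min_detection_ms` are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    failed_checks: int  # consecutive failed checks before action is taken
    timeout_ms: int     # a check fails if there is no response within this time

    def min_detection_ms(self) -> int:
        # Rough floor only: ignores the (unstated) interval between checks.
        return self.failed_checks * self.timeout_ms


# Values copied from the Cloud and Local network lists above.
PROFILES = {
    "cloud": {
        "proxy": Threshold(3, 15),
        "master_shard": Threshold(4, 15),
        "slave_shard": Threshold(4, 20),
    },
    "local_network": {
        "proxy": Threshold(2, 10),
        "master_shard": Threshold(2, 5),
        "slave_shard": Threshold(2, 5),
    },
}

if __name__ == "__main__":
    for profile, components in PROFILES.items():
        for component, threshold in components.items():
            print(f"{profile:14} {component:13} >= {threshold.min_detection_ms()} ms")
```

Under these assumptions the cloud profile tolerates more and longer missed checks before acting, which makes it slower to fail over than the local network profile but less likely to react to brief network jitter.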