diff --git a/commands/cluster-setslot.md b/commands/cluster-setslot.md
index c41e21ae..85a8dda2 100644
--- a/commands/cluster-setslot.md
+++ b/commands/cluster-setslot.md
@@ -79,3 +79,14 @@ Notes:
   If the source node is informed before the destination node and the destination node crashes before it is set as new slot owner, the slot is left with no owner, even after a successful failover.
 * Step 6, sending `SETSLOT` to the nodes not involved in the resharding, is not technically necessary since the configuration will eventually propagate itself.
   However, it is a good idea to do so in order to stop nodes from pointing to the wrong node for the hash slot moved as soon as possible, resulting in less redirections to find the right node.
+* Starting from Valkey 8.0, `CLUSTER SETSLOT` is synchronously replicated to all healthy replicas
+  running Valkey version 8.0+. By default, this synchronous replication must complete within 2 seconds.
+  If the replication fails, the primary does not execute the command, and the client receives a
+  `NOREPLICAS Not enough good replicas to write` error. Operators can retry the command or customize the
+  timeout using the `TIMEOUT` parameter to further increase the reliability of live reconfiguration:
+
+  ```
+  CLUSTER SETSLOT slot [MIGRATING|IMPORTING|NODE] node-id [TIMEOUT timeout]
+  ```
+
+  Here, `timeout` is measured in seconds, with 0 meaning to wait indefinitely.
diff --git a/topics/cluster-spec.md b/topics/cluster-spec.md
index b3cea314..96af83b9 100644
--- a/topics/cluster-spec.md
+++ b/topics/cluster-spec.md
@@ -403,7 +403,7 @@ without redirections, proxies or other single point of failure entities.
 A client **must be also able to handle -ASK redirections** that are described
 later in this document, otherwise it is not a complete Valkey Cluster client.

-### Live reconfiguration
+### Live resharding

 Valkey Cluster supports the ability to add and remove nodes while the cluster is running.
 Adding or removing a node is abstracted into the same
@@ -513,6 +513,36 @@ set the slots to their normal state again. The same command is usually
 sent to all other nodes to avoid waiting for the natural
 propagation of the new configuration across the cluster.

+#### Replication of `CLUSTER SETSLOT`
+
+Starting from Valkey 8.0, the `CLUSTER SETSLOT` command is replicated if the replicas are running Valkey version 8.0+.
+The primary node waits up to 2 seconds, by default, for all healthy replicas to acknowledge the replication.
+If not all healthy replicas acknowledge the replication within this time frame, the primary aborts the command,
+and the client receives a `NOREPLICAS Not enough good replicas to write` error.
+Operators can retry the command or customize the timeout using the `TIMEOUT` parameter to further increase the
+reliability of live resharding:
+
+    CLUSTER SETSLOT slot [MIGRATING|IMPORTING|NODE] node-id [TIMEOUT timeout]
+
+The `timeout` is specified in seconds, where a value of 0 indicates an indefinite wait time.
+
+Replicating the slot information and ensuring acknowledgement from healthy replicas significantly reduces
+the likelihood of losing replication states if the primary fails after executing the command.
+For example, consider a scenario where the target primary node `B` is finalizing a slot migration.
+Before the `SETSLOT` command is replicated to its replica node `B’`, `B` might send a cluster `PONG`
+message to the source primary node `A`, prompting `A` to relinquish its ownership of the slot in question.
+If `B` crashes right after this point, the replica node `B’`, which could be elected as the new primary,
+would not be aware of the slot ownership transfer without the successful replication of `SETSLOT`.
+This would leave the slot without an owner, leading to potential data loss and cluster topology inconsistency.
+
+#### Election in empty shards
+
+Starting from Valkey 8.0, Valkey Cluster can elect a primary in empty shards.
+This behavior ensures that even when a shard is in the process of receiving its first slot,
+a primary can be elected. This prevents scenarios where there would be no primary available in the
+empty shard to handle redirected requests from the official slot owner,
+thereby maintaining availability during live resharding.
+
 ### ASK redirection

 In the previous section, we briefly talked about ASK redirection. Why can't
@@ -550,6 +580,16 @@ Slots migration is explained in similar terms but with different wording
 (for the sake of redundancy in the documentation) in the `CLUSTER SETSLOT`
 command documentation.

+Starting from Valkey 8.0, when the primary in either the source or target shard fails during live resharding,
+the primary in the other shard will automatically attempt to update its migrating/importing state to correctly pair
+with the newly elected primary. If this update is successful, the ASK redirection will continue functioning without
+requiring administrator intervention. If the slot migration fails, administrators can manually resume
+the interrupted slot migration by running the command `valkey-cli --cluster fix `.
+
+Additionally, since Valkey 8.0, replicas can return `ASK` redirects during slot migrations.
+This capability was previously unavailable, as replicas were not aware of ongoing slot migrations in earlier versions.
+See the [READONLY](../commands/readonly.md) command.
+
 ### Client connections and redirection handling

 To be efficient, Valkey Cluster clients maintain a map of the current slot