documentation improvements from code review, changelog
tgross committed Dec 2, 2021
1 parent 2f35406 commit 4db2f53
Showing 3 changed files with 96 additions and 57 deletions.
7 changes: 7 additions & 0 deletions .changelog/11572.txt
@@ -0,0 +1,7 @@
```release-note:improvement
raft: The default raft protocol version is now 3.
```

```release-note:deprecation
Raft protocol version 2 is deprecated and will be removed in Nomad 1.4.0.
```
80 changes: 80 additions & 0 deletions website/content/docs/upgrade/index.mdx
@@ -153,3 +153,83 @@ differences may require specific steps.
[node-status]: /docs/commands/node/status
[server-members]: /docs/commands/server/members
[upgrade-specific]: /docs/upgrade/upgrade-specific

## Upgrading to Raft Protocol 3

This section provides details on upgrading to Raft protocol version 3.
Raft protocol version 3 requires all servers to run Nomad 0.8.0 or
newer. Raft protocol version 2 will be removed in Nomad 1.4.0.

To see the version of the Raft protocol in use on each server, use the
`nomad operator raft list-peers` command.
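
As a rough sketch of how that check could be automated, the helper below
reads `list-peers` style output on stdin and succeeds only if every peer
reports the wanted version. The function name is hypothetical, and the
sketch assumes a whitespace-separated table whose header names the version
column `RaftProtocol`; pass a different column name as the second argument
if your output differs.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the Nomad CLI.
# Reads `nomad operator raft list-peers` style output on stdin.
# Assumption: the first line is a header row and the Raft version column
# is named "RaftProtocol" (override with the second argument).
all_peers_on_raft_version() {
  local want="$1" column="${2:-RaftProtocol}"
  awk -v want="$want" -v column="$column" '
    NR == 1 {
      # Locate the version column by its header name.
      for (i = 1; i <= NF; i++) if ($i == column) col = i
      if (!col) {
        print "column " column " not found" > "/dev/stderr"
        bad = 1
        exit
      }
      next
    }
    NF == 0 { next }                  # skip blank lines
    $col != want {                    # report any peer on another version
      print $1 " is on Raft protocol " $col
      bad = 1
    }
    END { exit bad + 0 }
  '
}
```

It could be used as `nomad operator raft list-peers | all_peers_on_raft_version 3`,
which prints any lagging peers and returns a non-zero status if it finds one.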

Note that the format of `peers.json` used for outage recovery is
different when running with the latest Raft protocol. See [Manual
Recovery Using
peers.json](https://learn.hashicorp.com/tutorials/nomad/outage-recovery#manual-recovery-using-peersjson)
for a description of the required format.

When using Raft protocol version 3, servers are identified by their
`node-id` instead of their IP address when Nomad makes changes to its
internal Raft quorum configuration. This means that once a cluster has
been upgraded with servers all running Raft protocol version 3, it
will no longer allow servers running any older Raft protocol versions
to be added.

### Upgrading a Production Cluster to Raft Version 3

For production Raft clusters with 3 or more members, the easiest way
to upgrade servers is to have each server leave the cluster, upgrade
its [`raft_protocol`] version in the `server` stanza, and then add it
back. Make sure the new server joins successfully and that the cluster
is stable before rolling the upgrade forward to the next server. It's
also possible to stand up a new set of servers, and then slowly stand
down each of the older servers in a similar fashion.

For in-place raft protocol upgrades, perform the following for each
server, leaving the leader until last to reduce the chance of leader
elections that will slow down the process:

* Stop the server.
* Run `nomad server force-leave $server_name`.
* Update the `raft_protocol` in the server's configuration file to 3.
* Restart the server.
* Run `nomad operator raft list-peers` to verify that the `raft_vsn`
  for the server is now 3.
* On the server, run `nomad agent-info` and check that the
  `last_log_index` is of a similar value to the other servers. This
  step ensures that Raft is healthy and changes are replicating to the
  new server.
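
The final check in the list above can be sketched in shell. Both function
names are hypothetical, the sketch assumes `nomad agent-info` prints a line
of the form `last_log_index = <number>`, and the default tolerance of 10
entries is an arbitrary illustration, not a recommended value.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the Nomad CLI.

# Extract last_log_index from `nomad agent-info` text on stdin.
# Assumption: the output contains a line like "last_log_index = 120".
last_log_index() {
  awk '$1 == "last_log_index" && $2 == "=" { print $3; exit }'
}

# Succeed if two log indexes differ by at most max_lag entries.
# The default max_lag of 10 is an illustrative value only.
index_is_close() {
  local a="$1" b="$2" max_lag="${3:-10}"
  local diff=$(( a > b ? a - b : b - a ))
  [ "$diff" -le "$max_lag" ]
}

# Example usage (the second address is a placeholder):
#   upgraded=$(nomad agent-info | last_log_index)
#   other=$(nomad agent-info -address=http://other-server:4646 | last_log_index)
#   index_is_close "$upgraded" "$other" || echo "upgraded server is lagging"
```

A small tolerance is used rather than strict equality because the index
keeps advancing on a busy cluster between the two reads.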

### Upgrading a Single Server Cluster to Raft Version 3

If you are running a single Nomad server, restarting it in-place will
result in that server not being able to elect itself as a leader. To
avoid this, create a new [`peers.json`][peers-json] file before
restarting the server with the new configuration. If you have `jq`
installed you can run the following script on the server's host to
write the correct `peers.json` file. The script queries the running
agent, so run it before stopping the server:

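```shell
#!/usr/bin/env bash
# Read the data directory and the leader address from the running agent.
# In a single-server cluster the leader is this server.
NOMAD_DATA_DIR=$(nomad agent-info -json | jq -r '.config.DataDir')
NOMAD_ADDR=$(nomad agent-info -json | jq -r '.stats.nomad.leader_addr')
# The node ID is stored in the server's data directory.
NODE_ID=$(cat "$NOMAD_DATA_DIR/server/node-id")
# Write a peers.json that maps the node ID to the server's address.
cat <<EOF > "$NOMAD_DATA_DIR/server/raft/peers.json"
[
  {
    "id": "$NODE_ID",
    "address": "$NOMAD_ADDR",
    "non_voter": false
  }
]
EOF
```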

After running this script, update the `raft_protocol` in the server's
configuration to 3 and restart the server.
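
Before restarting, it may be worth confirming that the script above
gathered sensible values, since an empty node ID or address would produce
an unusable `peers.json`. The checks below are a minimal sketch with
hypothetical function names. They assume the node ID is a UUID-shaped
string and the address is a `host:port` pair with an IPv4 address or
hostname; a bracketed IPv6 address would need a different pattern.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight checks, not part of the Nomad CLI.

# Assumption: the node ID looks like a UUID (8-4-4-4-12 hex digits).
valid_node_id() {
  local re='^[0-9a-fA-F]{8}-([0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}$'
  [[ "$1" =~ $re ]]
}

# Assumption: the address is "host:port" with no whitespace in the host.
valid_peer_address() {
  local re='^[^[:space:]:]+:[0-9]+$'
  [[ "$1" =~ $re ]]
}

# Example usage with the variables from the script above:
#   valid_node_id "$NODE_ID" || echo "unexpected node ID: '$NODE_ID'"
#   valid_peer_address "$NOMAD_ADDR" || echo "unexpected address: '$NOMAD_ADDR'"
```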

[peers-json]: https://learn.hashicorp.com/tutorials/nomad/outage-recovery#manual-recovery-using-peersjson
66 changes: 9 additions & 57 deletions website/content/docs/upgrade/upgrade-specific.mdx
@@ -15,12 +15,15 @@ used to document those details separately from the standard upgrade flow.

## Nomad 1.3.0

#### Default Raft Protocol Version
#### Raft Protocol Version 2 Deprecation

In Nomad 1.3.0, the default raft protocol version has been updated
to 3. If the [`raft_protocol_version`] is not explicitly set,
upgrading a server will automatically upgrade that server's raft
protocol. See the [Upgrading to Raft Protocol 3] guide below.
Raft protocol version 2 is deprecated and will be removed in the next
major release, Nomad 1.4.0.

In Nomad 1.3.0, the default Raft protocol version has been updated to
3. If [`raft_protocol`] is not explicitly set, upgrading a server will
automatically upgrade that server's Raft protocol. See the
[Upgrading to Raft Protocol 3] guide.

## Nomad 1.2.2

@@ -973,57 +976,6 @@ In order to enable all
servers in a Nomad cluster must be running with Raft protocol version 3 or
later.

#### Upgrading to Raft Protocol 3

This section provides details on upgrading to Raft Protocol 3 in Nomad 0.8 and
higher. Raft protocol version 3 requires Nomad running 0.8.0 or newer on all
servers in order to work. See [Raft Protocol Version
Compatibility](/docs/upgrade/upgrade-specific#raft-protocol-version-compatibility)
for more details. Also the format of `peers.json` used for outage recovery is
different when running with the latest Raft protocol. See [Manual Recovery Using
peers.json](https://learn.hashicorp.com/tutorials/nomad/outage-recovery#manual-recovery-using-peersjson)
for a description of the required format.

Please note that the Raft protocol is different from Nomad's internal protocol
as shown in commands like `nomad server members`. To see the version of the Raft
protocol in use on each server, use the `nomad operator raft list-peers`
command.

When using Raft protocol version 3, servers are identified by their `node-id`
instead of their IP address when Nomad makes changes to its internal Raft quorum
configuration. This means that once a cluster has been upgraded with servers all
running Raft protocol version 3, it will no longer allow servers running any
older Raft protocol versions to be added.

~> **Warning:** If you are running a single Nomad server, restarting it
in-place will result in that server not being able to elect itself as
a leader. To avoid this, either set the Raft protocol back to 2, or
use [Manual Recovery Using
peers.json](https://learn.hashicorp.com/tutorials/nomad/outage-recovery#manual-recovery-using-peersjson)
to map the server to its node ID in the Raft quorum configuration.

The easiest way to upgrade servers is to have each server leave the cluster,
upgrade its [`raft_protocol`] version in the `server` stanza, and then add it
back. Make sure the new server joins successfully and that the cluster is stable
before rolling the upgrade forward to the next server. It's also possible to
stand up a new set of servers, and then slowly stand down each of the older
servers in a similar fashion.

For in-place raft protocol upgrades, perform the following for each
server, leaving the leader until last to reduce the chance of leader
elections that will slow down the process:

* Stop the server
* Run `nomad server force-leave $server_name`
* Update the `raft_protocol` in the server's configuration file to 3.
* Restart the server
* Run `nomad operator raft list-peers` to verify that the `raft_vsn`
for the server is now 3.
* On the server, run `nomad agent-info` and check that the
`last_log_index` is of a similar value to the other servers. This
step ensures that raft is healthy and changes are replicating to the
new server.

### Node Draining Improvements

Node draining via the [`node drain`][drain-cli] command or the [drain
@@ -1243,4 +1195,4 @@ deleted and then Nomad 0.3.0 can be launched.
[cap_add_exec]: /docs/drivers/exec#cap_add
[cap_drop_exec]: /docs/drivers/exec#cap_drop
[`log_file`]: /docs/configuration#log_file
[Upgrading to Raft Protocol 3]: /docs/upgrade/upgrade-specific#upgrading-to-raft-protocol-3
[Upgrading to Raft Protocol 3]: /docs/upgrade#upgrading-to-raft-protocol-3
