Server with raft protocol 2 joining cluster with raft protocol 3 #11867

Closed
alexiri opened this issue Jan 17, 2022 · 2 comments · Fixed by #12362
Labels: stage/accepted (Confirmed, and intend to work on. No timeline commitment though.), theme/raft, type/bug

Comments

alexiri (Contributor) commented Jan 17, 2022

Nomad version

1.2.3

Operating system and Environment details

CentOS Stream 8

Issue

In my configuration, I set raft_protocol to 3 some time ago to upgrade all my servers. Now that I'm upgrading to 1.2.3 I wanted to remove that hardcoded version in case at some point there's a version 4. I removed it from one server and I was surprised to see it rejoin the cluster with a protocol of 2:

Node       ID                 Address      State     Voter  RaftProtocol
XXX27      ec4c9e05xxx        [xxx]:4647   leader    true   3
XXX28      f57cdac1xxx        [xxx]:4647   follower  true   3
XXX29      cecc6acdxxx        [xxx]:4647   follower  true   3
XXX30      03ee1ddbxxx        [xxx]:4647   follower  true   2
XXX31      c7da621cxxx        [xxx]:4647   follower  true   3

If I understand the documentation correctly, this is not supposed to be possible:

This means that once a cluster has been upgraded with servers all running Raft protocol version 3, it will no longer allow servers running any older Raft protocol versions to be added.

(https://www.nomadproject.io/docs/upgrade/upgrade-specific#upgrading-to-raft-protocol-3)

Reproduction steps

  1. Create cluster with raft_protocol=3
  2. Remove raft_protocol configuration from one node
  3. Node rejoins the cluster with raft_protocol=2

Expected Result

I expected node XXX30 to rejoin the cluster with raft_protocol=3 even though that option was no longer forced by its configuration.

Actual Result

XXX30 rejoins with raft_protocol=2, even though the rest of the cluster is using 3.
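
A mismatch like the one in the peer listing above can be spotted mechanically. The following is a minimal Python sketch, assuming the table format shown in this issue (whitespace-separated columns with a header row containing Node and RaftProtocol). The function name find_lagging_peers and the constant PEERS are hypothetical and not part of Nomad.

```python
# Peer listing copied from this issue; the format is assumed to be
# whitespace-separated columns with a single header row.
PEERS = """
Node       ID                 Address      State     Voter  RaftProtocol
XXX27      ec4c9e05xxx        [xxx]:4647   leader    true   3
XXX28      f57cdac1xxx        [xxx]:4647   follower  true   3
XXX29      cecc6acdxxx        [xxx]:4647   follower  true   3
XXX30      03ee1ddbxxx        [xxx]:4647   follower  true   2
XXX31      c7da621cxxx        [xxx]:4647   follower  true   3
"""


def find_lagging_peers(table: str) -> list[tuple[str, int]]:
    """Return (node, raft_protocol) for peers below the highest version seen."""
    rows = [line.split() for line in table.strip().splitlines()]
    if len(rows) < 2:
        return []  # header only, or empty input: nothing to compare
    header, peers = rows[0], rows[1:]
    node_col = header.index("Node")
    proto_col = header.index("RaftProtocol")
    versions = [(row[node_col], int(row[proto_col])) for row in peers]
    highest = max(version for _, version in versions)
    return [(node, version) for node, version in versions if version < highest]


if __name__ == "__main__":
    print(find_lagging_peers(PEERS))  # prints [('XXX30', 2)]
```

Comparing against the highest version seen, rather than a hardcoded 3, keeps the check usable if a later protocol version is introduced.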

tgross (Member) commented Jan 19, 2022

@alexiri it does seem like a bug that the v2 server was not rejected. However, the default value of raft_protocol is currently 2, so a server with the setting removed falls back to 2. The default will be 3 after #11572 lands (which is scheduled for Nomad 1.3.0).
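
The rule quoted from the upgrade documentation can be written down as a small admission check. This is an illustrative Python sketch only, not Nomad's implementation; the function may_join and its parameters are hypothetical names introduced here.

```python
def may_join(cluster_versions: list[int], joining_version: int) -> bool:
    """Sketch of the documented rule, under the assumptions stated above.

    Once every existing server runs Raft protocol 3 (or newer), a server
    running an older protocol version should be refused.
    """
    fully_upgraded = bool(cluster_versions) and all(v >= 3 for v in cluster_versions)
    if fully_upgraded:
        return joining_version >= 3
    # Mixed or empty cluster: the documented restriction does not apply yet.
    return True
```

Under this rule, the situation reported in this issue (four servers on protocol 3, one rejoining with protocol 2) would give may_join([3, 3, 3, 3], 2) == False.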

tgross added the stage/accepted and theme/raft labels Jan 19, 2022
tgross added this to Needs Triage in Nomad - Community Issues Triage via automation Jan 19, 2022
tgross moved this from Needs Triage to Needs Roadmapping in Nomad - Community Issues Triage Jan 19, 2022
lgfa29 self-assigned this Feb 16, 2022
Nomad - Community Issues Triage automation moved this from Needs Roadmapping to Done Mar 24, 2022
github-actions bot locked as resolved and limited conversation to collaborators Oct 10, 2022