replication: fix potential panic during upgrades #17476

Merged: 2 commits merged into main on Jun 12, 2023

tgross (Member) commented Jun 9, 2023

If the authoritative region has been upgraded to a version of Nomad that has new replicated objects (such as ACL Auth Methods, ACL Binding Rules, etc.), the non-authoritative regions will start replicating those objects as soon as their leader is upgraded. If a server in the non-authoritative region is upgraded and then becomes the leader before all the other servers in the region have been upgraded, then it will attempt to write a Raft log entry that the followers don't understand. The followers will then panic.
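The failure mode described above can be illustrated with a toy model. This is only a sketch, not Nomad's actual FSM code: the `MessageType` constants, `oldFollowerApply`, and `tryApply` are hypothetical names invented for the illustration. It assumes an older follower dispatches on a log entry type and panics on any type it does not recognize.

```go
package main

import "fmt"

// MessageType tags each replicated log entry. The names and values here are
// illustrative only; they are not Nomad's real constants.
type MessageType uint8

const (
	ACLPolicyUpsert     MessageType = iota // known to old and new servers
	ACLAuthMethodUpsert                    // known only to upgraded servers
)

// oldFollowerApply models a follower still running the older version. It has
// no case for entry types introduced later, so it panics when it sees one.
func oldFollowerApply(t MessageType) string {
	switch t {
	case ACLPolicyUpsert:
		return "applied ACL policy"
	default:
		panic(fmt.Sprintf("failed to apply request: unknown type %d", t))
	}
}

// tryApply recovers the panic so the demonstration can report it instead of
// crashing. A real server process would simply go down.
func tryApply(t MessageType) (result string, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	return oldFollowerApply(t), false
}

func main() {
	fmt.Println(tryApply(ACLPolicyUpsert))     // entry the old follower understands
	fmt.Println(tryApply(ACLAuthMethodUpsert)) // entry written by an upgraded leader
}
```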

Add the same minimum version checks that we do for RPC writes to the leader's replication loop.
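As a rough sketch of what such a gate looks like (not the code in this PR), the leader can skip a replication pass until every server in the region reports at least the minimum version. All names below (`parseVersion`, `allServersMeetMinimum`, `replicateOnce`) and the `"1.5.0"` threshold are assumptions made for the example, and the version parsing is deliberately simplified to plain `major.minor.patch` strings.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion turns "1.5.0" into [1 5 0]. Simplified for illustration:
// pre-release and build metadata suffixes are not handled.
func parseVersion(s string) ([3]int, error) {
	var v [3]int
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return v, fmt.Errorf("invalid version %q", s)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return v, fmt.Errorf("invalid version %q: %w", s, err)
		}
		v[i] = n
	}
	return v, nil
}

// less reports whether version a sorts strictly before version b.
func less(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// allServersMeetMinimum reports whether every server version is at least
// minVersion. Any unparseable version counts as not meeting the minimum.
func allServersMeetMinimum(serverVersions []string, minVersion string) bool {
	minV, err := parseVersion(minVersion)
	if err != nil {
		return false
	}
	for _, s := range serverVersions {
		v, err := parseVersion(s)
		if err != nil || less(v, minV) {
			return false
		}
	}
	return true
}

// replicateOnce models one pass of a leader's replication loop (hypothetical).
// It calls write only when all servers can understand the resulting log entry.
func replicateOnce(serverVersions []string, minVersion string, write func()) bool {
	if !allServersMeetMinimum(serverVersions, minVersion) {
		return false // skip this pass; try again on the next iteration
	}
	write()
	return true
}

func main() {
	write := func() { fmt.Println("wrote replicated object to Raft") }
	mixed := []string{"1.5.6", "1.4.10", "1.4.10"}   // leader upgraded, followers not
	upgraded := []string{"1.5.6", "1.5.6", "1.5.6"} // whole region upgraded
	fmt.Println(replicateOnce(mixed, "1.5.0", write))
	fmt.Println(replicateOnce(upgraded, "1.5.0", write))
}
```

Skipping the pass rather than returning an error keeps the loop retrying, so replication starts on its own once the last server in the region is upgraded.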


This will need to be backported to 1.5.x, but hand-picked over to 1.4.x because some of these changes exist only in 1.5.x.

Review thread on nomad/leader.go (outdated, resolved)

jrasell (Member) left a comment:

LGTM, thanks!

tgross added the backport/1.5.x label Jun 12, 2023
tgross merged commit cff3c9b into main Jun 12, 2023
24 of 25 checks passed
tgross deleted the auth-object-replication-panic branch June 12, 2023 12:53

tgross (Member, Author) commented Jun 12, 2023

Automatically backported to 1.5.x and hand-backported to 1.4.x.

Labels: backport/1.5.x, theme/crash, type/bug