Guard from concurrent elections #21361
006353b to 62aa52f
/dt
skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51446#0190a954-a6c1-44e8-bbf6-044ccd4e04d1:
DT failure: #21376
62aa52f to a4e6535
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/51524#0190b743-1966-448c-b79e-880fdf3b39f0
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/51591#0190bcb2-9e1d-4673-96da-8dac9e306a68
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/51635#0190c012-df11-4b80-9bf3-be0fadc9ac92
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/51726#0190c642-c227-4057-b75f-39430f3c9822
a4e6535 to deabb87
I removed the |
src/v/raft/vote_stm.cc
Outdated
@@ -479,4 +479,13 @@ ss::future<> vote_stm::self_vote() {
auto m = _replies.find(_ptr->self());
m->second.set_value(reply);
}

void vote_stm::lose() {
nit: a more descriptive name would be nicer
+1
Changed to `lose_election`, do you think it's better? (I'm not quite sure what else we can lose in a class whose purpose is to run an election.)
Also, if we're after naming things right, I'd rename half of the `vote` to `election` to clarify whether we arrange an election or merely vote in an election already organised. Say, `consensus::vote()` vs `vote_stm::vote` confused me, as the latter runs an election rather than just voting. How conservative are we regarding existing names? Is it okay to rename things if I touch the code around them?
81fdfbe to ef89f73
failure in previous build unrelated: https://redpandadata.atlassian.net/browse/CORE-5736
ef89f73 to 5e8643a
vstm has already been moved-from, it makes no sense to capture it here
will be used later to guard from concurrent elections
Use a lock held by critical part of the voting process, released when decision is made
5e8643a to 8a933e6
/backport v24.1.x
/backport v23.3.x
/backport v23.2.x
Oops! Something went wrong.
Failed to create a backport PR to v23.3.x branch. I tried:
@@ -1046,9 +1046,7 @@ void consensus::dispatch_vote(bool leadership_transfer) {
}
// background
ssx::spawn_with_gate(
_bg,
[vstm = std::move(vstm),
not just dead, but i wonder why clang-tidy didn't warn against what looks like a use-after-move (or at least a move-after-move).
https://clang.llvm.org/extra/clang-tidy/checks/bugprone/use-after-move.html#use:
An exception to this are objects of type std::unique_ptr, std::shared_ptr and std::weak_ptr, which have defined move behavior (objects of these classes are guaranteed to be empty after they have been moved from). Therefore, an object of these classes will only be considered to be used if it is dereferenced, i.e. if operator*, operator-> or operator[] (in the case of std::unique_ptr<T []>) is called on it.
oh interesting TIL
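For illustration, here is a small standard C++ sketch of the behaviour the quoted documentation relies on. It is not taken from the PR, and the function name `moved_from_unique_ptr_is_empty` is made up for this example. It only shows that a moved-from `std::unique_ptr` is guaranteed to be empty, which is why, per the quote above, touching it without dereferencing is not reported.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical helper, not part of the PR: returns true if the source
// pointer is empty after its contents have been moved out.
inline bool moved_from_unique_ptr_is_empty() {
    auto p = std::make_unique<int>(42);
    auto q = std::move(p); // ownership moves to q

    // The standard guarantees that p is null here. Comparing it with nullptr
    // is not a dereference, so according to the documentation quoted above
    // it does not count as a "use". Writing *p here would count as one, and
    // would also be undefined behaviour.
    return p == nullptr && q != nullptr && *q == 42;
}
```

Moving such a pointer a second time is therefore well defined, but it only moves an empty pointer, which matches the commit note that capturing the moved-from `vstm` makes no sense.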
Backport for "Merged Guard from concurrent elections" #21361
/backport v24.2.x
At the moment nothing prevents a raft node from being a candidate in two elections at the same time, with two `vote_stm`s actively working. This may lead to a nasty race condition where one of them succeeds (the node becomes a leader and its last heartbeat is set to max), and then the other one fails (the role becomes follower, with no change to the last heartbeat). As a result, the node never becomes a candidate again, as it is not a leader but has a max last heartbeat.
This PR mostly guards against concurrent elections with the same candidate. Some minor fixes and refactorings are included.
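The idea of the guard can be sketched in plain C++. This is only a simplified single-threaded model under stated assumptions, not the Seastar-based code in `vote_stm.cc`: the `node` and `election` types and the `try_start` and `win` functions are hypothetical, and a boolean flag stands in for the lock that the PR holds during the critical part of the voting process. Only the name `lose_election` comes from the review discussion.

```cpp
#include <cassert>
#include <optional>

enum class role { follower, candidate, leader };

// Hypothetical, simplified model of a raft node (not Redpanda's consensus class).
struct node {
    role current_role = role::follower;
    bool election_in_progress = false; // stands in for the election lock
};

// Move-only handle for one running election. The guard is released when the
// outcome is decided, or when the handle is destroyed without a decision.
class election {
public:
    // Returns std::nullopt if another election is already running on this node.
    static std::optional<election> try_start(node& n) {
        if (n.election_in_progress) {
            return std::nullopt;
        }
        n.election_in_progress = true;
        n.current_role = role::candidate;
        return election(n);
    }

    election(election&& other) noexcept : _node(other._node) {
        other._node = nullptr; // the moved-from handle no longer owns the guard
    }
    election(const election&) = delete;
    election& operator=(const election&) = delete;
    election& operator=(election&&) = delete;

    ~election() { release(); }

    // Record the outcome, then release the guard.
    void win() { decide(role::leader); }
    void lose_election() { decide(role::follower); }

private:
    explicit election(node& n) : _node(&n) {}

    void decide(role r) {
        if (_node != nullptr) {
            _node->current_role = r;
            release();
        }
    }

    void release() {
        if (_node != nullptr) {
            _node->election_in_progress = false;
            _node = nullptr;
        }
    }

    node* _node;
};
```

In this model a second `election::try_start(n)` returns `std::nullopt` while the first handle is still undecided, so a late failure of one election cannot overwrite the result of another one that has already succeeded.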
Backports Required
Release Notes