-
Notifications
You must be signed in to change notification settings - Fork 218
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
matchIndex lower bound for nextIndex #5630
Conversation
node->second.sent_idx = std::max(this_match, node->second.match_idx + 1); As expected, |
mku-pr5341impl@75575 aka 20230907.21 vs main ewma over 20 builds from 75234 to 75570 Click to see tablemain
mku-pr5341impl
|
node->second.sent_idx = std::max(this_match, node->second.match_idx); This fails the The test failure appears to be a liveness issue s.t. n[0] fails to repair its log (added custom logging to raft.h and several
Hypothesis: The NACK should include the prevLogIndex from the AE request instead of n[0]'s highest possible match. |
(slightly modified logging)
Since we send optimistically-late AEs, we may receive later-than-precise NACKs, and shouldn't advance |
There are many Safe assignments we can make to As a thought experiment, consider a primary trying to catch up a follower within the same term, where each AE contains only a single entry.
We are able to test some liveness scenarios with the scenario driver, but we don't simulate message duplication or re-ordering there. So I don't think we can create a test that proves the benefit of this new lower-bound, but believe it is worth adding. |
This change uses both bounds, and passes the tests I've tried:
I also understand some of the confusion around the
Our spec also uses So
That differs from
Gluing that all together, I think that the spec change to match my proposed implementation assignment above, would be:
|
The slightly improved |
725f932
to
64fb336
Compare
64fb336
to
bd25e40
Compare
#5341