Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raft: don't panic when looking for term conflicts #36

Merged
merged 6 commits into from
Mar 15, 2023

Conversation

pav-kv
Copy link
Contributor

@pav-kv pav-kv commented Mar 14, 2023

This commit fixes a hypothetical panic that may occur when a stale MsgApp
message arrives to a follower. The conflict searching algorithm in
findConflictByTerm may return a log index which is not present in the log, and
thus the raftLog.term() method may return an error. It is safe to ignore this
error and send MsgAppResp with the found index and a zero LogTerm.

This commit also restores the behaviour that existed before commit d0fb0cd.
Back then, the term() function would silently return 0 instead of an error, and
a zero LogTerm would be sent with the rejecting MsgAppResp. After that commit,
there is a new possible panic. We remove the possibility of this panic here.

@pav-kv pav-kv force-pushed the protect-from-panic branch 2 times, most recently from b3d210c to d0a5d0d Compare March 14, 2023 14:48
@pav-kv pav-kv marked this pull request as ready for review March 14, 2023 14:48
@pav-kv
Copy link
Contributor Author

pav-kv commented Mar 14, 2023

@tbg PTAL

@tbg tbg self-requested a review March 14, 2023 14:56
Copy link
Collaborator

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks! I left some suggestions to a have fewer "unexpected" return values and no case in which the term is not populated. The goal would be to remove all edge cases and make it more transparent what happens in the truncation scenarios, etc.

It might also be good to add some targeted unit testing for findConflictByTerm. I saw that you added a test case in TestFastLogRejection, but a few more wouldn't hurt. I should have added that right when we introduced findConflictByTerm, sorry about that.

Approving preemptively because the current code is "good enough" and I'm fairly certain it's correct (mod comments). The improvements would make it easier to be convinced that it's correct in a month or two but I'll rely on your judgment of which further improvements are worth it.

raft.go Show resolved Hide resolved
raft_test.go Show resolved Hide resolved
log.go Outdated Show resolved Hide resolved
log.go Outdated Show resolved Hide resolved
log.go Outdated Show resolved Hide resolved
raft.go Outdated Show resolved Hide resolved
raft.go Outdated Show resolved Hide resolved
raft.go Outdated Show resolved Hide resolved
@pav-kv pav-kv force-pushed the protect-from-panic branch 2 times, most recently from 685e800 to bdae85a Compare March 15, 2023 20:30
Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
This commit fixes a hypothetical panic that may occur when a stale MsgApp
message arrives to a follower. The conflict searching algorithm in
findConflictByTerm may return a log index which is not present in the log, and
thus the raftLog.term() method may return an error. It is safe to ignore this
error and send MsgAppResp with the found index and a zero LogTerm.

This commit also restores the behaviour that existed before commit d0fb0cd.
Back then, the term() function would silently return 0 instead of an error, and
a zero LogTerm would be sent with the rejecting MsgAppResp. After that commit,
there is a new possible panic. We remove the possibility of this panic here.

Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
@pav-kv
Copy link
Contributor Author

pav-kv commented Mar 15, 2023

@tbg Addressed comments, and added unit tests for findConflictByTerm.

@pav-kv
Copy link
Contributor Author

pav-kv commented Mar 15, 2023

CC @ahrtr: This removes a panic condition. Please merge since this is now approved, or feel free to review too.

@tbg I modified this PR after approval, but it's equivalent. Feel free to take a final look though, and please merge if you see this earlier.

Thank you.

@tbg tbg merged commit 5fe1c31 into etcd-io:main Mar 15, 2023
@pav-kv pav-kv deleted the protect-from-panic branch March 15, 2023 22:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants