storage: return error from split/merge lock acquisition #19448
Reviewed 1 of 1 files at r1, 1 of 1 files at r2. pkg/storage/store.go, line 3878 at r1 (raw file):
Comments from Reviewable |
Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks pending. pkg/storage/store.go, line 3878 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Note that the Comments from Reviewable |
Reviewed 1 of 1 files at r1. pkg/storage/replica.go, line 4536 at r2 (raw file):
Shouldn't you return the error to the caller in order to force the Raft command to error? The code in pkg/storage/store.go, line 3878 at r1 (raw file):
Was this addition motivated by a real concern?
Also report the error to sentry because we suspect it to have caused the bug referenced below in versions of CockroachDB not running with this commit. See cockroachdb#19172.
Review status: 0 of 3 files reviewed at latest revision, 3 unresolved discussions. pkg/storage/replica.go, line 4536 at r2 (raw file): Previously, petermattis (Peter Mattis) wrote…
Good point. I definitely want to learn about the errors out there in the wild though, so I added reporting to sentry. PTAL! pkg/storage/store.go, line 3878 at r1 (raw file): Previously, petermattis (Peter Mattis) wrote…
No, just paranoia given the recent deadlock in TestRefreshPendingCmds. You're right that the abundance of lock acquisition makes this too paranoid. Removed the commit.
Reviewed 2 of 4 files at r3.
LGTM, but the PR title is incorrect now.
Review status: 2 of 3 files reviewed at latest revision, 1 unresolved discussion, some commit checks pending.
TFTRs, everyone! @nvanbenschoten changed the title.
I stumbled upon this PR while cleaning up #38954 and am now confused about why setting |
If this operation fails then we can't simply return an error and not apply the command. We need to be deterministic at this point. The only option we have is to fatal. See cockroachdb#19448 (comment). The commit also removes the crash reporting added here, which doesn't appear to have ever fired after two years. Release note: None
Not sure what we were thinking there, |
39158: storage: clean up below-Raft Split and Merge locking r=bdarnell a=nvanbenschoten

This PR contains three related cleanups that I stumbled into while cleaning up raft entry application in preparation for #38954.

The first cleanup is that we avoid improperly handling errors that may be returned when acquiring the split/merge locks. If this operation fails then we can't simply return an error and not apply the command. We need to be deterministic at this point. The only option we have is to fatal. See #19448 (comment).

The second cleanup is that we then remove stale code that was attempting to recover from failed split and merge application. This error handling only made sense in a pre-proposer evaluated KV world.

The third commit addresses a TODO to assert that the RHS range during a split is not initialized. The TODO looks like it went in before proposer evaluated KV, which protects us against both replays and reproposals.

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
See #19172.