
storage: Don't swallow ErrEpochAlreadyIncremented #36942

Merged
merged 1 commit into cockroachdb:master from epoch-already-incremented on Apr 19, 2019

Conversation

bdarnell
Contributor

IncrementEpoch was failing to distinguish between the request that
caused the increment and another caller making the same increment.
This is incorrect since a successful increment tells you when the
node was confirmed dead, while relying on another node's increment
leaves this uncertain.

In rare cases involving a badly overloaded cluster, this could result
in inconsistencies (non-serializable transactions) due to incorrect
timestamp cache management.

Fixes #35986

Release note (bug fix): Fix a rare inconsistency that could occur on
badly overloaded clusters.
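
As a rough illustration of the distinction described above, here is a minimal, self-contained Go sketch. It does not use CockroachDB's actual storage/liveness API; the `incrementEpoch` helper, the toy `liveness` struct, and the local sentinel error are hypothetical stand-ins for the real code paths.

```go
package main

import (
	"errors"
	"fmt"
)

// errEpochAlreadyIncremented stands in for the storage package's
// ErrEpochAlreadyIncremented; everything in this file is a simplified sketch.
var errEpochAlreadyIncremented = errors.New("epoch already incremented")

// liveness is a toy liveness record keyed by node ID.
type liveness struct {
	nodeID int
	epoch  int64
}

// incrementEpoch bumps the epoch of the record in store, but only if the
// caller's view of the record (expected) is still current. If another caller
// already performed the same increment, that is reported explicitly rather
// than being treated as this caller's success: only a successful increment
// tells the caller *when* the node was confirmed dead.
func incrementEpoch(store map[int]*liveness, expected liveness) error {
	cur, ok := store[expected.nodeID]
	if !ok {
		return fmt.Errorf("node %d not found", expected.nodeID)
	}
	if cur.epoch > expected.epoch {
		return errEpochAlreadyIncremented
	}
	cur.epoch++
	return nil
}

func main() {
	store := map[int]*liveness{1: {nodeID: 1, epoch: 5}}
	// The first caller wins the race and learns the node is dead as of now.
	fmt.Println(incrementEpoch(store, liveness{nodeID: 1, epoch: 5})) // <nil>
	// A second caller with the same (now stale) view gets the explicit error.
	fmt.Println(incrementEpoch(store, liveness{nodeID: 1, epoch: 5})) // epoch already incremented
}
```

The point of the sketch: only the caller whose own increment succeeded learns when the node was confirmed dead; a caller that lost the race gets the explicit error instead of silently inheriting the other caller's result.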

@bdarnell bdarnell requested review from tbg and a team April 18, 2019 22:12
@cockroach-teamcity
Member

This change is Reviewable

Member

@tbg tbg left a comment


While looking at this issue, I was surprised by how subtle the guarantee is that our status.Expiration becomes the base of the liveness CPut. Can you think of any assertions we could add to make sure this stays true?

At the very least, leave a comment on requestLeaseAsync to highlight the role of the status (I don't think we use much of it, so it could be tempting to refactor things there and introduce bugs by accident).

Reviewed 3 of 3 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell)


pkg/storage/replica_range_lease.go, line 311 at r1 (raw file):
The explanation of why we can't just go ahead is awkwardly broken up over these two paragraphs. Consider editing from this line on (and giving another pass for clarity to my text):

without having its epoch incremented, i.e. that it was in fact live at status.Timestamp. It would be incorrect to simply proceed to sending our lease request since our lease.Start may precede the start timestamp of the predecessor lease (i.e. likely that put into place by whoever beat us to the epoch increment), and so under this lease this node's timestamp cache would not necessarily reflect all reads served by the other leaseholder.
...
ErrEpochAlreadyIncremented is not an unusual error to observe, so we don't log it.
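
To make the handling discussed in this review thread concrete, here is a hedged caller-side sketch. It is not the real requestLeaseAsync code; `tryAcquireLease`, the toy `lease` struct, and the local sentinel error are invented for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errEpochAlreadyIncremented mirrors the sentinel error discussed above; the
// rest of this file is an invented illustration, not CockroachDB code.
var errEpochAlreadyIncremented = errors.New("epoch already incremented")

// lease is a toy lease with only the field the example needs.
type lease struct {
	start int64 // proposed start timestamp, heavily simplified
}

// tryAcquireLease first tries to increment the old leaseholder's epoch and
// only then proposes a lease starting at proposedStart. If someone else beat
// us to the increment, we must not proceed: our proposedStart may precede the
// start of whatever lease that other node installed, so under our lease the
// timestamp cache would not necessarily reflect all reads the other
// leaseholder served.
func tryAcquireLease(increment func() error, proposedStart int64) (*lease, error) {
	if err := increment(); err != nil {
		// Losing this race is not unusual, so a real caller would not log it
		// loudly, but the error is returned rather than swallowed.
		return nil, err
	}
	return &lease{start: proposedStart}, nil
}

func main() {
	// Simulate losing the epoch-increment race to another node.
	lostRace := func() error { return errEpochAlreadyIncremented }
	if _, err := tryAcquireLease(lostRace, 100); errors.Is(err, errEpochAlreadyIncremented) {
		fmt.Println("lost the increment race; not requesting a lease with this start timestamp")
	}
}
```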

@tbg
Member

tbg commented Apr 19, 2019

CI failed because of a known flake (now skipped): #36948

@bdarnell
Contributor Author

I've updated the new comment, using part of your new text (minus the parenthetical about the node that completed the IncrementEpoch having a lease: if that node had managed to get a lease, our lease request would fail because PrevLease wouldn't match). The problematic case is when the node whose epoch we're trying to increment still holds the lease, and the node that wins the race to perform the IncrementEpoch loses the race to RequestLease.

I'm not sure what specifically you're looking for in a requestLeaseAsync comment, but later I'm going to do another pass over this file while it's fresh in my mind to document everything I can.
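
A toy illustration of the PrevLease point above, assuming a conditional, compare-and-swap-style lease request; `requestLease` and the `lease` struct here are hypothetical and far simpler than CockroachDB's actual RequestLease evaluation.

```go
package main

import "fmt"

// lease is a toy lease record; holder and seq stand in for the fields a real
// PrevLease comparison would look at.
type lease struct {
	holder int
	seq    int64
}

// requestLease installs next only if the currently installed lease still
// equals prev, mimicking the check that rejects our request when the node
// that won the epoch-increment race has already taken the lease itself.
func requestLease(installed *lease, prev, next lease) error {
	if *installed != prev {
		return fmt.Errorf("lease request rejected: expected prev %+v, found %+v", prev, *installed)
	}
	*installed = next
	return nil
}

func main() {
	// Our view of the previous lease when we started the acquisition.
	prev := lease{holder: 1, seq: 7}
	// Meanwhile, the node that won the IncrementEpoch race took the lease.
	cur := lease{holder: 2, seq: 8}
	fmt.Println(requestLease(&cur, prev, lease{holder: 3, seq: 9}))
}
```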

@bdarnell
Contributor Author

bors r=tbg

craig bot pushed a commit that referenced this pull request Apr 19, 2019
36942: storage: Don't swallow ErrEpochAlreadyIncremented r=tbg a=bdarnell

@craig
Contributor

craig bot commented Apr 19, 2019

Build succeeded

@craig craig bot merged commit 8de884a into cockroachdb:master Apr 19, 2019
@tbg
Member

tbg commented Apr 19, 2019

Comment looks really good now, thanks!

@bdarnell bdarnell deleted the epoch-already-incremented branch April 21, 2019 16:30