feat(storage): handle inconsistency between data and commit in ReadRecoveryPoints #545
Codecov Report

Coverage diff, main vs. #545:

            main     #545     +/-
Coverage    61.19%   61.40%   +0.20%
Files       144      144
Lines       19213    19270    +57
Hits        11757    11832    +75
Misses      6867     6851     -16
Partials    589      587      -2
return nil, nil, err
last = s.getLastLogSequenceNumber(cit, dit, first)
if last == nil {
	s.logger.Warn("the last must exist but could not be found.", zap.Stringer("first", first))
Is this a case that could actually happen? If it happened, rp.CommittedLogEntry.Last [L33] would be nil, which may be taken to mean that the replica has been trimmed [internal/storagenode/logstream/executor.go:L650]. Are there any side effects?
This cannot happen because the first log entry already exists. If only a single log entry exists, the first log entry is also the last. Therefore, this if block is unnecessary; it was added only to help debugging in weird situations.
Anomalies like this one, including the case handled by this if block, can be fixed through synchronization. But as you said, we need to know whether this replica has been trimmed. I think the global low watermark can be helpful.
@hungryjang, I will recheck this PR since it can be affected by #492.
@hungryjang, can you take a look at this PR? I will close this issue with this PR and open a follow-up PR if a new issue arises. Since a long time has passed, I can't remember the other issues related to this. 🥲
What this PR does
Previously, the method internal/(*storage).ReadRecoveryPoints returned an error when it found an inconsistency between data and commit for log entries, that is, when there was no data for a commit. This should never have happened while storage used a unified database, so the method treated it as an error. However, it can happen when storage uses separate databases and the WAL sync option is turned off; for instance, the data for a log entry can be lost even though the commit for that log entry has been synced.
This PR makes ReadRecoveryPoints stop returning an error when it finds an inconsistency between data and commit for a log entry. Instead, it tries to find the first and last log entries that have no inconsistency. If no valid first and last entries exist, it returns nil for both and leaves them to be resolved through synchronization.
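The search described above can be illustrated with a minimal sketch. It is not the code in this PR: the `LSN` type, the `findConsistentBounds` function, and the in-memory slice and map standing in for the commit and data databases are all hypothetical simplifications. It only shows the idea of skipping commits whose data is missing and returning nil when no consistent entry exists.

```go
package main

import "fmt"

// LSN is a hypothetical stand-in for a log entry's sequence numbers.
type LSN struct {
	LLSN uint64 // local log sequence number (assumed key of the data store)
	GLSN uint64 // global log sequence number (assumed key of the commit store)
}

// findConsistentBounds returns the first and last committed entries whose
// data also exists. commits must be sorted by LLSN in ascending order, and
// data is the set of LLSNs that still have data. Both are simplified
// in-memory stand-ins for the commit and data databases.
func findConsistentBounds(commits []LSN, data map[uint64]bool) (first, last *LSN) {
	// Scan forward for the first commit that has matching data.
	for i := range commits {
		if data[commits[i].LLSN] {
			first = &commits[i]
			break
		}
	}
	if first == nil {
		// No consistent entry at all: leave both to synchronization.
		return nil, nil
	}
	// Scan backward for the last commit that has matching data. Because
	// first exists, this loop always finds an entry (possibly first itself).
	for i := len(commits) - 1; i >= 0; i-- {
		if data[commits[i].LLSN] {
			last = &commits[i]
			break
		}
	}
	return first, last
}

func main() {
	commits := []LSN{{1, 10}, {2, 11}, {3, 12}, {4, 13}}

	// Data for LLSN 1 and 4 was lost although their commits were synced.
	first, last := findConsistentBounds(commits, map[uint64]bool{2: true, 3: true})
	fmt.Println(first.LLSN, last.LLSN) // prints "2 3"

	// All data was lost: both bounds are nil.
	first, last = findConsistentBounds(commits, map[uint64]bool{})
	fmt.Println(first == nil, last == nil) // prints "true true"
}
```

In this sketch, the backward scan cannot come back empty once a first entry has been found, which mirrors the review discussion: a nil last after a non-nil first would indicate a bug rather than a recoverable inconsistency.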