s/disk_log_impl: don't prefix-truncate empty segments #19790
Consider the following sequence of events:

1. there is a single segment in the log, with offsets [0-9]
2. we call prefix_truncate(10)
3. concurrently, another batch of 5 messages is being appended.
4. empty segment with base_offset=10, dirty_offset=9 is created
5. the appended batch is placed at offsets 10-14

Previously, the empty segment would have passed the dirty_offset check and (after waiting for the append to finish) would get deleted (including the data at offsets 10-14). Check also the segment base_offset to prevent that.

Fixes redpanda-data#19632
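The sequence above can be sketched as a pair of predicates. This is an illustration only, not the code from the PR: `segment_offsets`, `removable_before_fix`, `removable_after_fix` and `check_scenario` are hypothetical names, and the exact form of the added base_offset comparison is an assumption based on the description.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a segment's offset tracker (not the real type).
struct segment_offsets {
    int64_t base_offset;
    int64_t dirty_offset; // equals base_offset - 1 while the segment is empty
};

// Behaviour described as "previously": only the dirty offset is compared
// against the new start offset.
constexpr bool
removable_before_fix(const segment_offsets& s, int64_t new_start) {
    return s.dirty_offset < new_start;
}

// Assumed form of the fix: the base offset must also be below the new start
// offset, so an empty segment starting exactly at new_start is kept.
constexpr bool
removable_after_fix(const segment_offsets& s, int64_t new_start) {
    return s.dirty_offset < new_start && s.base_offset < new_start;
}

// Walks through the scenario from the description: prefix_truncate(10).
inline void check_scenario() {
    constexpr int64_t new_start = 10;
    constexpr segment_offsets full{0, 9};   // step 1: offsets [0-9]
    constexpr segment_offsets empty{10, 9}; // step 4: empty segment

    // The full segment is removable under both checks.
    assert(removable_before_fix(full, new_start));
    assert(removable_after_fix(full, new_start));

    // The empty segment passed the old check (the bug) but not the new one,
    // so a batch appended at offsets 10-14 is kept.
    assert(removable_before_fix(empty, new_start));
    assert(!removable_after_fix(empty, new_start));
}
```

Calling `check_scenario()` exercises both segments; the only case where the two predicates disagree is the empty segment whose base_offset equals the truncation offset.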
Nice find!
Should this be backported? It seems a scary
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/50132#019009b0-95f8-46b8-8800-2be603695293
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/50132#019009b0-95ff-4b18-89f5-2e06688d8860
Good point, usually I'm reluctant to backport similar fixes without reported bugs, but this seems low risk.
/backport v24.1.x
/backport v23.3.x
Failed to create a backport PR to v23.3.x branch. I tried:
// base_offset check is for the case of an empty segment
// (where dirty = base - 1). We don't want to remove it because
// batches may be concurrently appended to it and we should keep them.
yikes. we probably should have had a lot more concurrency limitations in disk_log_impl. another thing that @andrwng's work on local storage v2 would address.
Backports Required
Release Notes