-
Notifications
You must be signed in to change notification settings - Fork 592
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix CI failure in test_concurrent_append_flush #15271
Fix CI failure in test_concurrent_append_flush #15271
Conversation
Add stable_offset, flushed_offset and merged writes count to the stream appender.
run_concurrent_append_flush is a fuzzer-like test and we may have hard-to-diagnose failures there (e.g., see issue redpanda-data#13035) and to help diagnose it we want to capture some information from the segment_appender at each step of the test. Introduce segment_appender_info to do this.
935fd00
to
081b508
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only issue when the first action is a WAIT_APPEND, the rest are nits
config::mock_binding<size_t>(std::move(fallocate_size))); | ||
auto appender = make_segment_appender(f, resources); | ||
auto seg_file = open_file(filename); | ||
storage::storage_resources resources(config::mock_binding(+fallocate_size)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: stray + in +fallocate_size
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andijcr - actually it's not stray, it's needed to make it an rvalue, because config::mock_binding
is declared in a way that requires an rvalue argument if you want to rely on template parameter deduction, unfortunately. I do plan to fix this, but this is one workaround for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do plan to fix this
To clarify I mean mock_binding
could be fixed to avoid this problem.
vassert(false, "bad kind"); | ||
}(); | ||
|
||
return fmt::format("{:12}: {}", astr + extra, info.to_string()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not related but one day we should bring magic_enum into the codebase or reimplement part of the functionality
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andijcr absolutely! I've never used magic_enum specifically but I definitely feel the pain of enum boilerplate every time I create a new enum
in C++.
break; | ||
case action::FLUSH: | ||
futs.push_back(appender.flush()); | ||
// current_action.flush_future = appender.flush(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: stray comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed!
081b508
to
222b8df
Compare
Relates to log_segment_appender_test::test_concurrent_append_flush, which is a fuzzer-style test, and output it when we fail. In storage_single_thread_rpunit concurrent flush test we now log test context which will be printed if the test fails. Critically this includes the seem used to generate the random series of actions to be performed on the appender. In addition we generate a single seed per invocation and then use that seed rather than the random helper methods which use an unspecified random seed each time. Finally we record more information about the operations performed in test and output the full action sequence on failure. Issue redpanda-data#13035.
In test_concurrent_append_flush, which is a fuzzer style test, we now get() all futures returned by flush calls during the fuzz portion, instead of only the last flush. It is possible in some cases for prior futures to be unavailable even after the last future has resolved which caused occasional CI failures. See 13035 for more analysis. Fixes redpanda-data#13035.
222b8df
to
cc82d0d
Compare
Looks like debug unit tests are timing out, I'll have a look. |
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/42799#018c699d-385a-4fa6-a991-dc7fa77034bc |
/backport v23.2.x |
Failed to create a backport PR to v23.2.x branch. I tried:
|
/backport v23.3.x |
In test_concurrent_append_flush, which is a fuzzer style test,
we now get() all futures returned by flush calls during the fuzz
portion, instead of only the last flush.
It is possible in some cases for prior futures to be unavailable
even after the last future has resolved which caused occasional
CI failures. See 13035 for more analysis.
Fixes #13035.
Backports Required
Release Notes