Fix CI failure in test_concurrent_append_flush #15271

travisdowns · 2023-12-02T01:34:59Z

In test_concurrent_append_flush, which is a fuzzer-style test,
we now get() all futures returned by flush calls during the fuzz
portion, instead of only the last flush.

In some cases prior futures can be unavailable even after the last
future has resolved, which caused occasional CI failures. See #13035
for more analysis.

Fixes #13035.
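
As a rough illustration of the pattern described above, the sketch below uses std::future and std::promise as stand-ins for the futures the test actually collects (an assumption made so the example needs only the standard library). The helper names is_ready, drain_all and earlier_future_can_lag are hypothetical, and the completion order is forced by hand rather than produced by a real appender.

```cpp
#include <chrono>
#include <cstddef>
#include <future>
#include <vector>

// True if `f` already holds a result (a stand-in for an availability check).
bool is_ready(std::future<void>& f) {
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}

// Wait on every collected future, not just the most recent one.
std::size_t drain_all(std::vector<std::future<void>>& futs) {
    std::size_t drained = 0;
    for (auto& f : futs) {
        f.get();
        ++drained;
    }
    futs.clear();
    return drained;
}

// Hypothetical scenario: three "flushes" whose results arrive out of order.
bool earlier_future_can_lag() {
    std::vector<std::promise<void>> flushes(3);
    std::vector<std::future<void>> futs;
    for (auto& p : flushes) {
        futs.push_back(p.get_future());
    }

    flushes[2].set_value(); // the last flush resolves first
    const bool last_ready = is_ready(futs[2]);
    const bool first_ready = is_ready(futs[0]);

    // Resolve the remaining flushes, then drain all of them.
    flushes[0].set_value();
    flushes[1].set_value();
    const std::size_t drained = drain_all(futs);

    return last_ready && !first_ready && drained == 3;
}
```

In this sketch, checking only the last future would report success while the first one is still pending, which is why draining the whole vector is the safer choice.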

Backports Required

Release Notes

none

Add stable_offset, flushed_offset and merged writes count to the stream appender.

run_concurrent_append_flush is a fuzzer-like test, and failures there can be hard to diagnose (e.g., see issue redpanda-data#13035). To help diagnose them, we want to capture some information from the segment_appender at each step of the test. Introduce segment_appender_info to do this.

andijcr

The only issue is when the first action is a WAIT_APPEND; the rest are nits.

andijcr · 2023-12-04T10:52:12Z

src/v/storage/tests/log_segment_appender_test.cc

-      config::mock_binding<size_t>(std::move(fallocate_size)));
-    auto appender = make_segment_appender(f, resources);
+    auto seg_file = open_file(filename);
+    storage::storage_resources resources(config::mock_binding(+fallocate_size));


nit: stray + in +fallocate_size

@andijcr - actually it's not stray, it's needed to make it an rvalue, because config::mock_binding is declared in a way that requires an rvalue argument if you want to rely on template parameter deduction, unfortunately. I do plan to fix this, but this is one workaround for now.

I do plan to fix this

To clarify, I mean mock_binding could be fixed to avoid this problem.
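
For readers unfamiliar with the unary + trick, here is a minimal sketch of the kind of declaration that behaves this way. mock_binding_sketch and bound_value are hypothetical names invented for illustration; this is not the real config::mock_binding, only an assumed shape (a constructor taking T&& on a class template) that reproduces the described behavior.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical stand-in, NOT the real config::mock_binding.
template<typename T>
class mock_binding_sketch {
public:
    // T belongs to the class template, so T&& is a plain rvalue reference
    // here, not a forwarding reference. Class template argument deduction
    // therefore only succeeds when the argument is an rvalue.
    explicit mock_binding_sketch(T&& value)
      : _value(std::move(value)) {}

    const T& operator()() const { return _value; }

private:
    T _value;
};

std::size_t bound_value(std::size_t fallocate_size) {
    // mock_binding_sketch bad(fallocate_size); // would not compile: lvalue

    // Unary + produces a prvalue copy, so T can be deduced.
    mock_binding_sketch deduced(+fallocate_size);
    static_assert(
      std::is_same_v<decltype(deduced), mock_binding_sketch<std::size_t>>);

    // Alternative without the trick: spell out T and pass a temporary.
    mock_binding_sketch<std::size_t> spelled_out(std::size_t{fallocate_size});

    return deduced() == spelled_out() ? deduced() : 0;
}
```

Under these assumptions, bound_value(4096) returns 4096, and un-commenting the lvalue line shows the deduction failure that the + works around.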

andijcr · 2023-12-04T11:08:25Z

src/v/storage/tests/log_segment_appender_test.cc

+                    vassert(false, "bad kind");
+                }();
+
+                return fmt::format("{:12}: {}", astr + extra, info.to_string());


not related but one day we should bring magic_enum into the codebase or reimplement part of the functionality

@andijcr absolutely! I've never used magic_enum specifically but I definitely feel the pain of enum boilerplate every time I create a new enum in C++.
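
For context, the boilerplate in question looks roughly like the sketch below: a hand-written switch that has to be updated whenever an enumerator is added. FLUSH and WAIT_APPEND appear in this thread; the APPEND enumerator and the action_name and describe helpers are assumptions made for illustration. Padding is done by hand to mimic the "{:12}" width in the diff without depending on fmt.

```cpp
#include <string>
#include <string_view>

// APPEND is an assumed enumerator; FLUSH and WAIT_APPEND come from the thread.
enum class action { APPEND, FLUSH, WAIT_APPEND };

// Hand-maintained name table: every new enumerator needs a new case.
constexpr std::string_view action_name(action a) {
    switch (a) {
    case action::APPEND:
        return "APPEND";
    case action::FLUSH:
        return "FLUSH";
    case action::WAIT_APPEND:
        return "WAIT_APPEND";
    }
    return "bad kind"; // reached only for an out-of-range value
}

// Left-align the name in a 12-character column, then append the details.
std::string describe(action a, std::string_view info) {
    std::string label(action_name(a));
    if (label.size() < 12) {
        label.append(12 - label.size(), ' ');
    }
    return label + ": " + std::string(info);
}
```

A reflection-style library would generate the name table automatically, which is the convenience the comment above is asking for.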

andijcr · 2023-12-04T11:25:53Z

src/v/storage/tests/log_segment_appender_test.cc

+                break;
+            case action::FLUSH:
+                futs.push_back(appender.flush());
+                // current_action.flush_future = appender.flush();


nit: stray comment?

Thanks, fixed!

src/v/storage/tests/log_segment_appender_test.cc

Relates to log_segment_appender_test::test_concurrent_append_flush, which is a fuzzer-style test: capture test context and output it when we fail. In the storage_single_thread_rpunit concurrent flush test we now log test context, which is printed if the test fails. Critically, this includes the seed used to generate the random series of actions to be performed on the appender. In addition, we generate a single seed per invocation and use that seed rather than the random helper methods, which use an unspecified random seed each time. Finally, we record more information about the operations performed in the test and output the full action sequence on failure. Issue redpanda-data#13035.
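
The single-seed approach described in this commit message can be sketched as follows. This is an illustration under assumptions, not the test's actual code: pick_seed and generate_actions are hypothetical helpers, the APPEND enumerator is assumed, and std::mt19937_64 is simply a convenient standard engine.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// APPEND is an assumed enumerator; FLUSH and WAIT_APPEND come from the thread.
enum class action { APPEND, FLUSH, WAIT_APPEND };

// Draw one seed per test invocation; logging it lets a failure be replayed.
std::uint64_t pick_seed() {
    return std::random_device{}();
}

// Derive the whole action sequence from that one seed, so the same seed
// always reproduces the same sequence on a given standard library.
std::vector<action> generate_actions(std::uint64_t seed, std::size_t count) {
    std::mt19937_64 rng(seed);
    std::uniform_int_distribution<int> pick(0, 2);

    std::vector<action> actions;
    actions.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        actions.push_back(static_cast<action>(pick(rng)));
    }
    return actions;
}
```

A failing run would print the seed alongside the action sequence; passing that seed back to generate_actions then rebuilds the same sequence for debugging.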

In test_concurrent_append_flush, which is a fuzzer-style test, we now get() all futures returned by flush calls during the fuzz portion, instead of only the last flush. In some cases prior futures can be unavailable even after the last future has resolved, which caused occasional CI failures. See redpanda-data#13035 for more analysis. Fixes redpanda-data#13035.

travisdowns · 2023-12-14T14:51:48Z

Looks like debug unit tests are timing out; I'll have a look.

vbotbuildovich · 2023-12-14T19:53:56Z

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/42799#018c699d-385a-4fa6-a991-dc7fa77034bc

vbotbuildovich · 2023-12-18T15:30:32Z

/backport v23.2.x

vbotbuildovich · 2023-12-18T15:31:29Z

Failed to create a backport PR to v23.2.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-15271-v23.2.x-789 remotes/upstream/v23.2.x
git cherry-pick -x 664115acdc30e06d0509addbd7e33c99b7162084 4e4a1e32d808fbf4a37cfd6f146fc6f8cbdfb5fb b02c28c35a05121d8a82e495c89e81a4bb36c20e cc82d0d14cfbd811de014d9c5de0200dbe3b6882

Workflow run logs.

piyushredpanda · 2023-12-18T15:36:12Z

/backport v23.3.x

travisdowns added 2 commits December 1, 2023 22:29

More fields in segment_appender ostream<<

664115a

Add stable_offset, flushed_offset and merged writes count to the stream appender.

github-actions bot added the area/redpanda label Dec 2, 2023

travisdowns requested a review from andijcr December 2, 2023 01:36

travisdowns force-pushed the td-segment-appender-flush-order branch from 935fd00 to 081b508 Compare December 2, 2023 23:35

andijcr reviewed Dec 4, 2023


andijcr previously approved these changes Dec 11, 2023


travisdowns dismissed andijcr’s stale review via 222b8df December 11, 2023 18:57

travisdowns force-pushed the td-segment-appender-flush-order branch from 081b508 to 222b8df Compare December 11, 2023 18:57

travisdowns added 2 commits December 11, 2023 15:59

travisdowns force-pushed the td-segment-appender-flush-order branch from 222b8df to cc82d0d Compare December 11, 2023 18:59

travisdowns requested a review from andijcr December 11, 2023 19:00

andijcr approved these changes Dec 11, 2023


travisdowns merged commit 3e44a5d into redpanda-data:dev Dec 18, 2023
19 checks passed

vbotbuildovich mentioned this pull request Dec 18, 2023

[v23.2.x] Fix CI failure in test_concurrent_append_flush #15727

Closed

This was referenced Dec 18, 2023

[v23.3.x] CI Failure (critical check f.available() has failed) in test_concurrent_append_flush #15728

Closed

[v23.3.x] Fix CI failure in test_concurrent_append_flush #15729

Merged
