raft: in append_entries skip batches that we already have #17895

ztlpn · 2024-04-16T18:22:18Z

This is important for the case when we already have all batches locally (possible if e.g. the request was delayed/duplicated). In this case we don't want to truncate, otherwise we might lose already committed data.
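As a rough illustration of the idea, the sketch below shows how a follower might walk the incoming batches, skip those it already has, and truncate only when it finds a real conflict. Everything here is a hypothetical simplification in plain C++ (no Seastar): `batch`, `local_log`, `match_result` and `find_mismatch` are made-up names, the local log is modelled as one term per offset, and the last offset of each batch is assumed to be derived from its record count. It is not the code in src/v/raft/consensus.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a replicated batch (not model::record_batch).
struct batch {
    int64_t term;         // term the batch was written in
    int32_t record_count; // number of records, must be >= 1
};

// Hypothetical follower log: element i holds the term of offset i.
using local_log = std::vector<int64_t>;

struct match_result {
    int64_t last_matched;        // last offset known to match the local log
    std::size_t first_unmatched; // index of the first batch that must be appended
    bool truncate;               // true only if local data conflicts with the request
};

match_result find_mismatch(
  const local_log& log,
  int64_t prev_log_index,
  const std::vector<batch>& batches) {
    const auto log_size = static_cast<int64_t>(log.size());
    int64_t last_matched = prev_log_index;
    for (std::size_t i = 0; i < batches.size(); ++i) {
        const batch& b = batches[i];
        assert(b.record_count > 0);
        // Offsets are assumed not to be assigned yet, so derive the last
        // offset of this batch from the running position.
        const int64_t last_offset = last_matched + b.record_count;
        const bool have_locally = last_offset < log_size
                                  && log[static_cast<std::size_t>(last_offset)]
                                       == b.term;
        if (!have_locally) {
            // Truncate only if something is stored after the matched prefix.
            return {last_matched, i, last_matched + 1 < log_size};
        }
        last_matched = last_offset;
    }
    // Every batch is already present: nothing to append, nothing to truncate.
    return {last_matched, batches.size(), false};
}
```

In this model, a delayed duplicate of a request whose batches are all present locally yields `truncate == false` and `first_unmatched == batches.size()`, so the local log is left untouched.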

Fixes #17731

Backports Required

Release Notes

Bug Fixes

Fix incorrect log truncations caused by delayed replication requests.

vbotbuildovich · 2024-04-16T20:23:50Z

new failures in https://buildkite.com/redpanda/redpanda/builds/47884#018ee860-7f02-47bd-90b3-3b3f37d26d1c:

"rptest.tests.cluster_bootstrap_test.ClusterBootstrapUpgrade.test_change_bootstrap_configs_after_upgrade.empty_seed_starts_cluster=False"

vbotbuildovich · 2024-04-16T20:28:26Z

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/47884#018ee860-7f09-431e-9e3a-b7eeb9d6d3f8

emaxerrno · 2024-04-17T16:25:29Z

oh this is cool.

bharathv

neat fix, mostly nits, makes sense to me.

bharathv · 2024-04-17T20:20:01Z

src/v/raft/consensus.cc

+            co_return reply;
+        }
+
+        co_return co_await do_append_entries(std::move(r));


I think the only deviation is that this do_append_entries call is now outside the try block (which is still logically the same).

ztlpn · 2024-04-17

Yeah, it is a bit weird that we printed an exception in the recursive call as a "truncation failure". OTOH it is written to not throw exceptions, so probably not a big difference.

dotnwat · 2024-04-19

Thanks. Had the same question.

bharathv · 2024-04-17T21:02:48Z

src/v/raft/consensus.cc

+              "current state: {}",
+              batch_prev_log_index,
+              last_matched,
+              meta());


nit: maybe useful to log current log offset state (lstats)

ztlpn · 2024-04-17

IMO what we have in meta() should be mostly enough (commit_index and dirty_offset are there).

bharathv · 2024-04-17T21:09:30Z

src/v/raft/consensus.cc

+        struct find_mismatch_consumer {
+            ss::future<ss::stop_iteration>
+            operator()(const model::record_batch& b) {
+                model::offset last_offset = last_matched


We have this method in record_batch:

model::offset last_offset() const { return _header.last_offset(); }

I'm curious if we can use that instead of computing it.

ztlpn · 2024-04-17

That's what I used initially :) But I forgot that normally replicated batches don't have base_offset set yet (even though recovery batches do!), so we have to calculate the offsets manually. This is actually pretty confusing; I wonder if we should add some non-serialized flag to the batch header indicating that the offsets are not set yet.

ztlpn · 2024-04-17T21:13:49Z

Ok, 300 iterations of the test_with_relaxed_acks suite passed successfully (before the fix it failed after a few dozen runs).

mmaslankaprv

Great!

ztlpn · 2024-04-18T21:00:32Z

test failure is #17847 (and some result publishing woes)

vbotbuildovich · 2024-04-18T23:03:32Z

/backport v23.3.x

vbotbuildovich · 2024-04-18T23:03:33Z

/backport v23.2.x

vbotbuildovich · 2024-04-18T23:04:23Z

Failed to create a backport PR to v23.3.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-17895-v23.3.x-741 remotes/upstream/v23.3.x
git cherry-pick -x 9753c9a30874dd24f52bc7ae4756bcfb191fb75e 93428112eb256a4daabe107489f33bc6358bfa14 2f432c23d35be188b7f0ccbe1cc00b4f6f7d653a f0c5772188dcd1c25f169e97dcf1ac802ae991be

Workflow run logs.

vbotbuildovich · 2024-04-18T23:04:34Z

Failed to create a backport PR to v23.2.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-17895-v23.2.x-330 remotes/upstream/v23.2.x
git cherry-pick -x 9753c9a30874dd24f52bc7ae4756bcfb191fb75e 93428112eb256a4daabe107489f33bc6358bfa14 2f432c23d35be188b7f0ccbe1cc00b4f6f7d653a f0c5772188dcd1c25f169e97dcf1ac802ae991be

Workflow run logs.

ztlpn added 3 commits April 16, 2024 20:19

raft: coroutinize do_append_entries

9753c9a

model: add record_batch_reader::peek_each_ref

9342811

It is similar to for_each_ref, but advances only if the consumer returns ss::stop_iteration::no. I.e. the batch where the consumer stopped remains available for reading by subsequent consumers.
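The difference described in this commit message can be sketched with a small in-memory reader. This is a hypothetical, synchronous model: `simple_reader` and its members are invented for illustration, and the real `model::record_batch_reader` is asynchronous and has a different interface. The assumption, taken from the wording above, is that `for_each_ref` consumes the item on which the consumer stops while `peek_each_ref` leaves it in place.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

enum class stop_iteration { no, yes };

// Hypothetical in-memory reader, not the real model::record_batch_reader.
template<typename T>
class simple_reader {
public:
    explicit simple_reader(std::vector<T> items)
      : _items(std::move(items)) {}

    // Advances past every item handed to the consumer, including the one
    // on which the consumer asks to stop.
    template<typename Consumer>
    void for_each_ref(Consumer&& consumer) {
        while (_pos < _items.size()) {
            const T& item = _items[_pos++];
            if (consumer(item) == stop_iteration::yes) {
                return;
            }
        }
    }

    // Advances only past items for which the consumer returned `no`;
    // the item it stopped on stays available for the next consumer.
    template<typename Consumer>
    void peek_each_ref(Consumer&& consumer) {
        while (_pos < _items.size()) {
            if (consumer(_items[_pos]) == stop_iteration::yes) {
                return;
            }
            ++_pos;
        }
    }

    std::size_t remaining() const { return _items.size() - _pos; }

private:
    std::vector<T> _items;
    std::size_t _pos = 0;
};
```

A first consumer can therefore use `peek_each_ref` to skip a prefix of items, and a second consumer still sees the item where the first one stopped.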

raft: get rid of config_extracting_reader

2f432c2

Extract configurations using a wrapping batch consumer instead.
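One way to picture a wrapping batch consumer is a small adapter that inspects each batch, records the configuration ones, and forwards every batch to the wrapped consumer unchanged. The sketch below is an assumption-laden illustration: `batch`, `batch_type` and `config_extracting_consumer` are hypothetical names, the payload is a plain string, and the real consumer in the PR is asynchronous.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplified batch; the real type is model::record_batch.
enum class batch_type { data, raft_configuration };

struct batch {
    batch_type type;
    std::string payload;
};

// Hypothetical wrapper: collects configuration payloads as a side effect
// and delegates every batch to the inner consumer.
template<typename Inner>
class config_extracting_consumer {
public:
    config_extracting_consumer(Inner inner, std::vector<std::string>& configs)
      : _inner(std::move(inner))
      , _configs(configs) {}

    auto operator()(const batch& b) {
        if (b.type == batch_type::raft_configuration) {
            _configs.push_back(b.payload);
        }
        // The inner consumer decides what is returned (e.g. whether to stop).
        return _inner(b);
    }

private:
    Inner _inner;
    std::vector<std::string>& _configs; // owned by the caller
};
```

Compared with a dedicated reader type, a wrapper like this composes with any reader method that accepts a consumer, which is presumably why a separate reader is no longer needed.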

ztlpn requested review from bharathv and mmaslankaprv April 16, 2024 18:22

github-actions bot added the area/redpanda label Apr 16, 2024

ztlpn force-pushed the raft-fix-delayed-requests branch from 09b8df7 to 49d13fd Compare April 17, 2024 13:04

ztlpn added this to the 24.1 milestone Apr 17, 2024

bharathv reviewed Apr 17, 2024

mmaslankaprv previously approved these changes Apr 18, 2024

ztlpn dismissed mmaslankaprv’s stale review via f0c5772 April 18, 2024 15:01

ztlpn force-pushed the raft-fix-delayed-requests branch from 49d13fd to f0c5772 Compare April 18, 2024 15:01

ztlpn requested review from bharathv and mmaslankaprv April 18, 2024 15:12

bharathv approved these changes Apr 18, 2024

ztlpn merged commit ff870e1 into redpanda-data:dev Apr 18, 2024
17 checks passed

vbotbuildovich mentioned this pull request Apr 18, 2024

[v23.3.x] raft: in append_entries skip batches that we already have #17957

Closed

vbotbuildovich mentioned this pull request Apr 18, 2024

[v23.2.x] raft: in append_entries skip batches that we already have #17958

Closed

ztlpn deleted the raft-fix-delayed-requests branch April 18, 2024 23:57

ztlpn mentioned this pull request May 16, 2024

[v23.3.x] raft: in append_entries skip batches that we already have #18523

Merged
