Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raft: in append_entries skip batches that we already have #17895

Merged
merged 4 commits into from
Apr 18, 2024

Conversation

ztlpn
Copy link
Contributor

@ztlpn ztlpn commented Apr 16, 2024

This is important for the case when we already have all batches locally (possible if e.g. the request was delayed/duplicated). In this case we don't want to truncate, otherwise we might lose already committed data.

Fixes #17731

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.3.x
  • v23.2.x

Release Notes

Bug Fixes

  • Fix incorrect log truncations caused by delayed replication requests.

ztlpn added 3 commits April 16, 2024 20:19
It is similar to for_each_ref, but advances only if the consumer returns
ss::stop_iteration::no. I.e. the batch where the consumer stopped remains
available for reading by subsequent consumers.
Extract configurations using a wrapping batch consumer instead.
@vbotbuildovich
Copy link
Collaborator

new failures in https://buildkite.com/redpanda/redpanda/builds/47884#018ee860-7f02-47bd-90b3-3b3f37d26d1c:

"rptest.tests.cluster_bootstrap_test.ClusterBootstrapUpgrade.test_change_bootstrap_configs_after_upgrade.empty_seed_starts_cluster=False"

@vbotbuildovich
Copy link
Collaborator

@ztlpn ztlpn force-pushed the raft-fix-delayed-requests branch from 09b8df7 to 49d13fd Compare April 17, 2024 13:04
@ztlpn ztlpn added this to the 24.1 milestone Apr 17, 2024
@emaxerrno
Copy link
Contributor

oh this is cool.

Copy link
Contributor

@bharathv bharathv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neat fix, mostly nits, makes sense to me.

co_return reply;
}

co_return co_await do_append_entries(std::move(r));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only deviation is this do_append_entries is out of the try block now (which is still logically same).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it is a bit weird that we printed an exception in the recursive call as a "truncation failure". OTOH it is written to not throw exceptions, so probably not a big difference.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Had the same question

"current state: {}",
batch_prev_log_index,
last_matched,
meta());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe useful to log current log offset state (lstats)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO what we have in meta() should be mostly enough (commit_index and dirty_offset are there)

struct find_mismatch_consumer {
ss::future<ss::stop_iteration>
operator()(const model::record_batch& b) {
model::offset last_offset = last_matched
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have this method in record_batch()

model::offset last_offset() const { return _header.last_offset(); 

I"m curious if we can use that instead of computing it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I used initially :) But I forgot that normally replicated batches don't have base_offset set yet (even though recovery batches do!) and we have to calculate the offsets manually. This is actually pretty confusing, I wonder if we should add some non-serialized flag to the batch header indicating that the offsets are still not set.

src/v/raft/consensus.cc Outdated Show resolved Hide resolved
src/v/raft/consensus.cc Outdated Show resolved Hide resolved
@ztlpn
Copy link
Contributor Author

ztlpn commented Apr 17, 2024

Ok, 300 iterations of the test_with_relaxed_acks suite passed successfully (before the fix it failed after a few dozen runs).

mmaslankaprv
mmaslankaprv previously approved these changes Apr 18, 2024
Copy link
Member

@mmaslankaprv mmaslankaprv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great !

This is important for the case when we already have _all_ batches locally
(possible if e.g. the request was delayed/duplicated). In this case we don't
want to truncate, otherwise we might lose already committed data.

Fixes redpanda-data#17731
@ztlpn
Copy link
Contributor Author

ztlpn commented Apr 18, 2024

test failure is #17847 (and some result publishing woes)

@ztlpn ztlpn merged commit ff870e1 into redpanda-data:dev Apr 18, 2024
17 checks passed
@vbotbuildovich
Copy link
Collaborator

/backport v23.3.x

@vbotbuildovich
Copy link
Collaborator

/backport v23.2.x

@vbotbuildovich
Copy link
Collaborator

Failed to create a backport PR to v23.3.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-17895-v23.3.x-741 remotes/upstream/v23.3.x
git cherry-pick -x 9753c9a30874dd24f52bc7ae4756bcfb191fb75e 93428112eb256a4daabe107489f33bc6358bfa14 2f432c23d35be188b7f0ccbe1cc00b4f6f7d653a f0c5772188dcd1c25f169e97dcf1ac802ae991be

Workflow run logs.

@vbotbuildovich
Copy link
Collaborator

Failed to create a backport PR to v23.2.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-17895-v23.2.x-330 remotes/upstream/v23.2.x
git cherry-pick -x 9753c9a30874dd24f52bc7ae4756bcfb191fb75e 93428112eb256a4daabe107489f33bc6358bfa14 2f432c23d35be188b7f0ccbe1cc00b4f6f7d653a f0c5772188dcd1c25f169e97dcf1ac802ae991be

Workflow run logs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CI Failure (Assert failure) in gtest_raft_rpunit.test_with_relaxed_acks
6 participants