r/recovery_stm: stop recovery when follower was already updated
The recovery and replicate stms are not synchronized. When both are
active at the same time, the same batch may be delivered to the follower
twice. This duplication is generally harmless, since Raft tolerates
message redelivery, but it may cause unnecessary truncation and
increased latency.

Added a check that validates the follower's expected log end offset
right before sending the recovery append entries request. This prevents
sending the same set of batches to the follower twice.

Fixes: #14413

Signed-off-by: Michal Maslanka <michal@redpanda.com>
mmaslankaprv committed Nov 20, 2023
1 parent bf9a11d commit f10017d
Showing 1 changed file with 12 additions and 0 deletions.
12 changes: 12 additions & 0 deletions src/v/raft/recovery_stm.cc
@@ -557,6 +557,18 @@ ss::future<> recovery_stm::replicate(
         _stop_requested = true;
         return ss::now();
     }
+    if (meta.value()->expected_log_end_offset >= _last_batch_offset) {
+        vlog(
+          _ctxlog.trace,
+          "follower expected log end offset is already updated, stopping "
+          "recovery. Expected log end offset: {}, recovery range last offset: "
+          "{}",
+          meta.value()->expected_log_end_offset,
+          _last_batch_offset);
+
+        _stop_requested = true;
+        return ss::now();
+    }
     /**
      * Update follower expected log end. It is equal to the last batch in a set
      * of batches read for this recovery round.
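
The guard added in this hunk can be illustrated in isolation. The following is a minimal sketch, not the Redpanda implementation: `follower_meta`, `recovery_round`, and `should_send` are hypothetical stand-ins, and the Seastar futures and logging are left out. It assumes only what the diff shows, namely that recovery stops when the follower's expected log end offset has already reached the last offset of the batches read for this round.

```cpp
#include <cstdint>

// Hypothetical stand-in for the per-follower state tracked by the leader.
struct follower_meta {
    std::int64_t expected_log_end_offset;
};

// Hypothetical stand-in for a single recovery round.
struct recovery_round {
    std::int64_t last_batch_offset; // last offset of the batches read for this round
    bool stop_requested;

    explicit recovery_round(std::int64_t last)
      : last_batch_offset(last)
      , stop_requested(false) {}

    // Returns true if the recovery append entries request should be sent.
    // If another path (for example, the replicate stm) has already advanced
    // the follower's expected log end offset to or past this round's range,
    // sending would duplicate batches, so recovery is stopped instead.
    bool should_send(const follower_meta& meta) {
        if (meta.expected_log_end_offset >= last_batch_offset) {
            stop_requested = true;
            return false;
        }
        return true;
    }
};
```

With a round covering offsets up to 100, a follower whose expected log end offset is already 100 causes the round to stop without sending, while a follower at 99 still receives the request.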