[v23.1.x] Backport of #9380 #8750 #10923 #11164 #11350 #11838 #11905 #11840 #11691 #11726 #10860 #12073 #12075

mmaslankaprv · 2023-07-13T09:18:59Z

Backport of PRs:

Fixes: #10610
Fixes: #10611
Fixes: #11014
Fixes: #11280
Fixes: #11405
Fixes: #11967
Fixes: #11975
Fixes: #11977
Fixes: #11979
Fixes: #12011
Fixes: #12073
Fixes: #11888

Backports Required

piyushredpanda · 2023-07-18T14:52:55Z

/ci-repeat 1

mmaslankaprv · 2023-07-19T05:39:37Z

/ci-repeat 1

ztlpn · 2023-07-19T13:14:35Z

hmm, also contains commits from #11597 but looks like it is already backported?

mmaslankaprv · 2023-07-19T13:21:55Z

ci failure: #12310

mmaslankaprv · 2023-07-19T14:08:04Z

The failure is not a regression

src/v/cluster/metadata_dissemination_types.h

src/v/cluster/node_status_backend.cc

src/v/cluster/controller.h

src/v/redpanda/admin_server.cc

src/v/kafka/server/replicated_partition.h

src/v/raft/consensus.cc

Followup from redpanda-data#10810, this moves values that were default constructed when reading json. Signed-off-by: Tyler Rockwood <rockwood@redpanda.com> (cherry picked from commit 5906483)

Added Admin API retries on timeout to nodes decommissioning test. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit d59b9a7)

Since all decommissioning tests rely on the assumption that the decommissioning operation succeeded, added safe versions of the decommission/recommission operations that retry when required. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit f842a90)

When a configuration is applied, updates are gated by the `_last_seen_version` of the configuration frontend. Previously the version was updated only after the configuration update had been applied to the `configuration_manager`, which could lead to the configuration being overridden by a previous update. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 7771db9)
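The gating described above can be illustrated with a minimal sketch. The `config_update` and `gated_config` types are hypothetical stand-ins, not Redpanda classes, and the sketch is synchronous, so the ordering only documents the intent: in asynchronous code, recording the version before applying is what keeps an older update from passing the gate while a newer one is still being applied.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for a versioned configuration update.
struct config_update {
    int64_t version;
    std::string key;
    std::string value;
};

// Illustrative only: this is not the Redpanda configuration manager.
class gated_config {
public:
    // Returns false for stale or duplicate updates.
    bool apply(const config_update& u) {
        if (u.version <= _last_seen_version) {
            return false;
        }
        // Record the version first, so that an older update observed while
        // this one is still being applied cannot pass the gate.
        _last_seen_version = u.version;
        _values[u.key] = u.value;
        return true;
    }

    // Returns the stored value, or an empty string if the key is unknown.
    std::string get(const std::string& key) const {
        auto it = _values.find(key);
        return it == _values.end() ? std::string{} : it->second;
    }

private:
    int64_t _last_seen_version = -1;
    std::map<std::string, std::string> _values;
};
```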

Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 53cf8dc)

Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit e7ccef2)

Made documentation clear about what is being returned for each API and matched the declared returned type with the one that actually is being returned. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 52722e3)

Sometimes the controller log dirty offset may be helpful to understand the gap between what is known to be committed and what is available in the log. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit ed47391)

The controller erasure test is supposed to validate whether there is a mismatch between the last appended entry in kvstore and the controller max offset. For the test to work correctly we must wait for all the messages to be committed, as we only delete the last segment, which contains a single message (the new replicated configuration). To make the test reliable, changed the condition to wait until the applied offset on the node whose controller log is going to be removed equals the leader dirty offset. Fixes: redpanda-data#8217 Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 57fb4c0)

In Kafka, `log_end_offset` is defined as the next offset that will be assigned to a record produced to a given partition. When the local log is empty its `dirty_offset` is equal to `start_offset - 1`. In this case `log_end_offset` should return the Kafka offset corresponding to `start_offset`, as this is the next offset that will be assigned to a record produced to the given topic partition. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 1d682b9)
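A minimal sketch of the rule above, under stated assumptions: `log_offsets` and the `translator` callback are hypothetical names used only for illustration, and the translator is assumed to map a log offset to its Kafka offset.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical simplified view of the local log offsets.
struct log_offsets {
    int64_t start_offset; // first offset available in the log
    int64_t dirty_offset; // last appended offset; start_offset - 1 when empty
};

// Hypothetical translator from log offsets to Kafka offsets.
using translator = std::function<int64_t(int64_t)>;

// Next Kafka offset that would be assigned to a produced record.
int64_t log_end_offset(const log_offsets& o, const translator& to_kafka) {
    if (o.dirty_offset < o.start_offset) {
        // Empty log: the next produced record is assigned the start offset.
        return to_kafka(o.start_offset);
    }
    // Non-empty log: one past the last appended record.
    return to_kafka(o.dirty_offset) + 1;
}
```

With a translator that subtracts three non-data batches, an empty log starting at log offset 10 reports a log end offset of 7, while a log with dirty offset 14 reports 12.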

Added test that validates if a consumer is able to continue consuming a log that has been completely removed by delete retention. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 89a8d97)

When forcibly aborting reconfiguration we should wait for the new leader to be elected in the configuration that the partition was forced to. This way we can be certain that the new configuration will finally be replicated to the majority of nodes even though the leader may not exist at the time when the configuration is replicated. Fixes: redpanda-data#9243 Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 3948312)

Change the `raft::state_machine::apply()` to always read only committed entries. Previously it might happen that with `acks=1` some of the entries that were not yet committed were applied to the stm. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 1aaf98f)
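As a rough illustration of the bound described above (not the actual `raft::state_machine` code), the hypothetical helper below selects entries for the state machine and caps the read at the committed offset rather than at the end of the log. The `entry` type and `entries_to_apply` name are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical log entry used only for this sketch.
struct entry {
    int64_t offset;
    std::string payload;
};

// Returns the entries in the range (last_applied, committed_offset].
// Entries past committed_offset may exist in the log (for example after an
// acks=1 produce) but are never handed to the state machine.
std::vector<entry> entries_to_apply(
  const std::vector<entry>& log,
  int64_t last_applied,
  int64_t committed_offset) {
    std::vector<entry> out;
    for (const auto& e : log) {
        if (e.offset > last_applied && e.offset <= committed_offset) {
            out.push_back(e);
        }
    }
    return out;
}
```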

…pshot` Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit a8e1a59)

Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 4daab41)

Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit baabe8f)

If the `last_applied` offset persisted in kvstore is greater than the log dirty offset, it indicates an inconsistency. The inconsistency may be the result of intentional removal of log segments from the Redpanda data directory. In this scenario the `last_applied` offset must be removed from kvstore to prevent it from updating the committed offset, which may result in uncommitted batches being applied to the stm. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 8989a1a)
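The check above can be sketched as a small pure function. The name `validated_last_applied` is hypothetical, and returning `std::nullopt` stands in for removing the key from kvstore.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Returns the persisted last_applied offset if it is consistent with the
// log, or std::nullopt if it must be discarded because it points past the
// log dirty offset (for example after log segments were removed by hand).
std::optional<int64_t> validated_last_applied(
  std::optional<int64_t> persisted, int64_t log_dirty_offset) {
    if (persisted.has_value() && *persisted > log_dirty_offset) {
        return std::nullopt;
    }
    return persisted;
}
```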

Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 114daef)

Deltas were previously stored as a vector per `ntp`. The deltas access pattern (iteration, inserting and popping elements at the front and at the back) makes them a perfect candidate for `std::deque` usage. The `std::deque` doesn't use large contiguous allocations, so it will not contribute to memory fragmentation. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit da0958c)
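The access pattern mentioned above maps naturally onto `std::deque`, as the sketch below suggests. The `delta` struct and `drop_processed` helper are hypothetical simplifications, not the controller backend types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical, heavily simplified delta record.
struct delta {
    int64_t offset;
};

using deltas_t = std::deque<delta>;

// Pops every delta with offset <= up_to from the front of the queue and
// returns how many were removed. pop_front and push_back are both O(1) on
// std::deque, and its storage is split into blocks rather than one large
// contiguous buffer.
std::size_t drop_processed(deltas_t& deltas, int64_t up_to) {
    std::size_t dropped = 0;
    while (!deltas.empty() && deltas.front().offset <= up_to) {
        deltas.pop_front();
        ++dropped;
    }
    return dropped;
}
```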

Now using `ss::chunked_fifo` to return deltas processed by the controller backend. The previously used `std::vector` could lead to large allocations, as it allocates large chunks of contiguous memory. Fixes: redpanda-data#11673 Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit fddcb30)

Made entries indicating receiving append entries and vote request more obvious. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 4ff810b)

Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 7d68e46)

When a voter receives a vote request and votes for the candidate, it updates the last heartbeat timeout. If this happens during the prevote phase in a deployment with an even number of nodes, it may lead to a temporary livelock in which no leader can be elected. Fixes: redpanda-data#11657 Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 995c775)

Added test verifying if a controller is elected in timely fashion when some of the cluster nodes are down. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit c6555b6)

When a linearizable barrier is requested we want the follower to flush its log to make sure that all possible entries are committed with traditional raft semantics. Added handling of log flushing on the follower when the leader requests it and the append entries request is empty. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 9cd72e5)

When linearizable barrier is set we want to move committed offset forward. In this case followers must flush their offsets to allow leader committing its entries. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit f936b6d)

For the STM linearizable barrier to make sense we must wait for the offset to be applied to the stm. Otherwise the linearizable barrier gives no guarantees about the state machine state. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 3bcf038)

To prevent contention, implemented sharing of the linearizable barrier result between contended callers. Instead of calling the linearizable barrier multiple times, a caller will wait for the result of a barrier that is already being executed. This doesn't change the current semantics of the linearizable barrier, as either way a caller must check the returned offset if they want to wait for the whole history to be applied. Sharing results helps in a situation where multiple parallel fibers try to set up a linearizable barrier. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 7cb919c)
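The result sharing described above can be sketched in a single-threaded, callback-based form. This is an assumption-laden illustration: `shared_barrier`, `request` and `complete` are hypothetical names, and the real code runs on Seastar fibers rather than plain callbacks.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Coalesces concurrent barrier requests: the underlying barrier is started
// once, and every caller that arrives while it is in flight receives the
// same resulting offset.
class shared_barrier {
public:
    using callback = std::function<void(int64_t)>;

    // `start` kicks off the (hypothetical) underlying barrier operation.
    explicit shared_barrier(std::function<void()> start)
      : _start(std::move(start)) {}

    // Registers a waiter; starts the barrier only if none is in flight.
    void request(callback cb) {
        const bool idle = _waiters.empty();
        _waiters.push_back(std::move(cb));
        if (idle) {
            _start();
        }
    }

    // Called when the underlying barrier finishes with `offset`.
    void complete(int64_t offset) {
        auto waiters = std::exchange(_waiters, {});
        for (auto& w : waiters) {
            w(offset);
        }
    }

private:
    std::function<void()> _start;
    std::vector<callback> _waiters;
};
```

Each waiter still has to compare the delivered offset with the offset it needs, since a shared result may have been requested before the waiter's own writes.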

Upstream Kafka validates that the fetch offset lies between the start offset and the log end offset. Fixed the validation in Redpanda, which was checking the end of the range against the high watermark. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit bd73f10)
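A minimal sketch of the range check described above. The `fetch_error` enum and `validate_fetch_offset` function are hypothetical names for illustration and do not mirror the Redpanda fetch handler.

```cpp
#include <cassert>
#include <cstdint>

enum class fetch_error { none, offset_out_of_range };

// A fetch offset is valid when start_offset <= fetch_offset <= log_end_offset.
// The upper bound is the log end offset, not the high watermark.
constexpr fetch_error validate_fetch_offset(
  int64_t fetch_offset, int64_t start_offset, int64_t log_end_offset) {
    if (fetch_offset < start_offset || fetch_offset > log_end_offset) {
        return fetch_error::offset_out_of_range;
    }
    return fetch_error::none;
}
```

For example, with a high watermark of 5 and a log end offset of 8, a fetch at offset 7 is in range under this rule, whereas bounding the range by the high watermark would reject it.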

- This change reverts a previous attempt to fix rm_stm reporting invalid_lso (in change 20285df), which set `bootstrap_committed_offset` in apply_snapshot() to fix the case where invalid_lso is returned indefinitely on a node that had just restarted and applied a snapshot, but had no data produced onto it.
- That change was incorrect, however: the bootstrap committed offset is expected to be the value obtained from a complete read of the log, and the offset in the rm_stm snapshot at the time it is read at startup does not necessarily reflect this.
- The solution is to wait at startup until the consensus layer has modified the committed offset.

(cherry picked from commit 88cfa56)

andrewhsu · 2023-08-11T21:27:07Z

fyi i updated the PR description to remove the Release Notes section since this is a backport PR so rpchangelog will use the referenced PRs in the list to generate release notes lines.

github-actions bot added the area/redpanda label Jul 13, 2023

mmaslankaprv force-pushed the v23.1.x-backports branch from b76f440 to e91ddd4 Compare July 13, 2023 12:33

mmaslankaprv changed the title ~~[v23.1.x] Backport~~ [v23.1.x] Backport of #9380 #8750 #10923 #11164 #11350 #11838 #11905 #11840 #11691 #11726 #10860 Jul 13, 2023

mmaslankaprv requested a review from dotnwat July 13, 2023 14:12

mmaslankaprv force-pushed the v23.1.x-backports branch 2 times, most recently from 3b4409d to 35b996b Compare July 14, 2023 05:54

mmaslankaprv requested a review from bharathv July 14, 2023 05:54

mmaslankaprv changed the title ~~[v23.1.x] Backport of #9380 #8750 #10923 #11164 #11350 #11838 #11905 #11840 #11691 #11726 #10860~~ [v23.1.x] Backport of #9380 #8750 #10923 #11164 #11350 #11838 #11905 #11840 #11691 #11726 #10860 #12073 Jul 17, 2023

BenPope added the kind/backport PRs targeting a stable branch label Jul 18, 2023

BenPope added this to the v23.1.x-next milestone Jul 18, 2023