[v23.1.x] Backport of #9380 #8750 #10923 #11164 #11350 #11838 #11905 #11840 #11691 #11726 #10860 #12073 #12075
Merged: mmaslankaprv merged 38 commits into redpanda-data:v23.1.x from mmaslankaprv:v23.1.x-backports on Jul 20, 2023
mmaslankaprv force-pushed the v23.1.x-backports branch from b76f440 to e91ddd4 on July 13, 2023 12:33
mmaslankaprv changed the title from "[v23.1.x] Backport" to "[v23.1.x] Backport of #9380 #8750 #10923 #11164 #11350 #11838 #11905 #11840 #11691 #11726 #10860" on Jul 13, 2023
mmaslankaprv force-pushed the v23.1.x-backports branch 2 times, most recently from 3b4409d to 35b996b on July 14, 2023 05:54
mmaslankaprv changed the title from "[v23.1.x] Backport of #9380 #8750 #10923 #11164 #11350 #11838 #11905 #11840 #11691 #11726 #10860" to "[v23.1.x] Backport of #9380 #8750 #10923 #11164 #11350 #11838 #11905 #11840 #11691 #11726 #10860 #12073" on Jul 17, 2023
/ci-repeat 1
/ci-repeat 1
hmm, also contains commits from #11597 but looks like it is already backported?
CI failure: #12310
The failure is not a regression
ztlpn reviewed on Jul 19, 2023 (6 reviews)
mmaslankaprv force-pushed the v23.1.x-backports branch from f128fae to 74b82bf on July 20, 2023 06:33
Followup from redpanda-data#10810, this moves values that were default constructed when reading json. Signed-off-by: Tyler Rockwood <rockwood@redpanda.com> (cherry picked from commit 5906483)
Added Admin API retries on timeout to nodes decommissioning test. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit d59b9a7)
Since all decommissioning tests rely on the assumption that the decommissioning operation was successful, added safe versions of the decommission/recommission operations that retry when required. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit f842a90)
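A minimal sketch of the retry idea described in the two commits above. The names `AdminTimeout`, `retry_on_timeout`, and `FlakyAdmin` are hypothetical stand-ins for illustration, not the actual test helpers or Admin API client from the repository.

```python
import time


class AdminTimeout(Exception):
    """Hypothetical error raised when an Admin API call times out."""


def retry_on_timeout(operation, attempts=5, backoff_sec=0.0):
    """Call `operation` until it succeeds or `attempts` calls have timed out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except AdminTimeout as e:
            # Remember the timeout and try again after a short pause.
            last_error = e
            time.sleep(backoff_sec)
    raise last_error


class FlakyAdmin:
    """Stand-in for an admin client whose first few calls time out."""

    def __init__(self, failures):
        self.failures = failures
        self.decommissioned = set()

    def decommission_broker(self, node_id):
        if self.failures > 0:
            self.failures -= 1
            raise AdminTimeout(f"timeout decommissioning node {node_id}")
        self.decommissioned.add(node_id)
        return True


admin = FlakyAdmin(failures=2)
result = retry_on_timeout(lambda: admin.decommission_broker(3))
```

Here the first two calls time out and the third succeeds, so a test built on top of the wrapper can safely assume the decommission request was accepted.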
When a configuration is being applied, we gate updates with the `_last_seen_version` of the configuration frontend. Previously the version was updated after the configuration update was applied to the `configuration_manager`, which could lead to a situation in which the configuration was overridden by the previous update. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 7771db9)
Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 53cf8dc)
Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit e7ccef2)
Made documentation clear about what is being returned for each API and matched the declared returned type with the one that actually is being returned. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 52722e3)
Sometimes the controller log dirty offset may be helpful to understand the gap between what is known to be committed and what is available in the log. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit ed47391)
Controller erasure test is supposed to validate if there is a mismatch between the last appended entry in kvstore and controller max offset. In order for the test to work correctly we must wait for all the messages to be committed as we only delete the last segment that contains a single message (new replicated configuration). In order to make the test reliable change the condition to wait for the applied offset on the node where controller log is going to be removed to be equal to the leader dirty offset. Fixes: redpanda-data#8217 Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 57fb4c0)
In Kafka, `log_end_offset` is defined as the next offset that will be assigned to a record produced to a given partition. When the local log is empty, its `dirty_offset` is equal to `start_offset - 1`. In this case `log_end_offset` should return the Kafka offset corresponding to the `start_offset`, as this is the next offset that will be assigned to a record produced to the given topic partition. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 1d682b9)
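The rule above can be sketched as follows. This is a simplified model, not the actual implementation: `make_translator` and its constant `delta` are hypothetical, and the sketch assumes the offset translator is only defined for offsets at or above `start_offset`, which is why the empty-log case has to translate `start_offset` rather than `dirty_offset`.

```python
def make_translator(start_offset, delta):
    """Hypothetical log-offset to Kafka-offset translator (assumption).

    It is only defined for offsets that are not below `start_offset`.
    """
    def to_kafka(offset):
        if offset < start_offset:
            raise ValueError(f"offset {offset} is below start offset {start_offset}")
        return offset - delta
    return to_kafka


def log_end_offset(start_offset, dirty_offset, to_kafka):
    """Next Kafka offset that would be assigned to a produced record."""
    if dirty_offset < start_offset:
        # Empty local log: dirty_offset == start_offset - 1, so the next
        # record gets the Kafka offset corresponding to start_offset.
        return to_kafka(start_offset)
    return to_kafka(dirty_offset) + 1


to_kafka = make_translator(start_offset=100, delta=10)
empty_leo = log_end_offset(start_offset=100, dirty_offset=99, to_kafka=to_kafka)
non_empty_leo = log_end_offset(start_offset=100, dirty_offset=104, to_kafka=to_kafka)
```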
Added test that validates if a consumer is able to continue consuming a log that has been completely removed by delete retention. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 89a8d97)
When forcibly aborting reconfiguration we should wait for the new leader to be elected in the configuration that the partition was forced to. This way we can be certain that the new configuration will finally be replicated to the majority of nodes even though the leader may not exist at the time when the configuration is replicated. Fixes: redpanda-data#9243 Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 3948312)
Change the `raft::state_machine::apply()` to always read only committed entries. Previously it might happen that with `acks=1` some of the entries that were not yet committed were applied to the stm. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 1aaf98f)
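As an illustration of the invariant described above, here is a minimal sketch that bounds what a state machine may read by the committed offset. The function name and the list-of-tuples log are hypothetical; the real `raft::state_machine` code is not reproduced here.

```python
def entries_to_apply(log, last_applied, committed_offset):
    """Return entries with last_applied < offset <= committed_offset.

    `log` is a list of (offset, value) pairs. Entries above
    `committed_offset` may exist locally (for example after an acks=1
    produce) but must not reach the state machine yet.
    """
    return [(o, v) for o, v in log if last_applied < o <= committed_offset]


log = [(0, "a"), (1, "b"), (2, "c"), (3, "d")]
# Offsets 2 and 3 are appended locally but not yet committed.
applied = entries_to_apply(log, last_applied=-1, committed_offset=1)
```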
…pshot` Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit a8e1a59)
Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 4daab41)
Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit baabe8f)
If the `last_applied` offset persisted in kvstore is greater than the log dirty offset, it indicates an inconsistency. The inconsistency may be a result of intentional removal of log segments from the Redpanda data directory. In this scenario the `last_applied` offset must be removed from kvstore to prevent it from updating the committed offset, which may result in uncommitted batches being applied to the stm. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 8989a1a)
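The consistency check described above can be sketched like this. A plain dict stands in for the kvstore, and `reconcile_last_applied` is a hypothetical name used only for illustration.

```python
def reconcile_last_applied(kvstore, log_dirty_offset, key="last_applied"):
    """Drop a persisted last_applied offset that points past the local log.

    Returns the usable last_applied offset, or None if there is none.
    """
    last_applied = kvstore.get(key)
    if last_applied is not None and last_applied > log_dirty_offset:
        # The log was truncated or removed behind our back: the persisted
        # value can no longer be trusted, so forget it.
        del kvstore[key]
        return None
    return last_applied


inconsistent = {"last_applied": 120}
consistent = {"last_applied": 80}
r1 = reconcile_last_applied(inconsistent, log_dirty_offset=100)
r2 = reconcile_last_applied(consistent, log_dirty_offset=100)
```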
Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 114daef)
Deltas were previously stored as a vector per `ntp`. The deltas access pattern (iteration, inserting and popping elements from back and from the end) makes it a perfect candidate for `std::list` usage. `std::deque` doesn't use large contiguous allocations, so it will not contribute to memory fragmentation. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit da0958c)
Using `ss::chunked_fifo` to return deltas processed by the controller backend. The previously used `std::vector` could lead to large allocations, as it allocates large chunks of contiguous memory. Fixes: redpanda-data#11673 Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit fddcb30)
Made entries indicating receiving append entries and vote request more obvious. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 4ff810b)
Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 7d68e46)
When a voter receives a vote request and votes for the candidate, it updates the last heartbeat timeout. If this happens during the prevote phase in a deployment with an even number of nodes, it may lead to a temporary livelock and an inability to elect a leader. Fixes: redpanda-data#11657 Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 995c775)
Added test verifying if a controller is elected in timely fashion when some of the cluster nodes are down. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit c6555b6)
When a linearizable barrier is requested we want the follower to flush its log to make sure that all possible entries are committed with traditional raft semantics. Added handling of flushing the log on the follower if the leader requested it and the append entries request is empty. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 9cd72e5)
When a linearizable barrier is set we want to move the committed offset forward. In this case followers must flush their offsets to allow the leader to commit its entries. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit f936b6d)
For the STM linearizable barrier to make sense we must wait for the offset to be applied to the stm. Otherwise the linearizable barrier gives no guarantees about the state machine state. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 3bcf038)
In order to prevent contention, implemented sharing of the linearizable barrier result between contended callers. Instead of calling the linearizable barrier multiple times, a caller will wait for the result of a barrier that is already being executed. This doesn't change the current semantics of the linearizable barrier, as either way a caller must check the returned offset if they want to wait for the whole history to be applied. Sharing results helps in a situation where multiple parallel fibers try to set up a linearizable barrier. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 7cb919c)
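The result-sharing pattern described above can be sketched with `asyncio`, where coroutines play the role of fibers. `SharedBarrier` and `do_barrier` are hypothetical names; this models the idea only and is not the actual implementation.

```python
import asyncio


class SharedBarrier:
    """Let concurrent callers share one in-flight barrier execution."""

    def __init__(self, do_barrier):
        self._do_barrier = do_barrier  # coroutine function returning an offset
        self._in_flight = None

    async def linearizable_barrier(self):
        if self._in_flight is None:
            # No barrier running: start one and let later callers join it.
            self._in_flight = asyncio.create_task(self._run())
        return await self._in_flight

    async def _run(self):
        try:
            return await self._do_barrier()
        finally:
            # Allow the next caller to start a fresh barrier.
            self._in_flight = None


async def demo():
    calls = 0

    async def do_barrier():
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)
        return 42

    barrier = SharedBarrier(do_barrier)
    offsets = await asyncio.gather(
        *(barrier.linearizable_barrier() for _ in range(5))
    )
    return calls, list(offsets)


calls, offsets = asyncio.run(demo())
```

All five callers receive the same offset while the underlying barrier runs once. As the commit message notes, each caller still has to compare the returned offset against the one it needs.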
Upstream Kafka validates that the fetch offset is between the start offset and the log end offset. Fixed the validation in Redpanda, as we were validating the end of the range against the high watermark. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit bd73f10)
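A small sketch of the range check described above. The function name is hypothetical, and treating both bounds as inclusive is an assumption made for this illustration.

```python
def is_fetch_offset_in_range(fetch_offset, start_offset, log_end_offset):
    """Check the fetch offset against [start_offset, log_end_offset]."""
    return start_offset <= fetch_offset <= log_end_offset


start_offset = 0
high_watermark = 5
log_end_offset = 8

# An offset above the high watermark but not past the log end offset is
# accepted when the upper bound is the log end offset...
valid_with_leo = is_fetch_offset_in_range(7, start_offset, log_end_offset)
# ...but would have been rejected with the high watermark as the bound.
valid_with_hwm = is_fetch_offset_in_range(7, start_offset, high_watermark)
```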
mmaslankaprv force-pushed the v23.1.x-backports branch from 74b82bf to 289ac43 on July 20, 2023 06:52
- This change reverts a previous attempt to fix rm_stm reporting invalid_lso (in change 20285df), which set `bootstrap_committed_offset` in apply_snapshot() to fix the case where invalid_lso is indefinitely returned on a node that had just restarted and applied a snapshot, but had no data produced onto it.
- That change was however incorrect, as the bootstrap committed offset is expected to be the value of a complete read of the log, and the value of the offset in the rm_stm snapshot at the time of reading it at startup does not necessarily reflect this.
- The solution is to wait at startup until the consensus layer has modified the committed offset.
(cherry picked from commit 88cfa56)
ztlpn approved these changes on Jul 20, 2023
fyi i updated the PR description to remove the |
Backport of PRs:
ss::chunked_fifo in metadata dissemination
update_leadership_request_v2
#10923
cluster::node_status_backend
#11164
log_end_offset for empty local logs
#11838
acks=0 or acks=1
#11840
chunked_fifo to retrieve deltas from controller_backend
#11691
Fixes: #10610
Fixes: #10611
Fixes: #11014
Fixes: #11280
Fixes: #11405
Fixes: #11967
Fixes: #11975
Fixes: #11977
Fixes: #11979
Fixes: #12011
Fixes: #12073
Fixes: #12011
Fixes: #11888
Backports Required