rabbit_quorum_queue: Wait for member add in add_member/4 #12837

Merged

@dumbbell commented Nov 27, 2024

Why

The `ra:member_add/3` call returns before the membership change is committed. That is fine for the addition itself, but any follow-up change to the cluster made before the commit might be rejected with the `cluster_change_not_permitted` error.

How

Instead of changing other call sites to wait or retry their cluster membership change, this patch makes `add_member/4` wait for the current addition to be applied before it proceeds and returns.
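To illustrate the idea, here is a minimal toy model in Python. It is not the Ra API and not the code from this patch: `ToyCluster`, `tick`, and `add_member_and_wait` are hypothetical names invented for this sketch. It only assumes what the description states, namely that an add returns before it is applied and that a second membership change is rejected while the first is still pending.

```python
import time


class ClusterChangeNotPermitted(Exception):
    """Raised when a membership change arrives while another is still pending."""


class ToyCluster:
    """Hypothetical model: a change is appended at a log index and takes
    effect only once the applied index reaches that index."""

    def __init__(self, members):
        self.members = set(members)
        self.last_index = 0     # index of the last appended entry
        self.applied_index = 0  # index of the last applied entry
        self.pending = None     # (index, op, member) or None

    def _change(self, op, member):
        if self.pending is not None:
            raise ClusterChangeNotPermitted(op)
        self.last_index += 1
        self.pending = (self.last_index, op, member)
        return self.last_index  # returns before the change is applied

    def member_add(self, member):
        return self._change("add", member)

    def member_remove(self, member):
        return self._change("remove", member)

    def tick(self):
        """Stand-in for replication progress: apply the pending change, if any."""
        if self.pending is None:
            return
        index, op, member = self.pending
        if op == "add":
            self.members.add(member)
        else:
            self.members.discard(member)
        self.applied_index = index
        self.pending = None


def add_member_and_wait(cluster, member, timeout=1.0, poll=0.01):
    """Add a member and return only once the add has been applied."""
    index = cluster.member_add(member)
    deadline = time.monotonic() + timeout
    while cluster.applied_index < index:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"add of {member!r} not applied in time")
        cluster.tick()  # a real system would progress in the background
        time.sleep(poll)
    return index
```

With this model, calling `member_add` and then `member_remove` right away raises `ClusterChangeNotPermitted`, whereas a `member_remove` issued after `add_member_and_wait` returns is accepted. Waiting inside the add keeps the retry logic out of every caller, which is the trade-off the description argues for.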

This fixes some transient failures in CI where such follow-up changes are rejected and not retried, leaving the cluster in an unexpected state for the testcase.

One example is `quorum_queue_SUITE:force_shrink_member_to_current_member/1`.

@dumbbell requested a review from kjnilsson November 27, 2024 16:45
@dumbbell self-assigned this Nov 27, 2024
@dumbbell force-pushed the wait-for-commit-in-rabbit_quorum_queue-add_member branch from 2aeade5 to b51fcf2 Compare November 27, 2024 16:46
@dumbbell marked this pull request as ready for review November 27, 2024 16:46
@dumbbell force-pushed the wait-for-commit-in-rabbit_quorum_queue-add_member branch from b51fcf2 to 2402d3d Compare November 27, 2024 16:51
@dumbbell force-pushed the wait-for-commit-in-rabbit_quorum_queue-add_member branch from 2402d3d to 99d8e90 Compare November 28, 2024 10:27
@michaelklishin added this to the 4.1.0 milestone Nov 28, 2024
@michaelklishin merged commit d6366a3 into main Nov 28, 2024
271 checks passed
@michaelklishin deleted the wait-for-commit-in-rabbit_quorum_queue-add_member branch November 28, 2024 19:34
michaelklishin added a commit that referenced this pull request Nov 28, 2024
rabbit_quorum_queue: Wait for member add in `add_member/4` (backport #12837)