rabbit_quorum_queue: Wait for member add in add_member/4
[Why]
The `ra:add_member/3` call returns before the change is committed. This
is fine for the addition itself, but any follow-up change to the cluster
membership might be rejected with the `cluster_change_not_permitted`
error.

[How]
Instead of changing other places to wait or retry their cluster
membership change, this patch waits for the current add to be applied,
using a conditional leader query, before proceeding and returning.

This fixes some transient failures in CI where such follow-up changes
are rejected and not retried, leaving the cluster in an unexpected state
for the testcase.

One affected testcase is
`quorum_queue_SUITE:force_shrink_member_to_current_member/1`.

(cherry picked from commit 99d8e90)
dumbbell authored and mergify[bot] committed Nov 28, 2024
1 parent 83da9fe commit 8ce543a
Showing 1 changed file with 14 additions and 1 deletion.
15 changes: 14 additions & 1 deletion deps/rabbit/src/rabbit_quorum_queue.erl
@@ -1346,14 +1346,27 @@ add_member(Q, Node, Membership, Timeout) when ?amqqueue_is_quorum(Q) ->
                         maps:get(id, Conf)
                 end,
             case ra:add_member(Members, ServerIdSpec, Timeout) of
-                {ok, _, Leader} ->
+                {ok, {RaIndex, RaTerm}, Leader} ->
                     Fun = fun(Q1) ->
                                   Q2 = update_type_state(
                                          Q1, fun(#{nodes := Nodes} = Ts) ->
                                                      Ts#{nodes => [Node | Nodes]}
                                              end),
                                   amqqueue:set_pid(Q2, Leader)
                           end,
+                    %% The `ra:add_member/3` call above returns before the
+                    %% change is committed. This is ok for that addition but
+                    %% any follow-up changes to the cluster might be rejected
+                    %% with the `cluster_change_not_permitted` error.
+                    %%
+                    %% Instead of changing other places to wait or retry their
+                    %% cluster membership change, we wait for the current add
+                    %% to be applied using a conditional leader query before
+                    %% proceeding and returning.
+                    {ok, _, _} = ra:leader_query(
+                                   Leader,
+                                   {erlang, is_list, []},
+                                   #{condition => {applied, {RaIndex, RaTerm}}}),
                     _ = rabbit_amqqueue:update(QName, Fun),
                     rabbit_log:info("Added a replica of quorum ~ts on node ~ts", [rabbit_misc:rs(QName), Node]),
                     ok;
