CI Failure (timeout on _partitions_moving) in NodesDecommissioningTest.test_decommissioning_rebalancing_node #12220

Closed
Lazin opened this issue Jul 18, 2023 · 2 comments · Fixed by #13616

Lazin commented Jul 18, 2023

https://buildkite.com/redpanda/redpanda/builds/31635
https://buildkite.com/redpanda/redpanda/builds/32377
https://buildkite.com/redpanda/vtools/builds/8412
https://buildkite.com/redpanda/redpanda/builds/34531
https://buildkite.com/redpanda/redpanda/builds/35487
https://buildkite.com/redpanda/redpanda/builds/35952
https://buildkite.com/redpanda/redpanda/builds/36489
https://buildkite.com/redpanda/redpanda/builds/36563

Module: rptest.tests.nodes_decommissioning_test
Class: NodesDecommissioningTest
Method: test_decommissioning_rebalancing_node
Arguments: {
    "shutdown_decommissioned": false
}
test_id:    NodesDecommissioningTest.test_decommissioning_rebalancing_node
status:     FAIL
run time:   113.197 seconds

TimeoutError('')
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 135, in run
    data = self.run_test()
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 227, in run_test
    return self.test_context.function(self.test)
  File "/usr/local/lib/python3.10/dist-packages/ducktape/mark/_mark.py", line 481, in wrapper
    return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
  File "/root/tests/rptest/services/cluster.py", line 79, in wrapped
    r = f(self, *args, **kwargs)
  File "/root/tests/rptest/tests/nodes_decommissioning_test.py", line 554, in test_decommissioning_rebalancing_node
    wait_until(lambda: self._partitions_moving(node=first_node),
  File "/usr/local/lib/python3.10/dist-packages/ducktape/utils/util.py", line 57, in wait_until
    raise TimeoutError(err_msg() if callable(err_msg) else err_msg) from last_exception
ducktape.errors.TimeoutError
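
The empty `TimeoutError('')` follows from the last frame of the traceback: the exception carries whatever `err_msg` was passed to `wait_until`, and the call at line 554 apparently passed none. Below is a minimal sketch of that behavior. It uses a simplified stand-in for `wait_until`, not ducktape's actual implementation, and `WaitTimeout`, `partitions_moving`, and `describe_timeout` are hypothetical names introduced only for illustration.

```python
import time


class WaitTimeout(Exception):
    """Hypothetical stand-in for ducktape.errors.TimeoutError."""


def wait_until(condition, timeout_sec, backoff_sec=0.1, err_msg=""):
    """Poll `condition` until it is truthy or `timeout_sec` elapses (simplified)."""
    deadline = time.monotonic() + timeout_sec
    while time.monotonic() < deadline:
        if condition():
            return
        time.sleep(backoff_sec)
    # Same expression as in the traceback: err_msg may be a string or a callable.
    raise WaitTimeout(err_msg() if callable(err_msg) else err_msg)


def partitions_moving(node):
    # Hypothetical placeholder for self._partitions_moving(node=...);
    # it always reports "nothing is moving" so the wait times out.
    return False


def describe_timeout(err_msg):
    """Run the wait with a short timeout and return repr() of the raised error."""
    try:
        wait_until(lambda: partitions_moving(node=1),
                   timeout_sec=0.05,
                   backoff_sec=0.01,
                   err_msg=err_msg)
    except WaitTimeout as e:
        return repr(e)
    return "no timeout"


print(describe_timeout(""))
print(describe_timeout(lambda: "no partitions moving on node 1"))
```

Passing a string or a callable as `err_msg` would make a failure like this one easier to triage from the CI summary alone.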
@Lazin Lazin added the ci-failure and ci-ignore (Automatic ci analysis tools ignore this issue) labels Jul 18, 2023
@Lazin Lazin closed this as completed Jul 18, 2023
BenPope commented Jul 24, 2023

@Lazin in what way was this fixed? Is there a PR?

BenPope commented Aug 16, 2023

@BenPope BenPope reopened this Aug 16, 2023
@BenPope BenPope changed the title from "CI Failure (key symptom) in NodesDecommissioningTest.test_decommissioning_rebalancing_node" to "CI Failure (TimeoutError) in NodesDecommissioningTest.test_decommissioning_rebalancing_node" Sep 7, 2023
@BenPope BenPope changed the title from "CI Failure (TimeoutError) in NodesDecommissioningTest.test_decommissioning_rebalancing_node" to "CI Failure TimeoutError('') in NodesDecommissioningTest.test_decommissioning_rebalancing_node" Sep 7, 2023
@BenPope BenPope removed the ci-ignore (Automatic ci analysis tools ignore this issue) label Sep 7, 2023
@redpanda-data redpanda-data deleted a comment from BenPope Sep 7, 2023
@rystsov rystsov changed the title from "CI Failure TimeoutError('') in NodesDecommissioningTest.test_decommissioning_rebalancing_node" to "CI Failure (timeout on _partitions_moving) in NodesDecommissioningTest.test_decommissioning_rebalancing_node" Sep 7, 2023
@mmaslankaprv mmaslankaprv self-assigned this Sep 22, 2023
mmaslankaprv added a commit to mmaslankaprv/redpanda that referenced this issue Sep 22, 2023
It may happen that the health report notification for a given node is
processed before the notification about the node addition. In this case
`partition_balancer_backend` would wait for another tick to trigger
rebalancing. Added an immediate tick trigger when the node add
notification is processed and a health report for the node is already
present.

Fixes: redpanda-data#12220

Signed-off-by: Michal Maslanka <michal@redpanda.com>
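
The ordering problem in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not Redpanda's C++ `partition_balancer_backend`: `BalancerModel`, its methods, and `run` are hypothetical names, and the model assumes that a health report for an already known node requests an immediate tick, which is what the commit message implies for the normal ordering.

```python
class BalancerModel:
    """Toy model of a tick-driven balancer (hypothetical, for illustration only)."""

    def __init__(self, trigger_on_node_add):
        # When True, the model applies the behavior the commit message describes.
        self.trigger_on_node_add = trigger_on_node_add
        self.members = set()          # nodes whose addition has been processed
        self.health_reports = set()   # nodes with a health report on record
        self.immediate_tick = False   # True once an immediate tick was requested

    def on_health_report(self, node_id):
        self.health_reports.add(node_id)
        # Assumption: a report for an already known node requests a tick now.
        if node_id in self.members:
            self.immediate_tick = True

    def on_node_added(self, node_id):
        self.members.add(node_id)
        # The fix: if the health report was processed first, request a tick now
        # instead of waiting for the next periodic one.
        if self.trigger_on_node_add and node_id in self.health_reports:
            self.immediate_tick = True


def run(events, trigger_on_node_add):
    """Feed (kind, node_id) events to the model; report whether a tick was requested."""
    model = BalancerModel(trigger_on_node_add)
    for kind, node_id in events:
        if kind == "added":
            model.on_node_added(node_id)
        elif kind == "health":
            model.on_health_report(node_id)
    return model.immediate_tick


normal = [("added", 5), ("health", 5)]
reordered = [("health", 5), ("added", 5)]

print(run(normal, trigger_on_node_add=False))
print(run(reordered, trigger_on_node_add=False))
print(run(reordered, trigger_on_node_add=True))
```

In the reordered case without the fix, the model never requests an immediate tick, so rebalancing would start only on the next periodic tick. That delay is consistent with the test timing out while waiting for partitions to start moving.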
vbotbuildovich pushed a commit to vbotbuildovich/redpanda that referenced this issue Sep 25, 2023
(cherry picked from commit 75a1011)