CI Failure (TimeoutError) in TopicDeleteCloudStorageTest.topic_delete_unavailable_test #15479

Closed
vbotbuildovich opened this issue Dec 13, 2023 · 33 comments
Labels
area/cloud-storage (Shadow indexing subsystem), auto-triaged (used to know which issues have been opened from a CI job), ci-failure, sev/medium (Bugs that do not meet criteria for high or critical, but are more severe than low.)

Comments

@vbotbuildovich
Collaborator

vbotbuildovich commented Dec 13, 2023

https://buildkite.com/redpanda/redpanda/builds/42333
https://buildkite.com/redpanda/redpanda/builds/42565
https://buildkite.com/redpanda/redpanda/builds/42656

Module: rptest.tests.topic_delete_test
Class: TopicDeleteCloudStorageTest
Method: topic_delete_unavailable_test
Arguments: {
    "cloud_storage_type": 2
}
test_id:    TopicDeleteCloudStorageTest.topic_delete_unavailable_test
status:     FAIL
run time:   61.255 seconds

TimeoutError('')
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 184, in _do_run
    data = self.run_test()
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 269, in run_test
    return self.test_context.function(self.test)
  File "/usr/local/lib/python3.10/dist-packages/ducktape/mark/_mark.py", line 481, in wrapper
    return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
  File "/root/tests/rptest/services/cluster.py", line 82, in wrapped
    r = f(self, *args, **kwargs)
  File "/root/tests/rptest/tests/topic_delete_test.py", line 625, in topic_delete_unavailable_test
    wait_until(lambda: topic_storage_purged(self.redpanda, self.topic),
  File "/usr/local/lib/python3.10/dist-packages/ducktape/utils/util.py", line 57, in wait_until
    raise TimeoutError(err_msg() if callable(err_msg) else err_msg) from last_exception
ducktape.errors.TimeoutError
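
The last frame above shows `wait_until` raising `TimeoutError(err_msg() if callable(err_msg) else err_msg)`, and the reported `TimeoutError('')` suggests the test passed no error message. Below is a minimal standalone sketch of that polling pattern, not ducktape's actual implementation: `wait_until_sketch`, `PollTimeout`, and `storage_never_purged` are hypothetical names, and the short timeout is chosen only to keep the demo fast.

```python
import time


class PollTimeout(Exception):
    """Hypothetical stand-in for ducktape.errors.TimeoutError."""


def wait_until_sketch(condition, timeout_sec, backoff_sec=0.1, err_msg=""):
    """Poll `condition` until it returns truthy or `timeout_sec` elapses."""
    deadline = time.monotonic() + timeout_sec
    while time.monotonic() < deadline:
        if condition():
            return
        time.sleep(backoff_sec)
    # Same expression as in the traceback: err_msg may be a string or a callable.
    raise PollTimeout(err_msg() if callable(err_msg) else err_msg)


def storage_never_purged():
    # Assumption: stands in for topic_storage_purged(...) never becoming true.
    return False


def run_demo():
    """Return the repr of the timeout raised when the condition never holds."""
    try:
        wait_until_sketch(storage_never_purged, timeout_sec=0.3, backoff_sec=0.05)
    except PollTimeout as exc:
        return repr(exc)
    return None


if __name__ == "__main__":
    print(run_demo())
```

With no `err_msg` supplied, the exception carries an empty string, which is why the failure summary gives no hint about what was being waited for. Passing a descriptive `err_msg` at the call site would likely make this kind of report easier to triage.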

JIRA Link: CORE-1636

@vbotbuildovich added the auto-triaged and ci-failure labels Dec 13, 2023
@michael-redpanda changed the title from "CI Failure (key symptom) in TopicDeleteCloudStorageTest.topic_delete_unavailable_test" to "CI Failure (TimeoutError) in TopicDeleteCloudStorageTest.topic_delete_unavailable_test" Dec 14, 2023
@michael-redpanda added the area/cloud-storage label Dec 14, 2023
@abhijat added the sev/medium label Jan 2, 2024
@abhijat
Contributor

abhijat commented Jan 2, 2024

Partition removal stalls in partition::stop while stopping the archiver. The log shows about 33 seconds between "Stopping archiver" and the next stop step:

DEBUG 2023-12-06 18:06:10,238 [shard 1:main] cluster - partition.cc:563 - Stopping partition: {kafka/topic-cgdoprhslw/2}
DEBUG 2023-12-06 18:06:10,238 [shard 1:main] cluster - partition.cc:578 - Stopping archiver on partition: {kafka/topic-cgdoprhslw/2}
...
DEBUG 2023-12-06 18:06:43,082 [shard 1:main] cluster - partition.cc:587 - Stopping cloud_storage_partition on partition: {kafka/topic-cgdoprhslw/2}
DEBUG 2023-12-06 18:06:43,082 [shard 1:main] cluster - partition.cc:595 - Stopping cloud_storage_manifest_view on partition: {kafka/topic-cgdoprhslw/2}

Storage is removed only after the preceding stop steps finish, so the segments are still present when the test's wait_until deadline expires, and the test times out.
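
To put a number on the stall, the gap between the "Stopping archiver" line and the next stop step can be computed directly from the two log lines quoted above. This is a small sketch that assumes the `LEVEL date time [shard ...` layout seen in those lines; `log_timestamp` and `stall_seconds` are hypothetical helper names.

```python
from datetime import datetime

# Timestamp layout as it appears in the quoted log lines (comma before milliseconds).
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def log_timestamp(line):
    """Parse the timestamp from a line shaped like 'DEBUG <date> <time> [shard ...'."""
    _level, date, clock = line.split(" ", 3)[:3]
    return datetime.strptime(f"{date} {clock}", LOG_TS_FORMAT)


def stall_seconds(start_line, end_line):
    """Seconds elapsed between two log lines."""
    return (log_timestamp(end_line) - log_timestamp(start_line)).total_seconds()


archiver_stop = (
    "DEBUG 2023-12-06 18:06:10,238 [shard 1:main] cluster - partition.cc:578 - "
    "Stopping archiver on partition: {kafka/topic-cgdoprhslw/2}"
)
next_step = (
    "DEBUG 2023-12-06 18:06:43,082 [shard 1:main] cluster - partition.cc:587 - "
    "Stopping cloud_storage_partition on partition: {kafka/topic-cgdoprhslw/2}"
)

gap = stall_seconds(archiver_stop, next_step)
print(f"archiver stop took about {gap:.3f} s")
```

For these two lines the gap is 32.844 seconds, all of it spent between the archiver stop and the cloud_storage_partition stop.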

@piyushredpanda
Contributor

Closing older-bot-filed CI issues as we transition to a more reliable system.
