-
Notifications
You must be signed in to change notification settings - Fork 597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix for CI Failure (Failed bounds check: offset 1823 not consumable) in DeleteRecordsTest.test_delete_records_segment_deletion #14434
Fix for CI Failure (Failed bounds check: offset 1823 not consumable) in DeleteRecordsTest.test_delete_records_segment_deletion #14434
Conversation
- Adds a retry when calling rpk consume, it may be the case due to a concurrent node failure that rpk will not retry and return an error, causing the test to fail. - Fixes: redpanda-data#14151
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should RPK itself have retries built in?
Not really sure, I asked Travis, here is the relevant thread https://redpandadata.slack.com/archives/C04LYNJQUGN/p1697207759960509 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A small nit otherwise LGTM
except Exception as _: | ||
return False | ||
retries = 3 | ||
while retries > 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use the wait until helper here and in the other place?
def check_bounds():
r = self.rpk.consume(...)
return r.count('_') == 1
wait_until(
check_bounds,
timeout_sec=5,
backoff_sec=1,
err_msg=f"rob's awesome error message here",
retry_on_exc=True,
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think with retry_on_exc
we can , sure
new failures detected in https://buildkite.com/redpanda/redpanda/builds/39805#018b688d-0e6c-4c60-bca0-f0773581563f: "rptest.tests.topic_recovery_test.TopicRecoveryTest.test_time_based_retention.cloud_storage_type=CloudStorageType.S3" |
#13547 seems to be the failure. |
/backport v23.2.x |
Failed to create a backport PR to v23.2.x branch. I tried:
|
Something wrong happened here, I thought I force pushed guess i didn't then work that wasn't ready was merged |
I will open a new PR |
Happy to review 👍 |
- PR redpanda-data#14434 was merged too early before edits were made, this change represents the intended edits that were not force-pushed before merge.
- PR redpanda-data#14434 was merged too early before edits were made, this change represents the intended edits that were not force-pushed before merge. (cherry picked from commit 1f12a7f)
- PR redpanda-data#14434 was merged too early before edits were made, this change represents the intended edits that were not force-pushed before merge. (cherry picked from commit 1f12a7f)
Adds a retry when calling rpk consume, it may be the case due to a concurrent node failure that rpk will not retry and return an error, causing the test to fail.
Fixes: CI Failure (Failed bounds check: offset 1823 not consumable) in
DeleteRecordsTest.test_delete_records_segment_deletion
#14151Backports Required
Release Notes