[v24.1.x] rptest: reduce cache eviction throttling for space leak test #24070

Merged

vbotbuildovich
Collaborator

Backport of PR #22796
Fixes: #24069

With the default configuration this test is too slow, which makes it
flaky. Instead of raising the timeouts, I'm reducing the cache eviction
throttling, which makes the test 3x faster.

The test became flaky after in-memory trim was introduced in
redpanda-data#21556.

The main insight was provided by https://github.com/abhijat in a private
exchange:

> I think it might be the extra throttling. With the carry over
> disabled, we always have to do a trim when reserving space, which
> results in a lot more throttling and sleep:
>
> ```
> $ grep -Ri "Cache trimming throttled" * | grep -c cache
> 139
> ```
>
> With the carryover list in place, about half of the calls to reserve
> space end up in an early return because the list provides enough room
> to clear up space, which does not cause the trimming to be throttled
> as much:
>
> ```
> $ grep -Ri "Cache trimming throttled" * | grep -c cache
> 63
> ```
>
> Although that doesn't explain how this test used to work before, IIRC
> carryover is a fairly new feature

Fixes redpanda-data#21597

(cherry picked from commit 7763669)
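
The throttle counts quoted above came from grepping the test logs. For comparing runs before and after this change, a minimal Python sketch of the same count is shown here. The helper name `count_throttle_lines` and the directory layout are assumptions for illustration, not part of rptest, and the result only approximates the quoted `grep` pipeline: it counts every line containing the marker, case-insensitively, across all files under a directory.

```python
from pathlib import Path

# Log message quoted in the discussion above; matched case-insensitively,
# like `grep -i`.
THROTTLE_MARKER = "Cache trimming throttled"


def count_throttle_lines(log_root, marker=THROTTLE_MARKER):
    """Count lines under log_root (searched recursively) that contain marker.

    Hypothetical helper for illustration only; not an rptest API.
    """
    needle = marker.lower()
    total = 0
    for path in Path(log_root).rglob("*"):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a log has stray bytes.
        with path.open(errors="replace") as fh:
            total += sum(1 for line in fh if needle in line.lower())
    return total
```

Calling `count_throttle_lines("path/to/test/logs")` on the logs of two runs gives numbers comparable to the 139 and 63 quoted above, assuming the logs are plain text files.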
@vbotbuildovich vbotbuildovich added this to the v24.1.x-next milestone Nov 8, 2024
@vbotbuildovich vbotbuildovich added the kind/backport PRs targeting a stable branch label Nov 8, 2024
@nvartolomei nvartolomei enabled auto-merge November 8, 2024 08:25
@nvartolomei
Contributor

nvartolomei commented Nov 8, 2024

Backporting because this test fails often during my attempts to backport other PRs, which is time-consuming.

@nvartolomei nvartolomei self-assigned this Nov 8, 2024
@vbotbuildovich
Collaborator Author

non flaky failures in https://buildkite.com/redpanda/redpanda/builds/57827#01930b1e-2871-4307-b535-0c0a408f387b:

"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_time_based_retention.cloud_storage_type=CloudStorageType.S3"

@vbotbuildovich
Collaborator Author

Retry command for Build#57827

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/tests/topic_recovery_test.py::TopicRecoveryTest.test_time_based_retention@{"cloud_storage_type":1}

@lf-rep lf-rep disabled auto-merge November 8, 2024 12:35
@lf-rep
Contributor

lf-rep commented Nov 8, 2024

Force merge at Nicolae's request

@lf-rep lf-rep merged commit bcba14c into redpanda-data:v24.1.x Nov 8, 2024
12 of 16 checks passed
@BenPope BenPope modified the milestones: v24.1.x-next, v24.1.18 Nov 18, 2024