CQ: Make CQ shared store compaction fast #10696

Merged 6 commits on Mar 11, 2024


@lhoguin lhoguin commented Mar 7, 2024

This PR does 3 things:

  • Don't compact shared message store files too eagerly: if a file is about to be compacted AND is scheduled to be compacted 15s from now, we wait 15s before compacting.
  • Restore file scanning, instead of ets scanning, to find the messages in a file that need to be compacted. Ideally this lookup would use a secondary ets key, but we don't have one, so scanning ets is too slow.
  • Rework the file scanning to make sure we never forget a file (as the old 3.12 algorithm does not work well with compaction being a thing) and add many tests for various likely and unlikely scenarios.
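
To illustrate the first point, here is a minimal Python sketch of deferring compaction with a debounce-style deadline. It is not the code in this PR (which is Erlang), and the class and method names (`DeferredCompactor`, `request`, `tick`) are hypothetical. It assumes that each new compaction request for a file pushes that file's deadline out by 15s, which is one reading of the description above.

```python
from dataclasses import dataclass, field

COMPACTION_DELAY_S = 15.0  # the 15s delay named in the PR description


@dataclass
class DeferredCompactor:
    """Hypothetical debounce-style scheduler; not RabbitMQ's actual code."""

    delay: float = COMPACTION_DELAY_S
    due_at: dict = field(default_factory=dict)     # file name -> earliest compaction time
    compacted: list = field(default_factory=list)  # record of (time, file) compactions

    def request(self, file: str, now: float) -> None:
        # Assumption: every new request moves the deadline forward, so a file
        # that keeps receiving requests is not compacted over and over.
        self.due_at[file] = now + self.delay

    def tick(self, now: float) -> list:
        # Compact only the files whose deadline has passed.
        ready = sorted(f for f, t in self.due_at.items() if t <= now)
        for f in ready:
            del self.due_at[f]
            self.compacted.append((now, f))
        return ready
```

With this sketch, a file requested at t=0 and again at t=10 is not compacted at t=15. It is compacted once, at t=25, so several requests collapse into a single compaction.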

@michaelklishin michaelklishin changed the title CQ: Defer shared store GC when removes were observed DO NOT MERGE CQ: Defer shared store GC when removes were observed Mar 7, 2024
@essen essen force-pushed the loic-cq-defer-gc-active-files branch from 961da12 to f586d3a Compare March 8, 2024 10:27
@lhoguin lhoguin changed the title DO NOT MERGE CQ: Defer shared store GC when removes were observed DO NOT MERGE CQ: Make CQ shared store compaction fast again Mar 8, 2024
@lhoguin lhoguin changed the title DO NOT MERGE CQ: Make CQ shared store compaction fast again DO NOT MERGE CQ: Make CQ shared store compaction fast Mar 8, 2024
@mergify mergify bot added the bazel label Mar 8, 2024
@essen essen force-pushed the loic-cq-defer-gc-active-files branch from b24763f to aaf76aa Compare March 11, 2024 12:00
@lhoguin lhoguin changed the title DO NOT MERGE CQ: Make CQ shared store compaction fast CQ: Make CQ shared store compaction fast Mar 11, 2024
@essen essen force-pushed the loic-cq-defer-gc-active-files branch from aaf76aa to 74b9811 Compare March 11, 2024 12:17
@lhoguin lhoguin marked this pull request as ready for review March 11, 2024 12:30
@lhoguin lhoguin requested a review from mkuratczyk March 11, 2024 12:30
@mkuratczyk
Contributor

Test results: https://grafana.lionhead.rabbitmq.com/goto/3Bgb8QASR?orgId=1

While there are other differences (some small performance regressions and some improvements), the most important one is that there are no more memory spikes like this:
[Screenshot 2024-03-11 at 15 59 14]

These spikes correlate with issue #10681 (at least in my testing), since they are caused by a backlog of compaction requests.

Other notable differences:

  1. The publishing throughput is significantly higher when consumers try to catch up on a backlog of messages. Previously they would immediately trigger a lot of compaction, which is now delayed, leaving more resources for publishers (the part of the graph where 3.13/blue drops to almost zero):
    [Screenshot 2024-03-11 at 16 04 57]

  2. On the other hand, because more compaction work is performed later, publisher throughput is lower at that point (the second part of the graph above).

@lhoguin
Contributor Author

lhoguin commented Mar 11, 2024

Thank you! Better compaction scheduling is something we can improve at a later time.

@lhoguin lhoguin merged commit 243064b into main Mar 11, 2024
19 checks passed
@lhoguin lhoguin deleted the loic-cq-defer-gc-active-files branch March 11, 2024 16:10
lhoguin added a commit that referenced this pull request Mar 11, 2024
CQ: Make shared store compaction fast (backport #10696)