Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add multi-level chunk cache #6249

Merged

Conversation

SungJin1212
Copy link
Contributor

@SungJin1212 SungJin1212 commented Oct 2, 2024

Support multi-level chunk cache like a multi-level cache for the index and add metrics for tracking multi-level cache behavior.

  • cortex_store_multilevel_chunks_cache_fetch_duration_seconds, tracks latency to fetch item
  • cortex_store_multilevel_chunks_cache_backfill_duration_seconds, tracks latency to backfill item
  • cortex_store_multilevel_chunks_cache_backfill_dropped_items_total, tracks # of dropped items due to buffer fullness when backfilling
  • cortex_store_multilevel_chunks_cache_store_dropped_items_total, tracks # of dropped items due to buffer fullness when storing

Add a multi level chunk cache
Which issue(s) this PR fixes:
Fixes #6240

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@SungJin1212 SungJin1212 force-pushed the Add-multilevel-chunk-cache branch 2 times, most recently from f89b7d2 to f92795d Compare October 2, 2024 23:42
@yeya24
Copy link
Contributor

yeya24 commented Oct 4, 2024

Can we fix tests?

@SungJin1212 SungJin1212 force-pushed the Add-multilevel-chunk-cache branch 5 times, most recently from 8663f64 to 0446dfe Compare October 4, 2024 11:35
@SungJin1212
Copy link
Contributor Author

@yeya24
I have fixed it.

@yeya24
Copy link
Contributor

yeya24 commented Oct 6, 2024

Hi @SungJin1212, I tested your PR locally but it got a panic when registering metrics.

My store gateway set up has metadata cache to use memcached and chunks cache to use multi level cache: inmemory as first level and memcached as second level.

I think the panic was caused by the new level label used by the chunks cache but the same metric in metadata cache doesn't have that. Prometheus registry expects the same label set for the same metric name.

panic: a previously registered descriptor with the same fully-qualified name as Desc{fqName: "thanos_cache_memcached_requests_total", help: "Total number of items requests to memcached.", constLabels: {component="store-gateway",name="metadata-cache"}, variableLabels: {}} has different label names or a different help string

goroutine 1 [running]:
.../github.com/prometheus/client_golang/prometheus.(*wrappingRegisterer).MustRegister(0xc001413d70, {0xc001285800?, 0x0?, 0x0?})

@SungJin1212
Copy link
Contributor Author

SungJin1212 commented Oct 6, 2024

@yeya24
Thank you for the review. I also reproduced it in my local. How about adding level label to metadata cache and just attach dummy label value like tenancy.DefaultTenant?

@SungJin1212 SungJin1212 force-pushed the Add-multilevel-chunk-cache branch from 0446dfe to fa78fa2 Compare October 6, 2024 04:38
@SungJin1212
Copy link
Contributor Author

@yeya24
I removed a level label to chunk cache.

Copy link
Contributor

@yeya24 yeya24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. The feature looks good. I have tried it in our test cluster and it saved quite a lot of Store Gateway and Chunks Cache bandwidth, especially when you have rules with a long lookback window.

pkg/storage/tsdb/multilevel_chunk_cache.go Outdated Show resolved Hide resolved
pkg/storage/tsdb/multilevel_chunk_cache.go Outdated Show resolved Hide resolved
@SungJin1212 SungJin1212 force-pushed the Add-multilevel-chunk-cache branch from fa78fa2 to 7add2ef Compare October 7, 2024 05:14
Copy link
Contributor

@yeya24 yeya24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Waiting for another approval before merging this.

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Oct 7, 2024
@yeya24
Copy link
Contributor

yeya24 commented Oct 8, 2024

We have a conflict seems aftering merging the previous PR

@alanprot
Copy link
Member

alanprot commented Oct 8, 2024

LGTM!

@SungJin1212 SungJin1212 force-pushed the Add-multilevel-chunk-cache branch from 7add2ef to 5cfcfd7 Compare October 8, 2024 22:37
@yeya24
Copy link
Contributor

yeya24 commented Oct 8, 2024

@SungJin1212 Need to fix lint

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
@SungJin1212 SungJin1212 force-pushed the Add-multilevel-chunk-cache branch from 5cfcfd7 to 7947e88 Compare October 8, 2024 22:57
@yeya24 yeya24 merged commit d08f93b into cortexproject:master Oct 9, 2024
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm This PR has been approved by a maintainer size/XL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Store Gateway: multi level chunks cache
3 participants