Add multi-level chunk cache #6249

SungJin1212 · 2024-10-02T23:33:47Z

Support a multi-level chunk cache, similar to the existing multi-level index cache, and add metrics for tracking multi-level cache behavior.

cortex_store_multilevel_chunks_cache_fetch_duration_seconds, tracks the latency of fetching items
cortex_store_multilevel_chunks_cache_backfill_duration_seconds, tracks the latency of backfilling items
cortex_store_multilevel_chunks_cache_backfill_dropped_items_total, tracks the number of items dropped because the buffer was full when backfilling
cortex_store_multilevel_chunks_cache_store_dropped_items_total, tracks the number of items dropped because the buffer was full when storing

Add a multi level chunk cache
Which issue(s) this PR fixes:
Fixes #6240

Checklist

Tests updated
Documentation added
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

yeya24 · 2024-10-04T06:26:39Z

Can we fix tests?

SungJin1212 · 2024-10-05T00:08:40Z

@yeya24
I have fixed it.

yeya24 · 2024-10-06T02:46:56Z

Hi @SungJin1212, I tested your PR locally, but it panicked when registering metrics.

My store gateway setup uses memcached for the metadata cache and a multi-level cache for the chunks cache: inmemory as the first level and memcached as the second level.

I think the panic was caused by the new level label used by the chunks cache, which the same metric in the metadata cache doesn't have. The Prometheus registry expects the same label set for the same metric name.

panic: a previously registered descriptor with the same fully-qualified name as Desc{fqName: "thanos_cache_memcached_requests_total", help: "Total number of items requests to memcached.", constLabels: {component="store-gateway",name="metadata-cache"}, variableLabels: {}} has different label names or a different help string

goroutine 1 [running]:
.../github.com/prometheus/client_golang/prometheus.(*wrappingRegisterer).MustRegister(0xc001413d70, {0xc001285800?, 0x0?, 0x0?})
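
The label-consistency rule behind this panic can be illustrated with a toy model. The sketch below uses only the Go standard library; toyRegistry and its Register method are hypothetical and are not the client_golang API. It models just one rule, that a metric name must always be registered with the same set of label names, and ignores everything else a real registry checks (help strings, exact duplicates).

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// toyRegistry is a hypothetical stand-in for a metrics registry.
// It remembers which label names each metric name was first registered with.
type toyRegistry struct {
	labelNames map[string]string // metric name -> sorted, comma-joined label names
}

func newToyRegistry() *toyRegistry {
	return &toyRegistry{labelNames: map[string]string{}}
}

// Register rejects a metric whose label names differ from an earlier
// registration under the same metric name.
func (r *toyRegistry) Register(name string, labels map[string]string) error {
	names := make([]string, 0, len(labels))
	for k := range labels {
		names = append(names, k)
	}
	sort.Strings(names)
	sig := strings.Join(names, ",")
	if prev, ok := r.labelNames[name]; ok && prev != sig {
		return fmt.Errorf("metric %q already registered with labels [%s], got [%s]", name, prev, sig)
	}
	r.labelNames[name] = sig
	return nil
}

func main() {
	reg := newToyRegistry()
	const metric = "thanos_cache_memcached_requests_total"

	// Metadata cache: no level label.
	fmt.Println(reg.Register(metric, map[string]string{
		"component": "store-gateway", "name": "metadata-cache",
	}))
	// Chunks cache with an extra level label: rejected in this model.
	fmt.Println(reg.Register(metric, map[string]string{
		"component": "store-gateway", "name": "chunks-cache", "level": "L1",
	}))
	// Chunks cache without the level label: accepted.
	fmt.Println(reg.Register(metric, map[string]string{
		"component": "store-gateway", "name": "chunks-cache",
	}))
}
```

Under this model there are two ways out: add the same label everywhere the metric is registered, or drop the extra label so both caches use the same label names.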

SungJin1212 · 2024-10-06T03:31:56Z

@yeya24
Thank you for the review. I also reproduced it locally. How about adding a level label to the metadata cache and attaching a dummy label value such as tenancy.DefaultTenant?

SungJin1212 · 2024-10-06T04:39:08Z

@yeya24
I removed the level label from the chunk cache.

yeya24

Thanks. The feature looks good. I have tried it in our test cluster and it saved quite a lot of Store Gateway and Chunks Cache bandwidth, especially when you have rules with a long lookback window.

pkg/storage/tsdb/multilevel_chunk_cache.go

yeya24

Thanks. Waiting for another approval before merging this.

yeya24 · 2024-10-08T16:59:42Z

It seems we have a conflict after merging the previous PR.

alanprot · 2024-10-08T21:31:56Z

LGTM!

yeya24 · 2024-10-08T22:51:27Z

@SungJin1212 Need to fix lint

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>

pull-request-size bot added the size/XL label Oct 2, 2024

SungJin1212 force-pushed the Add-multilevel-chunk-cache branch 2 times, most recently from f89b7d2 to f92795d, October 2, 2024 23:42

SungJin1212 force-pushed the Add-multilevel-chunk-cache branch 5 times, most recently from 8663f64 to 0446dfe, October 4, 2024 11:35

SungJin1212 force-pushed the Add-multilevel-chunk-cache branch from 0446dfe to fa78fa2, October 6, 2024 04:38

yeya24 reviewed Oct 7, 2024

SungJin1212 force-pushed the Add-multilevel-chunk-cache branch from fa78fa2 to 7add2ef, October 7, 2024 05:14

yeya24 approved these changes Oct 7, 2024

dosubot bot added the lgtm label (This PR has been approved by a maintainer) Oct 7, 2024

alanprot approved these changes Oct 8, 2024

SungJin1212 force-pushed the Add-multilevel-chunk-cache branch from 7add2ef to 5cfcfd7, October 8, 2024 22:37

Add multi-level chunk cache

7947e88

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>

SungJin1212 force-pushed the Add-multilevel-chunk-cache branch from 5cfcfd7 to 7947e88, October 8, 2024 22:57

yeya24 merged commit d08f93b into cortexproject:master Oct 9, 2024
16 checks passed

This was referenced Oct 14, 2024

Reduce default multilevel index cache max async concurrency to 3 #6265

Merged

Change all max async concurrency #6268

Merged
