Expanded Postings Cache can cache results without the nearly created series under high load. #6417
Merged: alanprot merged 3 commits into cortexproject:master from alanprot:expanded-postings-race on Dec 12, 2024
alanprot changed the title from "Head ExpandedPostingsCache: Querying and adding series concurrently can ca…" to "Expanded Postings Cache can cache results without the nearly created series under high load." on Dec 11, 2024
It seems the test is now failing due to prometheus/prometheus#15141.
alanprot force-pushed the expanded-postings-race branch from b167b32 to 59579cf (December 11, 2024 21:04)
…che wrong results Signed-off-by: alanprot <alanprot@gmail.com>
Signed-off-by: alanprot <alanprot@gmail.com>
alanprot force-pushed the expanded-postings-race branch from 59579cf to 0606da3 (December 11, 2024 21:52)
I created a new GH workflow step to run the test without the … I needed to do that because there is a known race in Prometheus that this test was also catching.
Signed-off-by: alanprot <alanprot@gmail.com>
alanprot force-pushed the expanded-postings-race branch from 0606da3 to 1cbcc8b (December 11, 2024 22:04)
yeya24 approved these changes on Dec 12, 2024
What this PR does:
#6296 introduced an experimental flag to cache expanded postings, including postings from the head block. In that PR, the TSDB PostCreation hook is used to "expire" cache keys when new series are added for a specific metric name.
The issue happens under high concurrency: a key may be expired before the new series are available for querying. A query that runs in that window repopulates the cache under the fresh key, so the cached results exclude the new series and stay that way until the key is expired again.
The root cause is that the PostCreation hook is invoked after the series is created but before it is added to memPostings:
Relevant code:
Moving the hook invocation to occur after line 1732 resolves the issue.
This PR includes a test that demonstrates the problem and implements a workaround that expires the series after the "commit". This workaround can be removed if the PR on Prometheus is accepted.
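One way to picture an "expire after the commit" workaround is a small helper that only records metric names when the creation hook fires and defers the actual expiry until the caller knows the commit has finished. This is a minimal sketch under that assumption, not the code from this PR: `deferredExpirer`, `recordCreated`, and `flushAfterCommit` are hypothetical names, and the `expire` callback stands in for whatever the cache uses to invalidate a metric name.

```go
package main

import (
	"fmt"
	"sync"
)

// deferredExpirer (hypothetical) collects the metric names of series created
// during an append and expires them only when flushAfterCommit is called.
type deferredExpirer struct {
	mu      sync.Mutex
	pending map[string]struct{}
	expire  func(metric string) // assumed cache invalidation callback
}

func newDeferredExpirer(expire func(metric string)) *deferredExpirer {
	return &deferredExpirer{pending: map[string]struct{}{}, expire: expire}
}

// recordCreated is meant to be called from the series creation hook. It does
// not touch the cache; it only remembers which metric name changed.
func (d *deferredExpirer) recordCreated(metric string) {
	d.mu.Lock()
	d.pending[metric] = struct{}{}
	d.mu.Unlock()
}

// flushAfterCommit is meant to be called once the commit has returned, when
// the new series are already queryable. Each pending name is expired once.
func (d *deferredExpirer) flushAfterCommit() {
	d.mu.Lock()
	names := d.pending
	d.pending = map[string]struct{}{}
	d.mu.Unlock()

	for name := range names {
		d.expire(name)
	}
}

func main() {
	expired := map[string]int{}
	d := newDeferredExpirer(func(metric string) { expired[metric]++ })

	d.recordCreated("http_requests_total")
	d.recordCreated("http_requests_total") // second new series, same metric name
	fmt.Println("expired before commit:", len(expired))

	d.flushAfterCommit()
	fmt.Println("expired after commit:", expired)
}
```

The cost of this design is that a key stays valid slightly longer than with an immediate hook, but any query that refills the cache after the flush reads an index that already contains the new series.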
cc @GiedriusS
Which issue(s) this PR fixes:
Fixes #
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]