Close block series client at the end to not reuse chunk buf #7915
Merged
MichaHoffmann
merged 2 commits into
thanos-io:main
from
yeya24:close-block-client-at-end
Nov 18, 2024
Signed-off-by: Ben Ye <benye@amazon.com>
Signed-off-by: Ben Ye <benye@amazon.com>
This probably also fixes #7883 (comment), since we revert the previous change and go back to always closing the block series reader.
@@ -1572,6 +1572,8 @@ func (s *BucketStore) Series(req *storepb.SeriesRequest, seriesSrv storepb.Store
tenant,
)

defer blockClient.Close()
I think that's essentially how LabelValues and LabelNames do it too right now!
MichaHoffmann
approved these changes
Nov 18, 2024
fpetkovski
approved these changes
Nov 18, 2024
2 approvals, only the doc check is failing, I'll merge; this should go into 0.37 rc
In Cortex we run query fuzz tests to make sure query results are compatible with the latest Cortex or Prometheus release.
I noticed a strange query failure when comparing Cortex query results with those of the latest Prometheus release, v2.55.1: https://github.com/cortexproject/cortex/actions/runs/11858364637/job/33061424720?pr=6340. The same block is loaded in the Cortex Store Gateway and in Prometheus. While trying to reproduce the issue locally, I found that occasionally (roughly 1 run in 1000) the Cortex query returns only 2 series instead of the expected 3.
I added some logs to print the chunk content of the missing series. It seems that the chunk buffer gets reused before the series response is sent over gRPC.
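To make the suspected failure mode concrete, here is a minimal sketch of a pooled buffer being overwritten while an earlier response still aliases it. Everything in it (`bufPool`, `loadChunk`, `demoEarlyRelease`) is a hypothetical stand-in, not Thanos code, and the real chunk pool behaves differently in detail.

```go
package main

import "fmt"

// bufPool is a hypothetical free list standing in for a chunk buffer pool.
type bufPool struct{ free [][]byte }

// get returns a pooled buffer if one is large enough, else allocates.
func (p *bufPool) get(n int) []byte {
	if len(p.free) > 0 {
		b := p.free[len(p.free)-1]
		p.free = p.free[:len(p.free)-1]
		if cap(b) >= n {
			return b[:n]
		}
	}
	return make([]byte, n)
}

// put hands a buffer back to the pool so a later get may reuse it.
func (p *bufPool) put(b []byte) { p.free = append(p.free, b) }

// loadChunk copies data into a pooled buffer and returns a slice that
// aliases the pooled memory (no defensive copy).
func loadChunk(p *bufPool, data string) []byte {
	b := p.get(len(data))
	copy(b, data)
	return b
}

// demoEarlyRelease releases a buffer before its "response" is consumed
// and returns what the pending response contains afterwards.
func demoEarlyRelease() string {
	p := &bufPool{}
	pending := loadChunk(p, "series-A") // response built, not yet sent
	p.put(pending)                      // reader closed too early
	_ = loadChunk(p, "series-B")        // reuses the same backing array
	return string(pending)              // content of the pending response
}

func main() {
	fmt.Println(demoEarlyRelease()) // prints "series-B"
}
```

The pending response silently changes content, which is consistent with a series going missing or being corrupted only when the timing lines up.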
The issue seems to come from https://github.com/thanos-io/thanos/pull/7821/files#diff-3e2896fafa6ff73509c77df2c4389b68828e02575bb4fb78b6c34bcfb922a7ceR3357: the block chunk reader is closed when the loser tree closes a given response series set. This doesn't guarantee that the block chunk reader is closed at the end of the Series call, because when a response series set is exhausted it is closed first at https://github.com/thanos-io/thanos/blob/main/pkg/losertree/tree.go#L66, and then the chunk buffer can be reused.
Changes
Since the original idea of #7821 was to fix an issue with the in-process store client, I reverted the code for the block series client. It now closes the block series client at the end of the Series function to make sure the chunk buffer is not reused before the function returns.
Verification
I don't have a unit test that reproduces this bug, but I re-ran the same test case I used to reproduce the fuzz test failure. Without this fix the test fails consistently: not on every run, but roughly 5 out of 10000 queries fail. With the fix it always succeeds.
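The idea behind the change, deferring the close so buffers return to the pool only once the whole call is done, can be sketched roughly as follows. This is an illustration under assumptions: `pool`, `client`, and `serve` are hypothetical names, and `serve` only mimics the shape of a Series-style call.

```go
package main

import "fmt"

// pool is a hypothetical free list standing in for a chunk buffer pool.
type pool struct{ free [][]byte }

func (p *pool) get(n int) []byte {
	for i, b := range p.free {
		if cap(b) >= n {
			p.free = append(p.free[:i], p.free[i+1:]...)
			return b[:n]
		}
	}
	return make([]byte, n)
}

func (p *pool) put(b []byte) { p.free = append(p.free, b) }

// client is a hypothetical stand-in for a block series client that
// owns the pooled buffers it hands out.
type client struct {
	p    *pool
	held [][]byte
}

// load fills a pooled buffer and remembers it for release on Close.
func (c *client) load(data string) []byte {
	b := c.p.get(len(data))
	copy(b, data)
	c.held = append(c.held, b)
	return b
}

// Close returns every buffer this client handed out to the pool.
func (c *client) Close() {
	for _, b := range c.held {
		c.p.put(b)
	}
	c.held = nil
}

// serve loads one chunk per block and then "sends" all of them.
// Each client is closed via defer, so no buffer is released until
// serve returns, after every response has been serialized.
func serve(p *pool, blocks []string) []string {
	var pending [][]byte
	for _, data := range blocks {
		c := &client{p: p}
		defer c.Close() // runs when serve returns, not per iteration
		pending = append(pending, c.load(data))
	}
	var sent []string
	for _, b := range pending {
		sent = append(sent, string(b)) // copy stands in for sending
	}
	return sent
}

func main() {
	fmt.Println(serve(&pool{}, []string{"series-A", "series-B", "series-C"}))
}
```

The trade-off in this sketch is that buffers stay checked out for the duration of the call, so peak memory may be higher than with eager release, in exchange for responses that cannot be overwritten mid-flight.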