Optimize away label=~".*" matchers in TSDB layer. #6996
Can we also tackle the case |
By returning empty set, and thus postings? Yes, I think we can. |
I did not add a new test for prometheus/tsdb/querier_test.go Lines 2076 to 2078 in fe802f2
|
tsdb/querier.go
Outdated
if m.Type == labels.MatchRegexp && (m.Value == ".*" || m.Value == "^.*$") && len(ms) > 1 {
// Ignore this matcher completely. This matches any value, including no value, so it's a no-op.
// It's safe to ignore, because there must be some matcher matching non-empty string.
// Some tests only use single label=~".*" matcher, and for those, we include the length condition.
It's not just about tests, it's about correctness. Can we fast path those?
You're right. I've modified the code to use all postings if we find this kind of matcher. We return all postings only if there are no other positive matchers.
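For readers following along, the approach described above could look roughly like the sketch below. It is not the code from this PR: `Matcher`, `isMatchAll`, `isPositive`, and `planMatchers` are hypothetical, simplified stand-ins, and "positive" is reduced here to equality and regexp matchers.

```go
package main

import "fmt"

// MatchType and Matcher are simplified, hypothetical stand-ins for the
// matcher types in the Prometheus labels package.
type MatchType int

const (
	MatchEqual MatchType = iota
	MatchNotEqual
	MatchRegexp
	MatchNotRegexp
)

type Matcher struct {
	Type  MatchType
	Name  string
	Value string
}

// isMatchAll reports whether m has the form label=~".*" (or the explicitly
// anchored "^.*$"), which this PR proposes to treat as a no-op.
func isMatchAll(m Matcher) bool {
	return m.Type == MatchRegexp && (m.Value == ".*" || m.Value == "^.*$")
}

// isPositive is a simplification: it treats equality and regexp matchers
// as the ones that select series rather than exclude them.
func isPositive(m Matcher) bool {
	return m.Type == MatchEqual || m.Type == MatchRegexp
}

// planMatchers drops match-all matchers and reports whether the lookup has
// to start from all postings, i.e. no positive matcher is left.
func planMatchers(ms []Matcher) (rest []Matcher, useAll bool) {
	useAll = true
	for _, m := range ms {
		if isMatchAll(m) {
			continue // No-op matcher: skip it entirely.
		}
		rest = append(rest, m)
		if isPositive(m) {
			useAll = false
		}
	}
	return rest, useAll
}

func main() {
	rest, useAll := planMatchers([]Matcher{
		{Type: MatchEqual, Name: "job", Value: "api"},
		{Type: MatchRegexp, Name: "instance", Value: ".*"},
	})
	fmt.Println(len(rest), useAll)
}
```

With a single `instance=~".*"` matcher, `planMatchers` returns no remaining matchers and `useAll == true`, which corresponds to the "all postings" case.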
Looks good from my side.
NOTE: The same thing would be nice to implement in Thanos - we don't use PostingsForMatchers.
We do it here: https://github.com/thanos-io/thanos/blob/2dc375b7a64c5f0d7da488b44664fef1d4865871/pkg/store/bucket.go#L1334
This still doesn't look right. Can you add to the tests we have for the other regex fast path?
LGTM. And as Brian said, some test cases for fast path and cases around that would be good.
We've had regressions before for things like this, so we really need a test. |
I've added more test cases covering the shortcuts, and verified that they work with old version of
I'm not quite sure how to approach this test. Any suggestions? |
See #6540 |
Do you suggest to move these shortcuts into a separate function to make it more testable? I'd prefer to avoid doing that, as it would complicate the code. Given that, I'm out of ideas how to test it in a way that you're suggesting. |
That's how we're doing existing tests of this sort of thing, so I hoped it'd provide some inspiration. I mainly care that it is tested, so any reasonable approach. |
I think I finally have a good way to test these shortcuts -- I simply check if returned posting is empty or "all", or expected label postings match. PTAL. ( |
Tests are failing in |
Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
Is there more I should do here, or does the test look good now? |
Thinking on this there, this optimisation isn't correct at all as |
That's a valid point. Since it is not very typical to have newlines in label values, and most people mean to match everything when they write But I assume that such discussion has already happened in Prometheus before, and the answer is No. |
It's even remotely plausible that somebody uses |
That would be a breaking change to PromQL, and thus not something that's doable in the current major version of Prometheus. We'd also have to change how regexes work across the other repos, and for the SNMP exporter at least, newlines in strings have come up. |
You're right. 😞 |
Thanks for your review and feedback! |
Should we have unit tests around this? |
Note: We have already got requests from users which use new lines in labels, so that is a thing. |
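A small, self-contained check of the newline concern raised in this thread, using Go's standard `regexp` package. The helper `matchesAnchored` is hypothetical; it assumes the pattern is fully anchored as `^(?:pattern)$`, in line with PromQL regex matches being fully anchored.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesAnchored reports whether value matches pattern once the pattern is
// fully anchored as ^(?:pattern)$. The anchoring form is an assumption made
// for this sketch.
func matchesAnchored(pattern, value string) bool {
	re := regexp.MustCompile("^(?:" + pattern + ")$")
	return re.MatchString(value)
}

func main() {
	for _, v := range []string{"", "foo", "foo\nbar"} {
		fmt.Printf("%q: .* -> %v, (?s).* -> %v\n",
			v, matchesAnchored(".*", v), matchesAnchored("(?s).*", v))
	}
}
```

In Go's regexp syntax, `.` does not match `\n` unless the `s` flag is set, so `.*` rejects a value such as "foo\nbar" while `(?s).*` accepts it. Dropping a `=~".*"` matcher therefore changes results for label values that contain newlines.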
This is an alternative to #6995, except that it removes these useless matchers in the TSDB layer.
Benchmarks in progress, will update the post when finished.
Update: well, the first benchmarks show a difference where you'd expect: when using multiple matchers, one of them being =~ ".*". I need to let them run longer, and they block my computer, so I'll do that later.