Query filtering in the ingester and storage #5629
LGTM. A few nit comments were added.
I found some minor issues but overall the code looks great to me.
Please let me know if you need any more inputs on any of my comments.
I've addressed nearly all the feedback. I still need some help from @sandeepsukhani to understand the filtering. I've added some test cases for the filtering and everything behaves as I expect, so I'm not sure what I'm missing. The tests are as follows: 2 filters:
Cases (not including timerange stuff):
Looking good. Let's make sure the allMatch is good.
All match is good and I've incorporated @sandeepsukhani's feedback.
Suggested some very minor nits but other than that it LGTM! Nice work!
This PR introduces the concept of a `FilteringPipeline`/`FilteringSampleExtractor` that takes zero or more `PipelineFilters`. `PipelineFilters` are just log pipelines that run before the Pipeline/Extractor created by the query. Any line for which a `PipelineFilter` returns `true` is omitted. The `FilteringPipeline` is then passed to the underlying iterators like any other pipeline.

Another notable change is that this PR changed the `Pipeline` interface to also accept timeranges. This is to accommodate deletes that only partially affect the queried timerange.

The original POC for this work had ChunkFilters, but it turns out they're not needed because a delete filter needs to match on labels anyway.
There's a potential future improvement where we use a chunk filter when chunks match a delete's labels and are totally within its timerange.
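
For reviewers who want a concrete picture of the behavior described above, here is a minimal, simplified sketch. It is not the code in this PR: the `Pipeline` signature, the `PipelineFunc` adapter, the `contains` helper, and the field names are all hypothetical, and the inclusive time bounds are an assumption. It only illustrates the idea that each filter applies within its own timerange and that a matching filter drops the line before the query's pipeline runs.

```go
package main

import (
	"fmt"
	"strings"
)

// Pipeline is a simplified, hypothetical stand-in for a log pipeline.
// It returns the (possibly rewritten) line and whether the line matched.
type Pipeline interface {
	Process(ts int64, line string) (string, bool)
}

// PipelineFunc adapts a plain function to the Pipeline interface.
type PipelineFunc func(ts int64, line string) (string, bool)

func (f PipelineFunc) Process(ts int64, line string) (string, bool) {
	return f(ts, line)
}

// PipelineFilter pairs a matching pipeline with the timerange it covers.
// Inclusive bounds are an assumption made for this sketch.
type PipelineFilter struct {
	Start, End int64
	Matches    Pipeline
}

// FilteringPipeline runs every filter before the query's own pipeline.
type FilteringPipeline struct {
	Filters []PipelineFilter
	Query   Pipeline
}

func (p FilteringPipeline) Process(ts int64, line string) (string, bool) {
	for _, f := range p.Filters {
		if ts < f.Start || ts > f.End {
			continue // this filter does not cover the entry's timestamp
		}
		if _, matched := f.Matches.Process(ts, line); matched {
			return "", false // a matching filter omits the line
		}
	}
	return p.Query.Process(ts, line)
}

// contains builds a toy pipeline that matches lines containing substr.
func contains(substr string) Pipeline {
	return PipelineFunc(func(_ int64, line string) (string, bool) {
		return line, strings.Contains(line, substr)
	})
}

func main() {
	p := FilteringPipeline{
		Filters: []PipelineFilter{{Start: 10, End: 20, Matches: contains("secret")}},
		Query:   contains("error"),
	}
	for _, ts := range []int64{5, 15} {
		_, kept := p.Process(ts, "error: secret leaked")
		fmt.Printf("ts=%d kept=%t\n", ts, kept)
	}
}
```

In this sketch the entry at `ts=5` falls outside the filter's timerange and is kept, while the same line at `ts=15` is dropped. With zero filters the wrapper behaves exactly like the query's pipeline.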
This PR will change querying behavior in installations with current delete requests, so we should wait until #5481 is merged because it contains a feature flag for delete behavior.
Checklist
- `CHANGELOG.md` entry about the changes.