-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add early termination support to BucketCollector #33279
Add early termination support to BucketCollector #33279
Conversation
This commit adds the support to early terminate the collection of a leaf in the aggregation framework. This change introduces a MultiBucketCollector which handles CollectionTerminatedException exactly like the Lucene MultiCollector. Any aggregator can now throw a CollectionTerminatedException without stopping the collection of a sibling aggregator. This is useful for aggregators that can infer their result without visiting all documents (e.g.: a min/max aggregation on a match_all query).
Pinging @elastic/es-search-aggs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks good but makes me wonder whether we should change NO_OP_COLLECTOR to throw a CollectionTerminatedException in getLeafCollector and remove the collector == NO_OP_COLLECTOR
special cases in MultiCollector?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually after discussion with @jimczi the current approach is safer. LGTM
This commit adds the support to early terminate the collection of a leaf in the aggregation framework. This change introduces a MultiBucketCollector which handles CollectionTerminatedException exactly like the Lucene MultiCollector. Any aggregator can now throw a CollectionTerminatedException without stopping the collection of a sibling aggregator. This is useful for aggregators that can infer their result without visiting all documents (e.g.: a min/max aggregation on a match_all query).
* master: (197 commits) Prevent NPE parsing the stop datafeed request. (elastic#33347) HLRC: Add ML get overall buckets API (elastic#33297) Core: Fix epoch millis java time formatter (elastic#33302) [Docs] Improve tuning for speed advice (elastic#33315) [Rollup] Fix Caps Comparator to handle calendar/fixed time (elastic#33336) [CI] Mute IndexShardTests#testIndexCheckOnStartup fails elastic#33345 [CI] Mute LuceneChangesSnapshotTests#testUpdateAndReadChangesConcurrently Security for _field_names field should not override field statistics (elastic#33261) Add early termination support to BucketCollector (elastic#33279) Fix extractjar task ci (elastic#33272) Mute testFollowIndexAndCloseNode Logging: Drop Settings from some logging ctors (elastic#33332) HLREST: add update by query API (elastic#32760) TEST: Increase timeout testFollowIndexAndCloseNode (elastic#33333) HLRC: ML Flush job (elastic#33187) HLRC: Adding ML Job stats (elastic#33183) LLREST: Drop deprecated methods (elastic#33223) Mute testSyncerOnClosingShard [DOCS] Moves machine learning APIs to docs folder (elastic#31118) Mute test watcher usage stats output ...
This commit adds the support to early terminate the collection of a leaf
in the aggregation framework. This change introduces a MultiBucketCollector which
handles CollectionTerminatedException exactly like the Lucene MultiCollector.
Any aggregator can now throw a CollectionTerminatedException without stopping
the collection of a sibling aggregator. This is useful for aggregators that
can infer their result without visiting all documents (e.g.: a min/max aggregation on a match_all query).