Add early termination support to BucketCollector #33279

Merged
merged 4 commits into from
Sep 3, 2018

@jimczi jimczi commented Aug 30, 2018

This commit adds support for early termination of leaf collection
in the aggregation framework. It introduces a MultiBucketCollector that
handles CollectionTerminatedException exactly like the Lucene MultiCollector.
Any aggregator can now throw a CollectionTerminatedException without stopping
the collection of a sibling aggregator. This is useful for aggregators that
can infer their result without visiting all documents (e.g. a min/max aggregation on a match_all query).
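
To make the behaviour described above concrete, here is a minimal, self-contained sketch of the pattern. It is not the code from this PR and does not use the real Lucene or Elasticsearch classes: `CollectionTerminatedException`, `LeafCollector` and `MultiLeafCollector` below are simplified local stand-ins, and `EarlyTerminationSketch` and `run` are hypothetical names chosen for illustration. The idea is that a wrapper catches the exception from one child, stops forwarding documents to that child only, and rethrows once every child has terminated.

```java
import java.util.ArrayList;
import java.util.List;

class EarlyTerminationSketch {

    /** Local stand-in for Lucene's CollectionTerminatedException (assumption, not the real class). */
    static class CollectionTerminatedException extends RuntimeException {}

    /** Simplified stand-in for a per-segment leaf collector. */
    interface LeafCollector {
        void collect(int doc);
    }

    /** Forwards each doc to all children that are still active. */
    static class MultiLeafCollector implements LeafCollector {
        private final List<LeafCollector> active;

        MultiLeafCollector(List<LeafCollector> collectors) {
            this.active = new ArrayList<>(collectors);
        }

        @Override
        public void collect(int doc) {
            // A child that throws is dropped; its siblings keep collecting.
            active.removeIf(child -> {
                try {
                    child.collect(doc);
                    return false;
                } catch (CollectionTerminatedException e) {
                    return true;
                }
            });
            // Only when no child is left does the whole leaf terminate.
            if (active.isEmpty()) {
                throw new CollectionTerminatedException();
            }
        }
    }

    /** Returns {calls seen by the early-terminating child, calls seen by the counting child}. */
    static int[] run(int numDocs) {
        int[] minCalls = {0};
        int[] countCalls = {0};
        // Pretends the first doc is enough to know the answer (e.g. a min on sorted data).
        LeafCollector min = doc -> {
            minCalls[0]++;
            throw new CollectionTerminatedException();
        };
        // Needs to see every doc.
        LeafCollector count = doc -> countCalls[0]++;

        LeafCollector multi = new MultiLeafCollector(List.of(min, count));
        try {
            for (int doc = 0; doc < numDocs; doc++) {
                multi.collect(doc);
            }
        } catch (CollectionTerminatedException e) {
            // Every child terminated early; nothing left to collect for this leaf.
        }
        return new int[] {minCalls[0], countCalls[0]};
    }

    public static void main(String[] args) {
        int[] calls = run(5);
        System.out.println("min calls=" + calls[0] + ", count calls=" + calls[1]);
    }
}
```

In this sketch the early-terminating child is visited once while its sibling still sees all five documents, which is the property the description relies on.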

@elasticmachine
Collaborator

Pinging @elastic/es-search-aggs

Contributor

@jpountz jpountz left a comment

It looks good but makes me wonder whether we should change NO_OP_COLLECTOR to throw a CollectionTerminatedException in getLeafCollector and remove the collector == NO_OP_COLLECTOR special cases in MultiCollector?

Contributor

@jpountz jpountz left a comment

Actually after discussion with @jimczi the current approach is safer. LGTM

@jimczi jimczi merged commit 713c07e into elastic:master Sep 3, 2018
@jimczi jimczi deleted the bucket_collector_early_termination branch September 3, 2018 07:34
jimczi added a commit that referenced this pull request Sep 3, 2018
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Sep 3, 2018
* master: (197 commits)
  Prevent NPE parsing the stop datafeed request. (elastic#33347)
  HLRC: Add ML get overall buckets API (elastic#33297)
  Core: Fix epoch millis java time formatter (elastic#33302)
  [Docs] Improve tuning for speed advice (elastic#33315)
  [Rollup] Fix Caps Comparator to handle calendar/fixed time (elastic#33336)
  [CI] Mute  IndexShardTests#testIndexCheckOnStartup fails elastic#33345
  [CI] Mute LuceneChangesSnapshotTests#testUpdateAndReadChangesConcurrently
  Security for _field_names field should not override field statistics (elastic#33261)
  Add early termination support to BucketCollector (elastic#33279)
  Fix extractjar task  ci  (elastic#33272)
  Mute testFollowIndexAndCloseNode
  Logging: Drop Settings from some logging ctors (elastic#33332)
  HLREST: add update by query API (elastic#32760)
  TEST: Increase timeout testFollowIndexAndCloseNode (elastic#33333)
  HLRC: ML Flush job (elastic#33187)
  HLRC: Adding ML Job stats (elastic#33183)
  LLREST: Drop deprecated methods (elastic#33223)
  Mute testSyncerOnClosingShard
  [DOCS] Moves machine learning APIs to docs folder (elastic#31118)
  Mute test watcher usage stats output
  ...