Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Star Tree] [Search] Keyword & Numeric Terms Aggregation #17165

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

sandeshkr419
Copy link
Contributor

@sandeshkr419 sandeshkr419 commented Jan 28, 2025

Description

  • Adds support for keyword & numeric terms aggregation.
  • Aggregations are supported with or without metric sub-aggregations
  • Aggregations are supported with previously supported queries for filtering (no change in support of adding up queries to requests)
  • Fix for timestamp field to be fetched from request/valueSource instead of hard-coded value (this change is already merged Fix date hardcoding in date aggregator #17239)

Related Issues

Resolves #16551

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added enhancement Enhancement or improvement to existing feature or request Search:Aggregations labels Jan 28, 2025
@sandeshkr419 sandeshkr419 added backport 2.x Backport to 2.x branch v2.19.0 Issues and PRs related to version 2.19.0 and removed enhancement Enhancement or improvement to existing feature or request Search:Aggregations labels Jan 28, 2025
@sandeshkr419
Copy link
Contributor Author

@msfroh @bharath-techie Rebased from main again since #17239 is merged.

) throws IOException {
assert parent == null;
StarTreeValues starTreeValues = StarTreeQueryHelper.getStarTreeValues(ctx, starTree);
return new StarTreeBucketCollector(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty similar to the one in GlobalOrdinalsStringTermsAggregator. Is there opportunity to pull the common logic into an abstract base class?

Copy link
Contributor Author

@sandeshkr419 sandeshkr419 Feb 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talking in terms of Date Histogram, Keyword Terms, Numeric Terms, Range Aggregations collectively.

Actually it was possible to refactor it to a common utility. However, there were subtle differences in the aggregators on which it was implemented:

  1. valuesIterator being instanceof SortedSetStarTreeValuesIterator or SortedNumericStarTreeValuesIterator in different aggregators. This can be hidden behind a common abstract class but still isn't very neat with later things.
  2. Different valuesIterators required different handling - date being passed to rounding utilities, for range being searched in different ranges, keyword values being passed to globalOperators, numeric terms processed with correct data type conversions (as per collectionStrategy used.)
  3. Bucket collection strategy differing in range aggregation

Based in the differences in all the aggregations, it will be possible to abstract it out by using multiple generics and consumers (or relevant biconsumers, triconusmers) - however, it will make the code highly unreadable given the code logic is similar but not same.

Instead, I'd abstract out certain small pieces, like getDocCountIterator(), updateBucket() utilities instead. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Abstracted out common utilities. Code looks cleaner now - check it out!

Comment on lines +294 to +298
// don't defer when StarTreeContext is set, don't defer when collectMode == SubAggCollectionMode.BREADTH_FIRST
// this boolean condition can be further simplified but affects readability.
return (context.getQueryShardContext().getStarTreeQueryContext() == null || collectMode != SubAggCollectionMode.BREADTH_FIRST)
&& collectMode == SubAggCollectionMode.BREADTH_FIRST
&& !aggsUsedForSorting.contains(aggregator);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be:

if (context.getQueryShardContext().getStarTreeQueryContext() == null) {
  return false;
} else {
  return collectMode == SubAggCollectionMode.BREADTH_FIRST && !aggsUsedForSorting.contains(aggregator);
}

That is, make it even more complicated for readability. 😁

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So basically we do not want to set up deferred collector when:
(context.getQueryShardContext().getStarTreeQueryContext() != null && collectMode == SubAggCollectionMode.BREADTH_FIRST).

So the queries we were resolving via star-tree resolved to BREADTH_FIRST mode only, but just to tighten up the criteria and not accidentally resolve DEPTH_FIRST mode, I just negated the above criteria leading to (context.getQueryShardContext().getStarTreeQueryContext() == null || collectMode != SubAggCollectionMode.BREADTH_FIRST).

This can be an alternative to keep the logic intact:

if (context.getQueryShardContext().getStarTreeQueryContext() == null) {
  // this will be the non-tree criteria
  return collectMode == SubAggCollectionMode.BREADTH_FIRST && !aggsUsedForSorting.contains(aggregator);
} else {
  // this will be star-tree criteria - return false (don't defer) for BREADTH_FIRST
  return collectMode != SubAggCollectionMode.BREADTH_FIRST;
}

@bharath-techie
Copy link
Contributor

@sandeshkr419 can we revisit #17165 (comment)
and #17165 (comment) .
I think we can enable numeric terms count without any benchmarks even since there are no optimizations, we can try global aggs term count after benchmarks maybe.

Copy link
Contributor

❌ Gradle check result for c8cf571: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>
Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>
@sandeshkr419
Copy link
Contributor Author

sandeshkr419 commented Feb 17, 2025

Following up on #17165 (comment)

@msfroh @GBH I have added support for keyword/numeric terms aggregation without any sub-aggregations as well.

Keyword Aggregations

(non) Low Cardinality Case

To validate the performance, I indexed clientip as keyword field in the last of ordered dimensions to basically force a worse performance as my hunch was that the term frequency optimization in keyword terms would anytime outperform star-tree traversal, but upon benchmarking and profiling, I saw contradictory results. The terms dictionary lookup wasn't faster than star-tree traversal and the performance of star-tree (~150ms) was 2x better than term frequency optimization (~300 ms).
The non-optimized default flow was worst as expected (~400ms).

Based on this experimentation, I have moved up the star-tree pre-computation on higher priority than term frequency optimization. If the high cardinality field is at top in the ordered dimensions, star-tree will perform better than the current numbers, but as part of this experiment I wanted to see the worse performing scenario, which still outperforms term frequency optimization.

Priority for High Cardinality Cases:

Star Tree Optimization > Terms Enum Optimization > Default LeafCollector/collect() based aggregation

Low Cardinality Case

To validate the performance, I indexed status as keyword field in the first of ordered dimensions to get the best performance out of star-tree. To the surprise, term frequency optimization (~15ms) out-performed here over star-tree best performance (~60ms).

For aggregations with any terms queries, the term frequency optimization does not kicks in and so star-tree performs better.

Priority for Low Cardinality Cases:

Terms Enum Optimization > Star Tree Optimization > Default LeafCollector/collect() based aggregation

Numeric Aggregations

This was a simple 10-12x improvement when using star-tree(~65ms) compared to default flow (~450ms), and since there was no other optimization in place, I simply added this case handling.

Benchmarking Setup

Indexed http_logs in Mac (M1 Pro chip, 32 GB RAM), with 8G configured heap.
Ordered dimensions in star-tree: status(numeric), size (numeric), clientip (keyword), @timestamp (date)

The performance numbers in comparison are of 99th percentile service time.

Note: Terms Enum Optimization talked about in keyword terms is only applicable for cases when the query is a match all with no deleted documents in segments.

Copy link
Contributor

❌ Gradle check result for 7a0a94a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❕ Gradle check result for 7a0a94a: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Signed-off-by: Sandesh Kumar <sandeshkr419@gmail.com>
Copy link
Contributor

❌ Gradle check result for f5509c5: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for f5509c5: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for f5509c5: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch enhancement Enhancement or improvement to existing feature or request Search:Aggregations v2.19.0 Issues and PRs related to version 2.19.0
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

[Star Tree] [Search] Bucket terms Aggregation with metric aggregation
3 participants