-
Notifications
You must be signed in to change notification settings - Fork 24.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ML: Add support for single bucket aggs in Datafeeds #37544
ML: Add support for single bucket aggs in Datafeeds #37544
Conversation
Pinging @elastic/ml-core |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Who would have thought it would all be so simple. Recursion FTW.
It would be nice to test some different cases is it possible to add some unit test coverage? AggregationToJsonProcessorTests
uses a lot of mocking and I suspect the line .filter(MultiBucketsAggregation.class::isInstance)
will fail with mocked objects so it may not be easy.
cc @richcollier This change enables filter aggregations in datafeeds see the linked issue and discuss thread for details. I wonder where we should document this, perhaps add it to the examples here |
} else { | ||
leafAggregations.add(agg); | ||
} | ||
} | ||
|
||
if (bucketAggregations.size() > 1) { | ||
// If on the current level (indicated via bucketAggregations) or on of the next levels (singleBucketAggregations) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: "or on of the next levels" needs correction
// Non-bucketed sub-aggregations were handle as leaf aggregations at this level | ||
for (SingleBucketAggregation singleBucketAggregation : singleBucketAggregations) { | ||
processAggs(singleBucketAggregation.getDocCount(), | ||
asList(singleBucketAggregation.getAggregations()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: don't need the asList
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do, processAggs
takes a long, List<Aggregation>
where as getAggregations()
returns an Aggregations
type.
...va/org/elasticsearch/xpack/ml/datafeed/extractor/aggregation/AggregationToJsonProcessor.java
Show resolved
Hide resolved
run docbldesx |
// a bucketed agg we should treat it like a leaf in this bucket | ||
SingleBucketAggregation singleBucketAggregation = (SingleBucketAggregation)agg; | ||
for (Aggregation subAgg : singleBucketAggregation.getAggregations()) { | ||
if (subAgg instanceof MultiBucketsAggregation || subAgg instanceof SingleBucketAggregation) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if the test should be subAgg instanceof HasAggregations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davidkyle I wish it could be :( but MultiBucketsAggregation
does not implement that interface. Its Buckets
class does though :(. No way to get to Buckets
without first casting, which requires an instanceof
check anyways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for adding the example to the docs
Single bucket aggs are now supported in datafeed aggregation configurations.
* elastic/master: (104 commits) Permission for restricted indices (elastic#37577) Remove Watcher Account "unsecure" settings (elastic#36736) Add cache cleaning task for ML snapshot (elastic#37505) Update jdk used by the docker builds (elastic#37621) Remove an unused constant in PutMappingRequest. Update get users to allow unknown fields (elastic#37593) Do not add index event listener if CCR disabled (elastic#37432) Add local session timeouts to leader node (elastic#37438) Add some deprecation optimizations (elastic#37597) refactor inner geogrid classes to own class files (elastic#37596) Remove obsolete deprecation checks (elastic#37510) ML: Add support for single bucket aggs in Datafeeds (elastic#37544) ML: creating ML State write alias and pointing writes there (elastic#37483) Deprecate types in the put mapping API. (elastic#37280) [ILM] Add unfollow action (elastic#36970) Packaging: Update marker used to allow ELASTIC_PASSWORD (elastic#37243) Fix setting openldap realm ssl config Document the need for JAVA11_HOME (elastic#37589) SQL: fix object extraction from sources (elastic#37502) Nit in settings.gradle for Eclipse ...
Adds support for Single Bucket aggregations in DatafeedConfig
Restrictions:
processAggs
that the current level, and the next levels (as dictated bySingleBucketAggregation
have a max of 1MultiBucketAggregation
SingleBucketAggregation
as the true leaf aggregation will be a sub aggregation.closes #36838