Currently, the filter rewrite optimization does not support sub aggregations; only a single date histogram is supported.
This makes the applicable scenarios very limited. It would be great if we could find a way to support sub aggregations while still applying the filter rewrite optimization.
I noticed one possible path when previously applying the optimization to composite aggregation. There is an established pattern for deferring sub aggregation collection. The idea is to do the aggregation collection in two passes: the first pass gets the docIdSets per bucket, and the second pass runs the sub aggregation collection on those docIdSets, bucket by bucket.
            // aggregations should only be replayed on matching documents
            assert scorerIt.docID() == docID;
        }
        collector.collect(docID);
    }
}
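The two-pass idea can be sketched in isolation. This is a minimal, hypothetical illustration and not OpenSearch or Lucene code: the class `TwoPassSubAggSketch`, its methods, and the sorted arrays standing in for an index structure are all assumptions made for the example. Pass 1 finds each bucket's matching docs by binary search over timestamps sorted in index order, rather than visiting every document, and stores them as a doc ID set. Pass 2 replays each set into a toy sub aggregation (a sum).

```java
import java.util.BitSet;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the two-pass approach; names are illustrative only.
final class TwoPassSubAggSketch {

    // Pass 1: build one doc ID set per bucket.
    // sortedTimestamps[i] belongs to document docIdsByTimestamp[i]; together they
    // are a toy stand-in for an index structure sorted by timestamp.
    static Map<Long, BitSet> collectDocIdSets(long[] sortedTimestamps, int[] docIdsByTimestamp, long interval) {
        Map<Long, BitSet> docIdSets = new TreeMap<>();
        if (sortedTimestamps.length == 0) {
            return docIdSets;
        }
        long first = Math.floorDiv(sortedTimestamps[0], interval) * interval;
        long last = sortedTimestamps[sortedTimestamps.length - 1];
        for (long lo = first; lo <= last; lo += interval) {
            // Locate the bucket's range [lo, lo + interval) without scanning all docs.
            int from = lowerBound(sortedTimestamps, lo);
            int to = lowerBound(sortedTimestamps, lo + interval);
            if (from == to) {
                continue; // empty bucket, nothing to replay later
            }
            BitSet docs = new BitSet();
            for (int i = from; i < to; i++) {
                docs.set(docIdsByTimestamp[i]);
            }
            docIdSets.put(lo, docs);
        }
        return docIdSets;
    }

    // Returns the first index whose value is >= key.
    static int lowerBound(long[] a, long key) {
        int lo = 0;
        int hi = a.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (a[mid] < key) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Pass 2: replay each bucket's doc IDs into a sub aggregation (here, a sum).
    static Map<Long, Double> replaySum(Map<Long, BitSet> docIdSets, double[] valueByDocId) {
        Map<Long, Double> sums = new TreeMap<>();
        for (Map.Entry<Long, BitSet> e : docIdSets.entrySet()) {
            BitSet docs = e.getValue();
            double sum = 0;
            for (int docId = docs.nextSetBit(0); docId >= 0; docId = docs.nextSetBit(docId + 1)) {
                sum += valueByDocId[docId];
            }
            sums.put(e.getKey(), sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        long[] timestamps = {0, 5, 12, 18, 31};   // sorted, as an index would store them
        int[] docIds = {3, 0, 4, 1, 2};           // doc ID for each timestamp entry
        double[] values = {1.0, 2.0, 3.0, 4.0, 5.0};
        Map<Long, BitSet> sets = collectDocIdSets(timestamps, docIds, 10);
        System.out.println(replaySum(sets, values)); // prints {0=5.0, 10=7.0, 30=3.0}
    }
}
```

In this sketch the sub aggregation only ever sees documents that already matched a bucket, which mirrors the "replayed on matching documents" comment in the excerpt above.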
In theory, the performance improvement still comes from using the index structure, rather than iterating over documents, to find the matching docs to collect at the date histogram level. Sub aggregation collection on those matching docs is expected to run at the same speed as before. There will also be some memory cost from holding the docIdSets until the second pass completes.
In the end, we expect a performance improvement on these 2 operations from the big5 workload, both of which have sub aggregations.
Some other issues will also improve sub aggregation performance, but they come from the indexing side: they compute special index structures to speed up sub aggregations, whereas this approach focuses on query-time improvement. See #3734 and #12498.
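To get a feel for the memory cost, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that every bucket keeps one dense bitset with one bit per document in the segment; the actual doc ID set representation may be sparser and cheaper, so treat this as an upper-bound style estimate. The class and method names are hypothetical.

```java
// Hypothetical estimate; assumes one dense bitset (1 bit per doc) per bucket.
final class DocIdSetMemoryEstimate {

    // Bytes needed to hold `buckets` dense bitsets over a segment of `maxDoc` docs.
    static long denseBitsetBytes(int maxDoc, int buckets) {
        long wordsPerSet = ((long) maxDoc + 63) / 64; // 64-bit words, rounded up
        return wordsPerSet * Long.BYTES * buckets;
    }

    public static void main(String[] args) {
        // 1,000,000 docs in a segment and 100 buckets
        System.out.println(denseBitsetBytes(1_000_000, 100)); // prints 12500000
    }
}
```

Under that assumption, 100 buckets over a one-million-doc segment would hold about 12.5 MB until the second pass finishes.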
getsaurabh02 changed the title from "Support sub aggregation in filter rewrite optimization" to "[Proposal] Support sub aggregation in filter rewrite optimization" on Sep 6, 2024.
Follow-up task of #9310.
Referenced code: OpenSearch/server/src/main/java/org/opensearch/search/aggregations/bucket/composite/CompositeAggregator.java, lines 648 to 673 at commit 246557c.