Explore option of supporting more flexible search types #12316
Comments
#10217 will be required before we do this: the decision on how many phases are needed has to be made on the coordinating node, so the query must be parsed there first. This could also get very complex. Term count accuracy would require re-running the parent aggregations in the accuracy round, so that the terms aggregation works on the right context (the right documents), and the sub-aggregations would have to run in the accuracy round (not the initial round) to produce the right values. It gets even more complex if multiple terms aggregations are nested, all with accuracy set to true.
We're seeing the same problem mentioned in #1305, which was closed because facets were deprecated, except that we're using terms aggregations. We have a fairly complex setup with multiple shards and replicas per index, and the field being aggregated is in a nested document. When we run the terms aggregation we often see buckets with wrong counts, or no buckets returned at all. If we change the terms aggregation to a filter aggregation looking for a specific value in the nested document that should produce a bucket, we get hits returned. Note that we're not looking for "top X" buckets; we return all buckets and want an accurate count. I believe our queries were fine until a couple of weeks ago, so perhaps there's a shard/routing/etc. setting that causes this? Otherwise, please add my +1 to the request for a parameter to force accurate results, even though execution would be slower.
@clintongormley do you think this could now be closed since we have the composite aggregation?
@colings86 these changes are all about the top-n results, which you can't get with the composite agg without retrieving all results. I think these requests are still valid.
@elastic/es-search-aggs
This is a rather old issue that has had no activity in a long while. There are no concrete plans to work on addressing it at this time, hence I am closing it.
Today we have query_then_fetch and query_and_fetch. This imposes a limit on the types of search functionality we can support. For instance, if you want to auto-adjust the bucket interval so that your documents fit neatly into 10 buckets, you first need to determine the min and max values in order to calculate the correct interval (e.g. see #9572 and #9531). This requires two round trips:
interval
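To make the two-round-trip idea concrete, here is a minimal Python sketch of it. It simulates shards as plain lists of numbers and is not Elasticsearch code: the function name `two_phase_histogram`, the `target_buckets` parameter, and the choice to clamp the maximum value into the last bucket are all assumptions made for illustration.

```python
def two_phase_histogram(shards, target_buckets=10):
    """Simulate an auto-interval histogram over several shards.

    `shards` is a list of lists of numbers (a stand-in for per-shard data).
    Assumes at least one shard holds at least one value.
    """
    # Round trip 1: every shard reports its local min and max,
    # and the coordinating node reduces them to global values.
    lo = min(min(s) for s in shards if s)
    hi = max(max(s) for s in shards if s)

    # The coordinating node derives the interval; fall back to 1.0
    # when all values are identical to avoid dividing by zero later.
    interval = (hi - lo) / target_buckets or 1.0

    # Round trip 2: every shard buckets its values with the shared
    # interval, and the coordinating node sums the per-bucket counts.
    counts = [0] * target_buckets
    for shard in shards:
        for value in shard:
            # Clamp so the global maximum lands in the last bucket
            # (an assumption; other edge policies are possible).
            index = min(int((value - lo) / interval), target_buckets - 1)
            counts[index] += 1
    return lo, interval, counts
```

Calling `two_phase_histogram([[0, 5, 10], [20, 55], [99, 100]])` derives an interval of 10.0 from the global range 0 to 100 before any bucketing happens, which is the step a single query phase cannot perform.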
Or to improve term count accuracy in a terms agg, you could:
Or to guarantee that you get the top 10 terms overall:
10th_count
10th_count/num_shards
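The individual steps of this list did not survive on this page, so the following Python sketch is only one possible reading of the surviving `10th_count` and `10th_count/num_shards` fragments. It assumes that the coordinating node first merges per-shard top-k lists, takes the k-th merged count as a lower bound, then asks every shard for all terms whose local count reaches that bound divided by the number of shards, and finally collects exact counts for those candidates. Shards are simulated with `collections.Counter`; `exact_top_terms` is a hypothetical name, not an Elasticsearch API.

```python
from collections import Counter


def exact_top_terms(shards, k=10):
    """Return the exact global top-k (term, count) pairs.

    `shards` is a non-empty list of Counter objects mapping term -> local count.
    """
    # Phase 1: merge each shard's local top-k into partial global counts.
    partial = Counter()
    for shard in shards:
        for term, count in shard.most_common(k):
            partial[term] += count

    # The k-th largest partial count is a lower bound on the true k-th count.
    ranked = sorted(partial.values(), reverse=True)
    kth_count = ranked[k - 1] if len(ranked) >= k else 0

    # A term with a global count of at least kth_count must reach
    # kth_count / num_shards on at least one shard.
    threshold = kth_count / len(shards)

    # Phase 2: every shard returns all terms at or above the threshold.
    candidates = set()
    for shard in shards:
        candidates.update(t for t, c in shard.items() if c >= threshold)

    # Phase 3 (assumed): fetch exact counts for every candidate from all shards.
    totals = {t: sum(shard[t] for shard in shards) for t in candidates}

    # Break ties by term so the result is deterministic.
    return sorted(totals.items(), key=lambda tc: (-tc[1], tc[0]))[:k]
```

With three shards that each hold a different pair of leading terms plus a shared term `c` in third place, a plain merge of the per-shard top 2 never sees `c`, while this sketch returns it first because the threshold round pulls it in as a candidate.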
Multiple search phases would also help with clustering algorithms