Multiple levels of field collapse #24855
Comments
Just a quick note that I came across this requirement myself and decided to approach it with terms+terms+top_hits. With 7000 documents returned by the query before aggregations, I found that query time tripled when I added the final top_hits aggregation (terms+top_hits was fine, terms+terms was fine, terms+terms+top_hits was awful). I raised this as a forum post here: https://discuss.elastic.co/t/top-hits-performance-inside-2-levels-of-terms-aggregations/92266 but it's perhaps useful to see it here too.
+1. We are also using terms->terms->top_hits to solve a problem where multiple levels of collapse with pagination would be very useful. The question is whether, performance-wise, a parent->child structure would solve this better than N-level nesting for collapse.
@elastic/es-search-aggs
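For anyone following along, here is a rough sketch of the terms+terms+top_hits request body both comments above refer to. The helper name, the aggregation names (`level_1`, `level_2`, `top_doc`) and the `country`/`city` fields are placeholders for illustration, not something taken from either commenter's setup.

```python
import json


def two_level_top_hits(first_field, second_field, query, size=10):
    """Build a terms -> terms -> top_hits search body (hypothetical helper)."""
    return {
        "size": 0,  # only the aggregation buckets are wanted, not top-level hits
        "query": query,
        "aggs": {
            "level_1": {
                "terms": {"field": first_field, "size": size},
                "aggs": {
                    "level_2": {
                        "terms": {"field": second_field, "size": size},
                        # top_hits with size 1 keeps the best document per inner bucket
                        "aggs": {"top_doc": {"top_hits": {"size": 1}}},
                    }
                },
            }
        },
    }


body = two_level_top_hits("country", "city", {"match": {"address": "victoria"}})
print(json.dumps(body, indent=2))
```

The printed body can be sent to `_search` as-is. Note that the number of `top_hits` collectors grows with the product of the two `size` values, which may be part of why the three-level combination is so much slower than either two-level one.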
Introduce collapsing on multiple fields
The `field` field in the `collapse` request, in addition to taking a string, can take an array of fields on which to collapse.
Limitation: all fields in the field collapsing request must be of the same type, either all keyword or all numeric.
Example request:
```json
{
  "query": { "match": { "address": "victoria" } },
  "collapse": { "field": ["country", "city"] }
}
```
Example response:
```json
{
  ...
  "hits": [
    { ... "fields": { "country": ["Canada"], "city": ["Saskatoon"] } },
    { ..., "fields": { "country": ["Canada"], "city": ["Toronto"] } },
    { ..., "fields": { "country": ["UK"], "city": ["London"] } }
  ]
}
```
Breaking changes: the internal format between nodes for TopDocs for a collapsing request has been changed.
TODO:
1. Limit the number of fields for multiple collapsing
2. Return a 4xx instead of a 5xx for field types on which collapsing can't be done (all types except keyword or numeric)
Closes #24855
* Put second level collapse under inner_hits Closes #24855
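A sketch of what "second level collapse under inner_hits" looks like as a request body, assuming the nested `collapse` sits inside the `inner_hits` object of the outer `collapse`. The helper name, the `inner_hits` name and the `country`/`city` fields are placeholders; check the exact shape against the collapse docs for your version.

```python
import json


def second_level_collapse(first_field, second_field, inner_size=3):
    """Build a collapse clause with a nested collapse under inner_hits (hypothetical helper)."""
    return {
        "collapse": {
            "field": first_field,  # first level: one top hit per value of this field
            "inner_hits": {
                "name": "by_" + second_field,
                "size": inner_size,  # how many second-level groups to return
                "collapse": {"field": second_field},  # second level of collapsing
            },
        }
    }


body = {"query": {"match": {"address": "victoria"}}}
body.update(second_level_collapse("country", "city"))
print(json.dumps(body, indent=2))
```

Unlike the array form of `field`, this keeps a single collapse key at the top level, so pagination over the first level is unaffected and the second level is bounded by the `inner_hits` `size`.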
Wondering why we gave up on the approach introduced in #31557? Are there plans to support something like that in the near future?
Describe the feature:
Currently, we support 1 level of field collapse + inner hits (https://www.elastic.co/guide/en/elasticsearch/reference/5.4/search-request-collapse.html#_expand_collapse_results). However, some users need to collapse multiple levels (while still providing features provided by field collapse, e.g. pagination), e.g. to simply provide the top item for 2 tiers of collapsing. For those users, they have a few options:
inner_hits
size
and then reduce the set client-side
When option 1 is not viable for business reasons, this leaves you with options 2 and 3. Option 2 is a fairly abusive and expensive way to solve it when you only want the top result under each top-level collapse. Option 3 adds a lot of client code as well as round trips (which increase overall query latency).
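To make the cost of reducing the set client-side concrete, here is a minimal sketch of that reduction. It assumes a one-level collapse response whose `inner_hits` are already sorted best-first and whose second-level value is a scalar in `_source`; the function name, the `inner_hits` name `by_country` and the sample data are invented for illustration.

```python
def top_per_subgroup(response, inner_name, second_field):
    """Keep only the first (best) inner hit per second-level value (hypothetical helper)."""
    results = []
    for hit in response["hits"]["hits"]:
        seen = set()
        kept = []
        for inner in hit["inner_hits"][inner_name]["hits"]["hits"]:
            key = inner["_source"].get(second_field)
            if key not in seen:  # first occurrence is the top hit for this value
                seen.add(key)
                kept.append(inner)
        results.append({"top": hit, "sub_hits": kept})
    return results


# Invented sample: one collapsed group with three inner hits, two sharing a city.
sample = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "inner_hits": {
                    "by_country": {
                        "hits": {
                            "hits": [
                                {"_id": "1", "_source": {"city": "Toronto"}},
                                {"_id": "2", "_source": {"city": "Toronto"}},
                                {"_id": "3", "_source": {"city": "Saskatoon"}},
                            ]
                        }
                    }
                },
            }
        ]
    }
}

reduced = top_per_subgroup(sample, "by_country", "city")
print([h["_id"] for h in reduced[0]["sub_hits"]])
```

The catch is that the `inner_hits` `size` has to be large enough to cover every second-level value, so most of what comes back over the wire is thrown away.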
Unfortunately, opening up multiple levels also opens the query up to abuse in different ways (e.g. asking for high depth).
I wanted to open up for discussion the possibility of having multiple levels of field collapse, even if there were additional restrictions (e.g. only 2 levels, or only a much smaller result set from the second level).