Under certain conditions, sort values for a hit come from an unrelated document #31554
Comments
Pinging @elastic/es-search-aggs |
I can confirm this happening on 6.2 and 6.3. If you reindex all items that need to be sorted, sorting works correctly. I think it uses some kind of cache, because if you update a single document, the first non-matching document appears in the same position, shifting the items. |
@luciansabo Thank you for reporting the issue! Unfortunately I was NOT able to reproduce your issue. I followed all the steps on 6.3: put 1st doc, put 2nd doc, put 1st doc again, refresh, search. I am always getting Can you reproduce this issue on an empty index? |
Oops, I thought I had replied to this earlier but looks like I forgot to hit enter. I managed to reproduce this locally, although it wasn't consistent. Sometimes I would get I was able to drop a breakpoint and see it collect the incorrect nested doc's values for the sort, but I'm not too familiar with how nested sorting works so didn't make any further progress. Not super helpful, sorry :( This was on master fwiw. |
thanks @polyfractal, good to know that it is reproducible and worth investigating. |
@mayya-sharipova I am not the issue author, @dbevacqua is, but I can confirm a similar thing happening here. We were seeing this bug on multiple docker setups and on our staging environment, both on empty index and older indexes. For us it is very easy to reproduce. We sort scores with a condition to sort only scores of a user. |
The original bash script I posted, if run as-is, allows me to reproduce the issue every time. As per my comments in the bash script, calling Another couple of observations I have made:
I will post scripts which demonstrate these points at the earliest opportunity |
Paging @martijnvg :) Could the nested sort filter be getting confused and collecting the wrong docs or something? |
I've created a gist which uses a simpler mapping and simpler queries to illustrate the points in my previous comment: https://gist.github.com/dbevacqua/0ea04a3ad472da8e9b2c08559df874e9 Responses from all three search requests only contain doc 1, which has |
After some more experimenting I've discovered that the first PUT is unnecessary. Have revised the gist to reflect this. https://gist.github.com/dbevacqua/0ea04a3ad472da8e9b2c08559df874e9 |
The problem may be in the |
I believe this fixes the issue:
diff --git a/server/src/main/java/org/elasticsearch/search/sort/SortBuilder.java b/server/src/main/java/org/elasticsearch/search/sort/SortBuilder.java
index 9537e28..e161cb0 100644
--- a/server/src/main/java/org/elasticsearch/search/sort/SortBuilder.java
+++ b/server/src/main/java/org/elasticsearch/search/sort/SortBuilder.java
@@ -238,8 +238,7 @@ public abstract class SortBuilder<T extends SortBuilder<T>> implements NamedWrit
// apply filters from the previous nested level
if (nested != null) {
- parentQuery = Queries.filtered(parentQuery,
- new ToParentBlockJoinQuery(nested.getInnerQuery(), nested.getRootFilter(), ScoreMode.None));
+ parentQuery = new ToParentBlockJoinQuery(nested.getInnerQuery(), nested.getRootFilter(), ScoreMode.None);
if (objectMapper != null) {
childQuery = Queries.filtered(childQuery,

All tests pass, including the (previously failing) one I added based on the gist. I'll prepare a pull request, but perhaps someone can comment on what the intention was with forcing the |
@dbevacqua Thanks for reporting this bug and coming up with a fix! It looks like the conjunction between
The idea was that for an intermediate nested parent doc both the parentQuery and ToParentBlockJoinQuery must match (that nested doc is both a parent and a child), but that is not true when nested fields are missing. Whether a must or filter clause is used for that is not really important. In fact, for optimization purposes, both query clauses should be filter clauses, because scoring is not needed. I think your fix is good and I think it makes sense to open a PR to iterate further. |
Thanks @martijnvg. PR here: #31776 |
Hello @dbevacqua, I recently authored #32130, exposing a comparable problem (however, my case is a bit different because it's reproducible 100% of the time and does not depend on timing; gist with Kibana commands here: https://gist.github.com/cbuescher/f9c8c2132d2667d3e907a6283d3f171a). Would you have a simple way to check whether the fix you provided here would also resolve this other issue? If not, I will build and run your PR myself. Thank you very much in advance |
@JulienColin running your gist against a server built from 6.2.3 with my PR applied, I believe the correct response is returned (family with id=2 has sort value 30)
|
@dbevacqua thank you very much for your responsiveness! It's amazing, thank you very much for your fix. I hope we can enjoy it in a release soon. |
The parent filter for nested sort should always match **all** parents regardless of the child queries. It is used to find the boundaries of a single parent and we use the child query to match all the filters set in the nested tree so there is no need to repeat the nested filters. With this change we ensure that we build bitset filters only to find the root docs (or the docs at the level where the sort applies) that can be reused among queries. Closes #31554 Closes #32130 Closes #31783 Co-authored-by: Dominic Bevacqua <bev@treatwell.com>
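The boundary argument in the commit message above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the Elasticsearch or Lucene implementation: it assumes the usual block layout in which nested child docs are stored immediately before their parent, uses a single nested level instead of the multi-level case from the report, and every name in it (`NestedSortBoundarySketch`, `minChildValue`, `demo`) is hypothetical. It shows how a parent bitset that misses one parent can make a later parent pick up child values from the unrelated document stored before it.

```java
import java.util.BitSet;
import java.util.OptionalLong;

// Simplified, hypothetical model of nested sort boundaries. Not Elasticsearch code.
final class NestedSortBoundarySketch {

    // Smallest child value between the previous parent (exclusive) and parentDoc (exclusive).
    // The parents bitset is what defines where the block of parentDoc starts.
    static OptionalLong minChildValue(BitSet parents, BitSet children, long[] values, int parentDoc) {
        int prevParent = parents.previousSetBit(parentDoc - 1); // -1 when there is no earlier parent
        OptionalLong best = OptionalLong.empty();
        for (int doc = prevParent + 1; doc < parentDoc; doc++) {
            if (children.get(doc) && (!best.isPresent() || values[doc] < best.getAsLong())) {
                best = OptionalLong.of(values[doc]);
            }
        }
        return best;
    }

    // Assumed segment layout:
    //   doc 0: nested child of "doc 2", sort value 1234
    //   doc 1: root "doc 2"
    //   doc 2: root "doc 1", which has no nested children
    static OptionalLong demo(boolean parentFilterMatchesAllParents) {
        long[] values = {1234L, 0L, 0L}; // only the entry for doc 0 is ever read
        BitSet children = new BitSet();
        children.set(0);

        BitSet parents = new BitSet();
        parents.set(2);
        if (parentFilterMatchesAllParents) {
            parents.set(1); // "doc 2" is recognised as a block boundary
        }
        // Sort value computed for root "doc 1" (doc id 2).
        return minChildValue(parents, children, values, 2);
    }

    public static void main(String[] args) {
        System.out.println("all parents in bitset: " + demo(true));
        System.out.println("one parent filtered out: " + demo(false));
    }
}
```

With every parent present in the bitset, the block for "doc 1" is empty and its sort value is missing, which is the expected behaviour in the report. When the filter drops "doc 2" from the bitset, the scan for "doc 1" runs back past it and returns 1234, the value that belongs to "doc 2".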
Elasticsearch version (bin/elasticsearch --version): 6.3.0 (from docker.elastic.co/elasticsearch/elasticsearch-oss:6.3.0)
Plugins installed: []
JVM version (java -version):
OS version (uname -a if on a Unix-like system):
Description of the problem including expected versus actual behavior:
Under certain conditions, sort values for a hit come from an unrelated document. The specific case illustrated below involves two documents on the same shard, one of which is PUT both before and after the other without a refresh in between.
Expected behaviour: a search with multi-level nested sort targeting that document returns "missing" sort value.
Actual behaviour: a search with multi-level nested sort targeting that document returns the sort value from the other document.
We believe the problem is wider-reaching than this as we have observed the behaviour with many (or most) of our searches which use multi-level nested sorts.
Steps to reproduce:
Execute the following bash script.
The search returns something like this:
the sort value of 1234 is defined in doc 2, not doc 1.
Provide logs (if relevant):