Scroll API fails to apply Rescore query #31775

Closed
bwsawyer opened this issue Jul 3, 2018 · 5 comments
Labels
>enhancement good first issue low hanging fruit help wanted adoptme :Search Relevance/Ranking Scoring, rescoring, rank evaluation. Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch

bwsawyer commented Jul 3, 2018

Elasticsearch version (bin/elasticsearch --version):
Elasticsearch 6.3.0

Plugins installed: []
None

JVM version (java -version):
java version "1.8.0_171"

OS version (uname -a if on a Unix-like system):
Darwin mbp-bsawy-5018 17.6.0 Darwin Kernel Version 17.6.0: Tue May 8 15:22:16 PDT 2018; root:xnu-4570.61.1~1/RELEASE_X86_64 x86_64

Description of the problem including expected versus actual behavior:
Performing a query using the Scroll API does not apply the 'rescore' portion of the query.

Steps to reproduce:

  1. Create test index with one document:
PUT index

POST index/_doc
{
  "field": "value"
}
  2. Run a query with a rescore and see that the score returned for the hit is as expected (e.g. _score=101):
GET index/_search
{
  "query": {
    "match_all": {}
  },
  "rescore": {
    "query": {
      "rescore_query": {
        "function_score": {
          "weight": 100
        }
      }
    }
  }
}
  3. Run the same query but with a scroll and see that the score indicates the rescore was not applied (e.g. _score=1):
GET index/_search?scroll=1m
{
  "query": {
    "match_all": {}
  },
  "rescore": {
    "query": {
      "rescore_query": {
        "function_score": {
          "weight": 100
        }
      }
    }
  }
}

This appears to affect all 6.* versions of Elasticsearch but is not present in 5.*
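
The reproduction steps above can be scripted to compare the two scores side by side. This is a minimal sketch, not part of the original report: it assumes an unsecured node at `http://localhost:9200` and that the `index` index from step 1 already exists. The helper names `search_url` and `top_score` are made up for illustration.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumption: an unsecured local node; adjust for your own cluster.
BASE_URL = "http://localhost:9200"

# Same request body as in steps 2 and 3 above.
BODY = {
    "query": {"match_all": {}},
    "rescore": {
        "query": {"rescore_query": {"function_score": {"weight": 100}}}
    },
}


def search_url(index, scroll=None, base_url=BASE_URL):
    """Build the _search URL, adding ?scroll=... only when requested."""
    url = f"{base_url}/{index}/_search"
    if scroll:
        url += "?" + urlencode({"scroll": scroll})
    return url


def top_score(index, scroll=None):
    """Send the search and return the _score of the first hit.

    Needs a running node, so it is only defined here, not called.
    """
    request = urllib.request.Request(
        search_url(index, scroll),
        data=json.dumps(BODY).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        hits = json.load(response)["hits"]["hits"]
    return hits[0]["_score"] if hits else None
```

Against a live node, comparing `top_score("index")` with `top_score("index", scroll="1m")` should surface the difference described in steps 2 and 3.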

@jimczi jimczi added >bug :Search Relevance/Ranking Scoring, rescoring, rank evaluation. labels Jul 4, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-search-aggs

Contributor

jimczi commented Jul 4, 2018

@bwsawyer I marked this issue as a bug because it works on 5.x, but I wonder why you'd use a rescorer in a scroll context? Is there something you cannot achieve with the main query? A scroll needs to score all documents, which defeats the purpose of the rescorer. Maybe we should simply fail, or document the fact that scroll and rescore don't play well together?

@jimczi jimczi added the discuss label Jul 4, 2018
Author

bwsawyer commented Jul 5, 2018

In our use case we use the main query as a blocking step to retrieve good candidates for a more computationally intensive rescore function (in a custom plugin). Running the function over all the documents in the index isn't feasible, nor can we accomplish our blocking query as a pre-filter.

Part of our application needs to retrieve all documents that score above a certain threshold after applying that rescore function. Sometimes the number of documents above the threshold is large enough that I assume scrolling would be more efficient than paging, at least as I understand it. I admit I have not done any benchmarking to verify this assumption.

Contributor

jimczi commented Jul 9, 2018

Scrolling is faster than simple pagination, but the issue I have with rescored scroll is that it breaks the _score sort. In your example the main query is a match_all, so each page is rescored up to window_size, but documents with a greater score can be returned in subsequent pages. For this reason I don't think rescored scroll should be possible, or at least we should disable sort in this case.
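
The ordering problem described above can be illustrated with a toy model. This is a simplified sketch and not how Elasticsearch executes a scroll internally: it ignores shards, assumes each page is rescored independently up to `window_size`, and assumes the rescore score is added to the first-pass score (which matches the 1 + 100 = 101 seen in the report). The function names and sample scores are invented for illustration.

```python
def rescored_scroll_pages(first_pass, rescore, size, window_size):
    """Toy model: page through docs in first-pass order, rescoring each page on its own."""
    # sorted() is stable, so ties keep their original (insertion) order.
    ordered = sorted(first_pass, key=first_pass.get, reverse=True)
    pages = []
    for start in range(0, len(ordered), size):
        page = []
        for rank, doc in enumerate(ordered[start:start + size]):
            score = first_pass[doc]
            if rank < window_size:
                # Assumption: rescore score is added to the original score.
                score += rescore[doc]
            page.append((doc, score))
        # Each page is sorted by its final score, but only within the page.
        page.sort(key=lambda hit: hit[1], reverse=True)
        pages.append(page)
    return pages


def is_sorted_desc(scores):
    """True when scores never increase from one hit to the next."""
    return all(a >= b for a, b in zip(scores, scores[1:]))


# match_all gives every document the same first-pass score.
first_pass = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0}
# Hypothetical scores from an expensive rescore function.
rescore = {"a": 5.0, "b": 1.0, "c": 50.0, "d": 2.0}

pages = rescored_scroll_pages(first_pass, rescore, size=2, window_size=2)
scores = [score for page in pages for _, score in page]
# scores == [6.0, 2.0, 51.0, 3.0]: document "c" on page 2 outranks everything on page 1.
```

Every page is sorted on its own, yet the stream of hits across pages is not, which is the broken `_score` sort the comment refers to.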

@jimczi jimczi added good first issue low hanging fruit help wanted adoptme and removed team-discuss labels Jul 16, 2018
Contributor

jimczi commented Jul 16, 2018

We've decided to forbid scroll with rescores. It used to work in 5.x, but that should be considered a bug since it breaks the sort. However, we should fail the creation of the scroll instead of silently ignoring the rescorers as we do today, which is why I am marking this issue as an adoptme.
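
The fail-fast behaviour described above could be approximated on the client side while the server still ignores the rescore silently. This is a hypothetical guard, not Elasticsearch's actual validation code; the function name and error text are invented for illustration.

```python
def validate_search_request(body, scroll=None):
    """Return a list of validation errors for a search request (hypothetical helper)."""
    errors = []
    # Reject the combination up front instead of letting rescore be dropped silently.
    if scroll is not None and body.get("rescore"):
        errors.append("rescore cannot be combined with scroll")
    return errors


# Same request body as in the original report.
body = {
    "query": {"match_all": {}},
    "rescore": {
        "query": {"rescore_query": {"function_score": {"weight": 100}}}
    },
}

plain_errors = validate_search_request(body)               # no scroll: accepted
scroll_errors = validate_search_request(body, scroll="1m")  # scroll + rescore: rejected
```

Returning a list of errors rather than raising on the first one leaves room for other checks to be reported in the same pass.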

@jimczi jimczi added >enhancement and removed >bug labels Jul 16, 2018
@not-napoleon not-napoleon self-assigned this Aug 1, 2018
original-brownbear pushed a commit to original-brownbear/elasticsearch that referenced this issue Aug 28, 2018
This PR changes our behavior from silently ignoring rescore in a scroll query to reporting to the user that such a query is invalid.

Closes elastic#31775
not-napoleon added a commit that referenced this issue Aug 29, 2018
This adds a deprecation warning for using rescore on scroll queries in 6.x. As per #31775 we will not be supporting this going forward.

See also #32918 which implements the validation error for 7.0
@javanna javanna added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Jul 12, 2024
6 participants