Replacing the function_score with discrete queries #27588

Closed
polyfractal opened this issue Nov 29, 2017 · 34 comments
Labels
>deprecation >enhancement :Search Relevance/Ranking Scoring, rescoring, rank evaluation. Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch

Comments

@polyfractal
Contributor

polyfractal commented Nov 29, 2017

Overview

The function_score is a powerful query, but can be somewhat unwieldy and difficult to use. It is a monolithic query that has many parameters and options which makes it difficult for new users to learn. It can be difficult to tweak, and now that we've moved to BM25, the defaults don't work well (namely, multiplying scores).

We'd like to deprecate function_score and replace its functionality with a number of discrete queries. The guiding principle is that these replacement queries will be small, simple and single-purpose. Individual queries should be simpler to implement and maintain, easier for users to learn, and should allow the same functionality as function_score by mixing and matching as required.

New queries / needed functionality

Arbitrary numeric functions

This query will allow working with arbitrary numerics and allow applying various mathematical functions. For example, a document may have a popularity field that you wish to roll into the score somehow.

Functions should include those of field_value_factor, like logarithm, sqrt, reciprocal, etc. We'd also like to include a sigmoid and rational function. The query should also be able to return just the value itself. I think the random number functionality of function_score can be rolled into this too.

We could potentially implement this query entirely through scripting, assuming we can provide custom functions through the script context for more complicated operations like sigmoid.
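To make the candidate functions concrete, here is a minimal Python sketch of the kind of per-document numeric transforms such a query could expose. The function names, the `pivot` and `exponent` parameters, and the exact rational and sigmoid forms are assumptions for illustration; the issue does not fix their definitions.

```python
import math


def log10_plus_one(value: float) -> float:
    """Common logarithm of (value + 1), so a value of 0 maps to 0."""
    return math.log10(value + 1.0)


def reciprocal(value: float) -> float:
    """1 / value; undefined at 0, so callers must guard against it."""
    return 1.0 / value


def rational(value: float, pivot: float) -> float:
    """Assumed rational form value / (value + pivot); equals 0.5 at the pivot."""
    return value / (value + pivot)


def sigmoid(value: float, pivot: float, exponent: float) -> float:
    """Assumed sigmoid form v^a / (pivot^a + v^a); larger exponents are steeper."""
    powered = value ** exponent
    return powered / (pivot ** exponent + powered)


def identity(value: float) -> float:
    """Return the field value itself, unmodified."""
    return value


popularity = 9.0  # hypothetical field value
print(log10_plus_one(popularity))
print(math.sqrt(popularity))
print(rational(popularity, pivot=9.0))
print(sigmoid(popularity, pivot=9.0, exponent=2.0))
```

Both bounded forms stay below 1 for any non-negative input, which is what makes them attractive when the result is added to a BM25 score rather than multiplied with it.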

Distance: Numeric and Time

This query would provide a set of decay functions that allow you to score the "distance" from a field value to some point. When dealing with dates this "distance" would essentially represent recency, while with numerics it'd just be a geometric distance.

We'll likely want numeric/time to be grouped together since the operations are essentially the same.
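A rough Python sketch of what such decay scoring could compute, modelled on the gauss, exp and linear shapes that function_score documents today. The function names are hypothetical, and the parameters (`origin`, `scale`, `offset`, `decay`) are assumed to carry the same meaning as in the existing decay functions: the score equals `decay` at a distance of `scale` beyond `offset`.

```python
import math


def _distance_beyond_offset(value: float, origin: float, offset: float) -> float:
    """Distance from origin, ignoring anything within `offset` of it."""
    return max(0.0, abs(value - origin) - offset)


def gauss_decay(value, origin, scale, offset=0.0, decay=0.5):
    d = _distance_beyond_offset(value, origin, offset)
    # Variance chosen so that the score is `decay` when d == scale.
    sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
    return math.exp(-(d ** 2) / (2.0 * sigma_sq))


def exp_decay(value, origin, scale, offset=0.0, decay=0.5):
    d = _distance_beyond_offset(value, origin, offset)
    lam = math.log(decay) / scale  # negative rate
    return math.exp(lam * d)


def linear_decay(value, origin, scale, offset=0.0, decay=0.5):
    d = _distance_beyond_offset(value, origin, offset)
    s = scale / (1.0 - decay)  # distance at which the score reaches 0
    return max((s - d) / s, 0.0)


# Recency example: document age in days, origin is "now" (age 0), scale is 30 days.
for age_days in (0, 30, 90):
    print(age_days,
          gauss_decay(age_days, 0, 30),
          exp_decay(age_days, 0, 30),
          linear_decay(age_days, 0, 30))
```

Dates and plain numerics share the same code path here because a date is just a number once it is expressed as an age or an epoch value, which is the reason given above for grouping them.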

Distance: Geo

This would be similar to numeric/time distance, except operating on geo points and physical distance. The thinking is that geo would be separate, as we may have plans in the future for geo querying to become more robust in general. Geo points are also sufficiently different that the syntax will likely need to be a bit different.

Otherwise the functionality is similar, providing a set of decay functions.
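As a sketch of how the geo variant differs only in its distance measure, the following Python computes a great-circle (haversine) distance and feeds it into an exponential decay. The function names, the kilometre unit, and the spherical Earth radius of 6371 km are assumptions made for illustration, not part of the proposal.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2.0) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def geo_exp_decay(point, origin, scale_km, offset_km=0.0, decay=0.5):
    """Exponential decay on physical distance: score is `decay` at scale_km."""
    d = max(0.0, haversine_km(point, origin) - offset_km)
    return math.exp(math.log(decay) / scale_km * d)


origin = (52.52, 13.405)   # hypothetical search centre
near = (52.53, 13.40)
far = (48.137, 11.575)
print(geo_exp_decay(near, origin, scale_km=10.0))
print(geo_exp_decay(far, origin, scale_km=10.0))
```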

Potential queries

Some other potential queries that we kicked around, unsure if they are needed or quite how the functionality would work. Some of these have a direct comparison in function_score, some are only tangentially related.

Min/Max query

Function score allows taking the min or max score from the set of functions. We could potentially add a combination query that executes the children, then only passes on the min/max score from the children.

Bandpass query / cutoff score query

Function score allows setting max thresholds on scores generated by the functions. It may be useful to have a query that allows setting min or max or both, and the scores that come out of the query would be limited to those values. Similar to a constant score in that it wraps a set of queries, but instead of setting a single score it just limits the scores that are generated to the range.

Note this is different from the above min/max query, in that it limits the produced score to a range, whereas min/max simply takes the min or max score as-is.
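The distinction between the two ideas can be sketched in a few lines of Python. The names `take_min`, `take_max` and `bandpass` are hypothetical; the point is only that min/max selects one of the child scores unchanged, while a bandpass clamps whatever score is produced into a range.

```python
from typing import Optional, Sequence


def take_max(child_scores: Sequence[float]) -> float:
    """Min/max query idea: pass on one child score as-is."""
    return max(child_scores)


def take_min(child_scores: Sequence[float]) -> float:
    return min(child_scores)


def bandpass(score: float,
             lower: Optional[float] = None,
             upper: Optional[float] = None) -> float:
    """Cutoff idea: limit the produced score to [lower, upper]."""
    if lower is not None and score < lower:
        return lower
    if upper is not None and score > upper:
        return upper
    return score


child_scores = [0.2, 3.5, 12.0]
print(take_max(child_scores))                               # selects a child score
print(bandpass(take_max(child_scores), lower=1.0, upper=10.0))  # clamps it
```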

Would allow us to remove min_score (#13115), and limit individual queries (#17348)

In-order Boolean

One unique aspect of function_score is the ability to use only the "first" matching function. There's no other place in the query DSL that allows "short-circuiting" evaluation... the equivalent "first" functionality would require a complex set of must/must_not boolean conditions.

So it may be useful to implement an "in-order" boolean query which evaluates child queries in their order and allows exiting after some criteria is met (first, etc).

It's not entirely clear how useful this functionality is outside of the function_score though, and we'd be interested to hear use-cases for this first behavior.
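For reference, the "first" behaviour being discussed amounts to the following, sketched in Python. Clauses are modelled as hypothetical (filter, score) pairs; evaluation stops at the first filter that matches, so later clauses are never consulted for that document.

```python
from typing import Callable, Sequence, Tuple

Doc = dict
Clause = Tuple[Callable[[Doc], bool], Callable[[Doc], float]]


def first_match_score(doc: Doc, clauses: Sequence[Clause], default: float = 0.0) -> float:
    """Return the score of the first clause whose filter matches, in order."""
    for matches, score in clauses:
        if matches(doc):
            return score(doc)  # short-circuit: skip the remaining clauses
    return default


# Hypothetical clauses: featured documents win, otherwise popularity decides.
CLAUSES = [
    (lambda d: d.get("featured", False), lambda d: 10.0),
    (lambda d: d.get("popularity", 0) > 5, lambda d: float(d["popularity"])),
]

print(first_match_score({"featured": True, "popularity": 8}, CLAUSES))
print(first_match_score({"popularity": 8}, CLAUSES))
print(first_match_score({"popularity": 1}, CLAUSES))
```

Expressing the same thing with plain bool queries would need each clause to carry must_not copies of every earlier filter, which is the complexity mentioned above.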

Scripted Boolean

Related to all the above, it may be useful to have a boolean that executes all the child queries and provides those scores to a script, which would then decide how to combine the scores. This could allow very sophisticated behavior by allowing the user to script away which scores are included (e.g. if they meet a criteria, or only if the total boolean exceeds a threshold, etc) and how they are combined

The downside is that all child queries must be run so that all scores can be collected. And the syntax/script interface would likely be complex
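A minimal Python model of the idea, under the assumption that every child yields a score (0.0 when it does not match) and that the user-supplied script is simply a function over the list of child scores. All names here are hypothetical.

```python
from typing import Callable, List, Sequence

Doc = dict
ChildQuery = Callable[[Doc], float]


def scripted_bool(doc: Doc,
                  children: Sequence[ChildQuery],
                  combine: Callable[[List[float]], float]) -> float:
    # Every child runs, because the script needs all scores before deciding.
    scores = [child(doc) for child in children]
    return combine(scores)


def message_match(doc: Doc) -> float:
    return 2.0 if "elasticsearch" in doc.get("message", "") else 0.0


def author_match(doc: Doc) -> float:
    return 1.5 if doc.get("author") == "shay banon" else 0.0


def sum_if_total_exceeds(threshold: float) -> Callable[[List[float]], float]:
    """Example script: keep the summed score only if it clears a threshold."""
    def combine(scores: List[float]) -> float:
        total = sum(scores)
        return total if total > threshold else 0.0
    return combine


children = [message_match, author_match]
both = {"message": "trying out elasticsearch", "author": "shay banon"}
one = {"message": "trying out elasticsearch", "author": "someone else"}
print(scripted_bool(both, children, sum_if_total_exceeds(3.0)))
print(scripted_bool(one, children, sum_if_total_exceeds(3.0)))
```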

Related issues

#23850
#15670 (by boosting each individual decay query)

/cc @mayya-sharipova @colings86 @clintongormley did I miss anything?

@polyfractal polyfractal added :Search/Search Search-related issues that do not fall into other categories >deprecation >enhancement labels Nov 29, 2017
@rjernst
Member

rjernst commented Nov 29, 2017

I think this is just moving the complication from one form to another. The fact that there is a "query" to control the score of other queries is what is confusing.

Function score is about manipulating the score. IMO we should use scripts for any manipulation of the score. Imagine every query node allowing a script to modify the score of that node. In the case of boolean query, we can provide an array of sub scores. Things like decay should just be functions made available in painless.

@clintongormley
Contributor

@rjernst could you provide some examples of what this would look like?

@colings86
Contributor

I had actually understood the proposed decay queries as not modifying the score of an inner query but as both selecting and scoring the documents. So for example with the geo_distance query (which we already have) we could extend it to allow users to set decay parameters on that actual query. That way the query would be both selecting and scoring the documents. The parameters would then be origin, offset, scale, decay, decay_type as we have in the decay function now, and also cut_off, which would indicate where to stop collecting documents (the same as the distance parameter does now). The default could then be to set the decay parameters to 0 and the decay_type to none so it works as a constant score query like the current implementation. A similar thing could be done for a number_distance query for numeric values.

@polyfractal
Contributor Author

Sorry, I probably didn't explain well in the OP. ++ What @colings86 said, my understanding is that geo_distance and number_distance would operate similarly, and ideally the "arbitrary numeric manipulation" query would just be a script with built in painless functions.

The rest -- min/max, cutoff, etc -- I think are definitely up for debate as to their usefulness and how they should be implemented.

FWIW, I don't think queries controlling other query scores is unusual/unprecedented. constant_score, boosting and all the compound queries (via their boost param) manipulate the score generated by their children queries. Personally it feels natural to me that you can control the score of a group of queries by wrapping them with a modifier. In the vein of each query doing only one thing, you have a query to limit scores rather than parameters (or scripts) on every query that does the same thing.

I'm not opposed to widespread usage of scripting, but I think we should be careful not to push too much complexity into scripting. Scripts are troublesome to debug and generally more difficult for ORM/high-level clients/applications to integrate (imo).

For example, an app can generate a query and hand it over to another portion of code, which blindly wraps it in a cutoff query to limit the scores. In the scripting world, the receiving code would have to modify the top-level query component and add a script. It's a subtle difference but noticeable in ergonomics I think.

@mayya-sharipova
Contributor

One thing that could be useful in function_score or new queries is the ability to normalize a score. Citing a user comment from #23850:

I'd like to do things like normalize values so they have predictable ranges and are guaranteed to be not 0

@rpedela

rpedela commented Dec 10, 2017

I want to second normalizing a score! I typically use boolean queries with multiple match queries as the base query which can yield top scores in the 10s or 100s or sometimes less than 1.0. It is very difficult to combine popularity or recency because they aren't necessarily on the same scale. Additionally the base query's scale can change between queries adding another level of complexity to manually setting boosts/weights. Being able to normalize the base query and then combine with a normalized function_score query would be amazing.

@mayya-sharipova
Contributor

#15670 has another idea on the topic of normalizing scores: add ability to set percentage influence of each function in function score query

@jpountz jpountz added :Search Relevance/Ranking Scoring, rescoring, rank evaluation. and removed :Search/Search Search-related issues that do not fall into other categories labels Apr 25, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-search-aggs

@mayya-sharipova
Contributor

@rpedela and others interested in normalization:

It turned out that a correctly implemented normalization will be quite slow, as we first need to iterate across all docs to find the min and max values for this feature/score, and then iterate across the docs again to assign each a score of normalizedMin + (docValue - MinAcrossDocs) * (normalizedMax - normalizedMin) / (MaxAcrossDocs - MinAcrossDocs). As this will be quite a slow process, it is reasonable NOT to implement the normalization at all. One thing we can consider is to implement normalization only for the rescoring phase.

As an alternative, we have added a new query type, the feature query, which boosts the score of documents based on the values of numeric features (e.g. popularity, recency). This query has the additional benefit of skipping non-competitive hits, which makes it very fast in certain cases.
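The two-pass cost is easy to see in a small Python sketch of standard min-max normalization. The function name and the in-memory list are assumptions for illustration; in a real index each pass would mean visiting every matching document.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float],
                      new_min: float = 0.0,
                      new_max: float = 1.0) -> List[float]:
    # Pass 1: scan every value to find the observed range.
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate range: every document gets the same normalized score.
        return [new_min for _ in values]
    # Pass 2: scan every value again to rescale it into [new_min, new_max].
    return [new_min + (v - lo) * (new_max - new_min) / (hi - lo) for v in values]


doc_values = [10.0, 20.0, 30.0]  # hypothetical per-document scores
print(min_max_normalize(doc_values))
```

Restricting this to the rescoring phase, as suggested, would bound both passes to the size of the rescore window rather than the full result set.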

@lmenezes
Contributor

lmenezes commented Jul 5, 2018

@mayya-sharipova Is there a schedule regarding replacing function_score with smaller/more specialised queries?

I'm still interested in a solution for #24910 (and would be happy to try and contribute if that's the way to go), but also would be interested in expanding the current behaviour for field_value_factor to support at least another option: using the value specified as factor as either a multiplication or a sum.

Would any of this be welcome, as either changes to current function_score, or as something new?

@mayya-sharipova
Contributor

mayya-sharipova commented Jul 5, 2018

@lmenezes I have marked #24910 for our next team discussion, and I will update the issue with what we have decided.

About adding another option for field_value_factor: why don't you just use script_score, where a script lets you do any calculation with a field value? Our plan is to eventually have only script_score, to add more script functions, and to abandon all other options (including field_value_factor).

@lmenezes
Contributor

lmenezes commented Jul 6, 2018

@mayya-sharipova For performance.

We use function score a lot, normally combining multiple functions on the same query over datasets of varying size (ranging from a few million to >1B). We have come to find that the performance difference between field_value_factor and script_score can be pretty big.

For one of our use cases, replacing a single field_value_factor function with a script_score is around 25% slower. Replacing 2 field_value_factor functions with equivalent script_scores, this already goes up to 40% (although combining the logic for both into a single script_score function keeps the difference at around 25%, so this is good). All of this on a dataset of 20 million documents, with queries that match <5 million documents.

I can reproduce these numbers by running a couple thousand queries with JMeter, where the functions look like this (query/filters omitted for simplicity) and only the functions vary:

"functions": [
  {
    "script_score": {
      "script": {
        "params": {
          "foo": 10
        },
        "source": "Math.log(params.foo * doc['some_field'].value)"
      }
    }
  }
]

"functions": [
  {
    "field_value_factor": {
      "field": "some_field",
      "factor": 10,
      "modifier": "log2p"
    }
  }
]

&

"functions": [
  {
    "script_score": {
      "script": {
        "params": {
          "foo": 10,
          "bar": 50
        },
        "source": "Math.log(params.foo * doc['some_field'].value) + Math.log(params.bar * doc['some_other_field'].value)"
      }
    }
  }
]

"functions": [
  {
    "field_value_factor": {
      "field": "some_field",
      "factor": 10,
      "modifier": "log2p"
    }
  },
  {
    "field_value_factor": {
      "field": "some_other_field",
      "factor": 50,
      "modifier": "log2p"
    }
  }
]

So, I do understand that script_score opens doors to more sophisticated use cases, and that it really isn't realistic to model every single scenario with a special function. But I'm still hoping the existing functions won't be dropped (or that script_score performance will be close enough) :)

@mayya-sharipova
Contributor

@lmenezes Thank you for the thorough description of your use case. We will definitely consider it in making our final decision about the future of the function score query.

@rjernst
Member

rjernst commented Jul 6, 2018

@lmenezes Can you try using expression scripts? They should be equivalent performance-wise to a native script. We know there are a few things that can make painless scripts less efficient in these cases (eg hash lookups for the field on every document), and we have ideas for fixing that, so performance should continue to improve there.

@lmenezes
Contributor

lmenezes commented Jul 7, 2018

@rjernst Just ran a test including expression scripts to see how this played out.

Results are as follows:

Script(painless)

     "functions": [
        {
          "script_score": {
            "script": {
              "params": {
                "constant": 2
              },
              "source": "Math.log((params.constant * doc['some_field'].value) + 1)"
            }
          }
        }
      ]
summary =   2000 in 00:03:00 =   11.1/s Avg:   889 Min:   781 Max:  1297 Err:     0 (0.00%)

Script(expression)

      "functions": [
        {
          "script_score": {
            "script": {
              "lang": "expression",
              "source": "ln((2 * doc['some_field'].value) + 1)"
            }
          }
        }
      ]
summary =   2000 in 00:02:57 =   11.3/s Avg:   871 Min:   754 Max:  1357 Err:     0 (0.00%)

field_value_factor

      "functions": [
        {
          "field_value_factor": {
            "field": "some_field",
            "factor": 2,
            "modifier": "ln1p"
          }
        }
      ]
summary =   2000 in 00:02:33 =   13.1/s Avg:   721 Min:   494 Max:  1470 Err:     0 (0.00%)

So, there is some improvement when comparing painless with expression, but not really significant. Field value factor still offers much better performance.
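The relative differences can be worked out directly from the three summary lines above. The Python sketch below assumes the Avg column is the mean response time in milliseconds, which is how JMeter's summariser normally reports it; the dictionary layout and function name are just for illustration.

```python
# Numbers copied from the three JMeter summary lines above.
RESULTS = {
    "painless": {"throughput_per_s": 11.1, "avg_ms": 889},
    "expression": {"throughput_per_s": 11.3, "avg_ms": 871},
    "field_value_factor": {"throughput_per_s": 13.1, "avg_ms": 721},
}


def slowdown_pct(variant: str, baseline: str = "field_value_factor") -> float:
    """Percentage increase in average latency relative to the baseline."""
    base = RESULTS[baseline]["avg_ms"]
    return 100.0 * (RESULTS[variant]["avg_ms"] - base) / base


for name in ("painless", "expression"):
    print(f"{name}: {slowdown_pct(name):.1f}% slower on average latency")
```

Under that reading, both script variants come out a little over 20% slower than field_value_factor on average latency.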

@mayya-sharipova
Contributor

@lmenezes Thanks for presenting an example. But from your example, the difference between script and field_value_factor doesn't seem to be that big.
We have discussed the issue, and decided to run comprehensive performance tests to more systematically evaluate the difference in performance between painless script and other current options in Function Score query.

@lmenezes
Contributor

@mayya-sharipova A 20% difference doesn't seem so little to me (especially when this is a single function of potentially many).

Looking forward to the results of your performance tests and any other info regarding the future of function score query in general :)

@Munawwar

+1 for sigmoid function.
I was thinking today about how to normalize data to range 0 to 1. The proper way to normalize is to find the min and max and transform the range to 0-1. But this is very inefficient (and min/max needs recalculation every time a new document is inserted/updated/deleted).
So some trade-offs are ok, as long as it still represents the data. So a sigmoid function would be useful. (I might go for fn(x) = 1 - 1/(1+Math.log10(1+x))... it's smooth and can handle very large numbers)
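The suggested function is easy to check numerically. The Python sketch below implements it exactly as written; the name `squash` is hypothetical. Strictly speaking the curve is not S-shaped, but it is monotonic and maps any non-negative input into [0, 1) without needing the global min and max.

```python
import math


def squash(x: float) -> float:
    """fn(x) = 1 - 1 / (1 + log10(1 + x)), defined for x >= 0."""
    return 1.0 - 1.0 / (1.0 + math.log10(1.0 + x))


for x in (0, 9, 99, 1e6, 1e12):
    print(x, squash(x))
```

Because the inner term is a logarithm, the output climbs slowly: an input of 9 already reaches 0.5, while an input of 10^12 is still below 0.93.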

@Sandeep42

Sandeep42 commented Nov 8, 2018

This is becoming very crucial for our work, especially the Scripted Boolean and Bandpass queries.

What we are dealing with is a use case where our scoring signals come from individual child queries and have to be reduced using a certain combination, with the possibility of setting a min-max bandpass filter. Since there is no way to pass the output of one query as the input of another except in rescoring (and even in rescoring I'm only able to do some predefined actions like addition or a linear combination of scores, not a script), this limits our ability to retrieve good documents from a pool.

Implementing functionality like this would immensely help us. In machine learning based retrieval (especially models tuned with learning to rank algorithms), we have some derived variables which should be dynamically computed at query time. As an example, if one of the parameters of my model is the amount of rain in the last month for a particular city, I would like to do a full text search on the city and compute last month's rain using an aggregation. This input needs to be passed on to a final "reduce" query where I will specify how that should be collapsed into a single score. The final model could be a linear combination of such scores.

Coming to the bandpass filter: if the amount of rain is too low or too high, I might want to treat that as an edge case and score it lower, because my model would make wrong predictions in such cases.

@mayya-sharipova I see that you are working on reimplementation of the old function_score query. It would be great if you could consider these features. Please let me know if there is anything I can help with. Thanks!

@mayya-sharipova
Contributor

@Sandeep42 Thank you very much for your input. For our new script_score query, we don't plan to do anything fancy, it will be just an imitation of function_score query with some extra functions added.

For your use case to combine signals from individual child queries, you can:

  1. Use a custom rescorer. Here is an example how to implement your own rescorer.

  2. You can submit a new issue into elasticsearch repo, and we can discuss if there is something we can/would like to implement.

@Sandeep42

Sandeep42 commented Nov 12, 2018

@mayya-sharipova Thanks for quick response. I will try to understand custom rescorer.

@mayya-sharipova
Contributor

We have met and discussed whether we should implement the queries proposed in this issue. We have not reached a conclusion yet and need another follow-up meeting. For now, we still can't deprecate the function score query, as some work remains (e.g. we still need to decide how to imitate function score functionality such as multiple filters and the first parameter).

@mayya-sharipova
Contributor

We have met again, and discussed the status of Min/Max query, Bandpass query / cutoff score query, In-order Boolean and Scripted Boolean. We concluded that all these types of queries can be implemented as a part of Scripted Boolean types.
We proposed two possible ways how Scripted Boolean can be implemented:

  1. As a part of script_score query
{
  "query": {
    "script_score": {
      "queries": [
        { "match": { "message": "elasticsearch" } },
        { "match": { "author": "shay banon" } }
      ],
      "script": {
        "source": "_scores[0] + _scores[1]"
      }
    }
  }
}
  2. As a standalone query
{
  "query": {
    "script_bool": {
      "queries": [
        { "match": { "message": "elasticsearch" } },
        { "match": { "author": "shay banon" } }
      ],
      "script": {
        "source": "_scores[0] + _scores[1]"
      }
    }
  }
}

Points that still need to be clarified:

  1. What is the way to implement in-order boolean?
    One idea is to assign a score of -1 to a non-matching query. Then in the script, we can do:
    if (_scores[0] < 0) return _scores[1]. This basically says, if 1st query doesn't match a doc, choose a score from the second query for this doc.
  2. What should be provided inside a script context?
    • just scores?
    • Scorer objects? Plus: allows more control, minus: potential for misuse and bad manipulation. We can use DisjunctionScorer::getChildren method to get all children Scorer.

@jpountz
Contributor

jpountz commented Dec 18, 2018

Proposals 1 and 2 look very similar, only the name of the query is changing?

One idea is to assign a score of -1 to a non-matching query.

I'm afraid this might make the API harder to use compared to 0 as you would likely need to check every input score against -1 eg. if you want to sum up scores?
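The usability concern can be illustrated with a short Python sketch. `NO_MATCH` is a hypothetical constant standing in for the proposed -1 sentinel, and the helper names are made up for the example.

```python
from typing import Sequence

NO_MATCH = -1.0  # hypothetical sentinel for "child query did not match"


def naive_sum(scores: Sequence[float]) -> float:
    """What a user would write first; silently wrong with a -1 sentinel."""
    return sum(scores)


def sentinel_aware_sum(scores: Sequence[float]) -> float:
    """Correct with the sentinel, but every input needs an explicit check."""
    return sum(s for s in scores if s >= 0)


def first_matching(scores: Sequence[float], default: float = 0.0) -> float:
    """The in-order use case the sentinel was proposed for."""
    for s in scores:
        if s >= 0:
            return s
    return default


scores = [2.0, NO_MATCH]        # second child did not match
print(naive_sum(scores))        # the sentinel leaks into the total
print(sentinel_aware_sum(scores))
print(first_matching([NO_MATCH, 3.0]))
```

With 0 for non-matching children, the plain sum needs no guard at all, which is the simplification being argued for here.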

What is the way to implement in-order boolean?

If something like in-order is a popular ask, we might want to implement it as a proper query instead, which could then leverage upcoming optimizations to skip documents whose score is not competitive. These things are harder (impossible?) to do with scripted scores.

DisjunctionScorer::getChildren

Let's not expose this in scripts, this API is not properly implemented across all queries.

@lmenezes
Contributor

Is there any update on this topic? :)

This (#24910) is currently still painful and solved by workarounds on our side.

I understand you want to avoid changes on function score in favor of these discrete queries, but so far this is not yet available and the expected behaviour on function score is difficult to work with. Scripting would be an option, but performance wise it is limiting.

@morphles

morphles commented Oct 24, 2019

Seriously, removing function score for scripts?
Am I the only one who sees this as quite a boneheaded decision? Even though, as the OP said, function score might not be easy to learn, and I'd add it is mildly unwieldy, I find it to be quite great (as in I was able to express quite complicated scoring declaratively and that makes me a happy boy).

First from my experience it seems to have quite a bit better performance than script_score (at least script_score wrapped in function score). Next, I find scripts quite annoying, as they mean you need to learn and handle another language and API basically. I also like that function score and queries have to basically be declarative, while scripts you can write in some abhorrent imperative style :) .

Generally the title of this issue is very appealing to me though. Having light wrapper queries operating on scores of sub-queries would be neater. While replacement of function score with script score seems like step back in my eyes.

@mayya-sharipova
Contributor

@morphles Thank you for voicing your concerns. I agree performance is indeed a valid issue in scripts and we are working on addressing it. All other matters that you voiced are of personal preferences.

@morphles

Well another language to learn might be mildly personal, but also has some objectivity to it. Why use more parts than needed? Why learn/use more stuff if you can do it with less? Granted in web dev world this "insanity" (in my eyes) has no limit, so one must get used to it by necessity, still... I'd rather not need billions of tools :)

@rjernst
Member

rjernst commented Oct 24, 2019

another language to learn

The DSL is a language, and unique within Elasticsearch, so it's already something one must learn. While declarative DSLs like the json used for queries are nice for certain situations, in the case of scoring and many other advanced cases, where basic arithmetic and mathematical functions are desired, having a procedural language is much more natural than eg declaring a "sum" node with children and various keyed settings. It is also much more straightforward to reuse this procedural pattern in advanced cases throughout Elasticsearch, and this is why we are continuing to expand the places scripting can be used. Additionally, the scripting and painless infrastructures are highly extendable to add additional functions, while function score was minimally extendable and extremely cumbersome to do so. We appreciate the feedback, especially the note about performance and as Mayya noted we are working on improving that.

@morphles

Well DSL or not, script score is still more stuff DSL + painless, which is elastic specific thing. So while I agree mostly with what you wrote, I'm still for keeping function_score :) .

What in my eyes would be really nice is extending the DSL to allow operating on "named predicates". For example, in my function_score I have lots of filters with names, in case I need to return which of them matched. So what would be convenient would be to be able to list predicates (as in basically sub queries of various sorts) with names, and then have an expression on those names.
So if you have a query named t for matching some terms, and then b for some more complex bool query, you could just do t * 5 + b. I know that script kinda allows something similar, but all this importing values into script thing is just one more annoyance compared to what would be possible with such "named predicates".

@rjernst
Member

rjernst commented Oct 24, 2019

importing values into script

Could you elaborate on what "importing" means here?

I have lots of filters with names

By this you mean you are tagging your filters with _name? And you then want to access the tags that matched each doc so you can decide how to calculate the score?

@morphles

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-script-score-query.html
The "params" parameter (which as far as I understand is needed since script is compiled and if you would just place your numbers directly in script, well next time script needs recompile or something like that, I haven't used it much, but when I did that part also seemed fairly annoying, while query DSL is... well it just works, no thinking about caching or compiling).
For names, yes, I have filters with _name (though as far as I understand you can have _name basically in any query). So if one then could use something like (given expression is of course totally random here)

"sort" : [
        { "names_expression" : "name_of_some_used_query1 * 5 + (other_name + name3)^3 / 2"},
        { "post_date" : {"order" : "asc"}},
]

Granted this means the expression lang and available functions need to be defined, which might be close to scripting. Still, I think such would allow comparatively super easy and succinct use of fairly customizable and flexible scoring (and would solve one nasty issue with function_score - if you need score value of some query in multiple places, well, it ain't gona be all that fun with function score, while with this just repeat name; ofc same is available in script :) ).

@jtibshirani
Contributor

A note that I added short analysis of function_score vs. script_score performance here: #25913 (comment)

@rjernst rjernst added the Team:Search Meta label for search team label May 4, 2020
mayya-sharipova added a commit to mayya-sharipova/elasticsearch that referenced this issue Jun 17, 2022
We had a plan to deprecate function_score query with
script_score query, but ran into a roadblock of missing
functionality to combine scores from different
functions (particularly "first" script_score).
We have several proposals to address this missing
functionality:
 [scripted_boolean](elastic#27588 (comment))
 [compound_query](elastic#51967)
 [first_query](elastic#52482)

But for now, we decided not to deprecate function_score query,
and hence we need to remove any mention that we are deprecating it.

Relates to elastic#42811
Closes elastic#71934
@mayya-sharipova
Contributor

mayya-sharipova commented Jun 17, 2022

Closing this issue, as we have an alternative proposal for first query that would allow us to deprecate function score if this proposal is adopted and implemented.

mayya-sharipova added a commit that referenced this issue Jun 17, 2022
We had a plan to deprecate function_score query with
script_score query, but ran into a roadblock of missing
functionality to combine scores from different
functions (particularly "first" script_score).
We have several proposals to address this missing
functionality:
 [scripted_boolean](#27588 (comment))
 [compound_query](#51967)
 [first_query](#52482)

But for now, we decided not to deprecate function_score query,
and hence we need to remove any mention that we are deprecating it.

Relates to #42811
Closes #71934
@javanna javanna added Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch and removed Team:Search Meta label for search team labels Jul 12, 2024