Avoid negative scores with cross_fields type #89016

jtibshirani · 2022-08-01T22:10:54Z

The cross_fields scoring type can produce negative scores when some documents
are missing fields. When blending term document frequencies, we take the maximum
document frequency across all fields. If one field appears in fewer documents
than another, the blended docFreq can exceed that field's docCount, so its IDF
can become negative. This is because IDF is calculated as
Math.log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)).

This change adjusts the docFreq for each field to Math.min(docCount, docFreq)
so that the IDF can never become negative. It makes sense that the term document
frequency should never exceed the number of documents containing the field.
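
To make the arithmetic concrete, here is a minimal standalone sketch of the IDF formula quoted above and of the Math.min(docCount, docFreq) adjustment. It is not the BlendedTermQuery code; the class name, method names, and the example statistics (5 and 10) are hypothetical and chosen only for illustration.

```java
public class IdfSketch {
    /** IDF as quoted in the description: log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)). */
    public static double idf(long docCount, long docFreq) {
        return Math.log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /** The adjustment described above: docFreq never exceeds the field's docCount. */
    public static long clampDocFreq(long docCount, long docFreq) {
        return Math.min(docCount, docFreq);
    }

    public static void main(String[] args) {
        long docCount = 5;        // hypothetical: documents that contain the sparse field
        long blendedDocFreq = 10; // hypothetical: maximum docFreq taken from a denser field

        // Without the clamp, docFreq > docCount makes the log argument smaller than 1.
        System.out.println("unclamped IDF: " + idf(docCount, blendedDocFreq));

        // With the clamp, docFreq <= docCount, so the log argument stays above 1.
        long clamped = clampDocFreq(docCount, blendedDocFreq);
        System.out.println("clamped IDF:   " + idf(docCount, clamped));
    }
}
```

The formula simplifies to log((docCount + 1) / (docFreq + 0.5)), so for integer statistics it is negative exactly when docFreq > docCount, which is the case the clamp rules out.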

Fixes #44700

elasticsearchmachine · 2022-08-01T22:11:18Z

Hi @jtibshirani, I've created a changelog YAML for you.

elasticsearchmachine · 2022-08-01T22:53:21Z

Hi @jtibshirani, I've updated the changelog YAML for you.

elasticsearchmachine · 2022-08-04T17:36:46Z

Pinging @elastic/es-search (Team:Search)

elasticsearchmachine · 2022-08-04T17:36:46Z

Hi @jtibshirani, I've created a changelog YAML for you.

jtibshirani · 2022-08-04T17:48:30Z

server/src/main/java/org/elasticsearch/lucene/queries/BlendedTermQuery.java

@@ -148,7 +148,10 @@ protected int compare(int i, int j) {
            if (prev > current) {
                actualDf++;
            }
-            contexts[i] = ctx = adjustDF(reader.getContext(), ctx, Math.min(maxDoc, actualDf));
+
+            int docCount = reader.getDocCount(terms[i].field());


This query's score calculation is complex and can be hard to reason about. I liked this change because it doesn't affect the non-broken case at all. It only changes the score if docCount < docFreq, which would have resulted in a negative IDF and a broken score anyway.

romseygeek

I think @jpountz will want to have a look as well but this makes sense to me.

docs/reference/query-dsl/multi-match-query.asciidoc

jtibshirani · 2022-08-15T18:25:18Z

Thanks for the review @romseygeek.

jtibshirani · 2022-08-29T20:46:23Z

@jpountz would you be up for reviewing this? Thanks in advance!

jpountz · 2022-08-30T08:07:08Z

Thanks @jtibshirani for the ping. This looks like the smallest change we could do that would fix the bug, but I worry that it goes against the intent of cross_fields. cross_fields's documentation says "Treats fields with the same analyzer as though they were one big field" and my understanding is that the implementation tries to achieve this by forging term statistics in a way that gives terms (almost) the same IDF on all fields. If we start adjusting docFreq then we're possibly giving very different IDFs to different fields for the same term.

Would it make more sense to adjust docCount to take the max across all fields, which should in turn help guarantee that docFreq is always less than docCount? I know that different fields may be getting very different IDFs today because they have different docCounts, but my understanding is that this is because the cross_fields implementation overlooked sparse fields, and it was not intentional.
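
The trade-off raised in this comment can be sketched with a small standalone example. This is an illustration only, not code from the PR: the class and method names are hypothetical, the statistics (a dense field with docCount 100, a sparse field with docCount 5, a blended docFreq of 10) are made up, and the IDF formula is the one quoted in the PR description.

```java
public class IdfTradeoffSketch {
    /** IDF as quoted in the PR description. */
    public static double idf(long docCount, long docFreq) {
        return Math.log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /** Approach taken in this PR: clamp docFreq to each field's own docCount. */
    public static double idfWithClampedDocFreq(long fieldDocCount, long blendedDocFreq) {
        return idf(fieldDocCount, Math.min(fieldDocCount, blendedDocFreq));
    }

    /** Alternative raised in review: use the max docCount across fields for every field. */
    public static double idfWithSharedDocCount(long maxDocCount, long blendedDocFreq) {
        return idf(maxDocCount, blendedDocFreq);
    }

    public static void main(String[] args) {
        long denseDocCount = 100; // hypothetical dense field
        long sparseDocCount = 5;  // hypothetical sparse field
        long blendedDocFreq = 10; // hypothetical blended docFreq shared by both fields

        // Clamping docFreq keeps both IDFs non-negative, but they can differ a lot.
        System.out.println("clamped, dense:  " + idfWithClampedDocFreq(denseDocCount, blendedDocFreq));
        System.out.println("clamped, sparse: " + idfWithClampedDocFreq(sparseDocCount, blendedDocFreq));

        // Sharing the max docCount would give the term the same IDF on both fields.
        long maxDocCount = Math.max(denseDocCount, sparseDocCount);
        System.out.println("shared docCount: " + idfWithSharedDocCount(maxDocCount, blendedDocFreq));
    }
}
```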

jtibshirani · 2022-08-30T18:03:15Z

@jpountz thanks for looking. I considered adjusting docCount instead, but I could not find a way to change that value, since it is not part of TermStates and is loaded from IndexReader. Only docFreq can be adjusted... here's the javadoc from TermQuery:

  /**
   * Expert: constructs a TermQuery that will use the provided docFreq instead of looking up the
   * docFreq against the searcher.
   */
  public TermQuery(Term t, TermStates states) { ... }

We could attempt a larger refactor, but it seems risky and not the right investment.

As you mentioned, we overlooked the sparse case so the statement "Treats fields with the same analyzer as though they were one big field" was never accurate. I think our options are (1) consider a bigger, principled change to fix the sparse case, (2) throw an exception in the sparse case saying it's not supported, or (3) make a targeted fix like this one. I would vote for (3) as the most user friendly and practical option.

jpountz

Thanks @jtibshirani for the additional context. Now that I better understand what it would take to implement a correct fix, I agree with your suggestion of going with option 3, essentially further redirecting users from the cross_fields mode of multi_match to the combined_fields query.

jpountz · 2022-09-06T14:55:41Z

docs/reference/query-dsl/multi-match-query.asciidoc

-<<query-dsl-combined-fields-query,`combined_fields`>> query, which is also
-term-centric but combines field statistics in a more robust way.
+WARNING: The `cross_fields` type blends field statistics in a complex way that
+can be hard to interpret. You should consider the


Given the discussion on fixing docCount vs. docFreq, I wonder if we should make this warning stronger and point out that scores may not only be hard to interpret (which covers e.g. the way it tries to prefer the most likely field) but also incorrect in case of fields that have different statistics.

In #89016 we adjusted the `cross_fields` scoring formula to prevent negative scores. This fix accidentally dropped another important fix that was added in #41938. Specifically, we need to make sure to take the minimum between the document frequency (`actualDf`) and the minimum total term frequency (`minTTF`). Otherwise, we can produce invalid term statistics where the total term frequency is less than the document frequency. Fixes #90275
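
A rough sketch of the invariant that follow-up describes is shown below. It assumes the two adjustments are simply applied one after the other; the actual code in BlendedTermQuery may combine them differently. The class name, the adjustedDocFreq helper, and the example numbers are hypothetical.

```java
public class TermStatsSketch {
    /**
     * Hypothetical helper: returns a docFreq that is valid with respect to both
     * the field's docCount and the minimum total term frequency (minTTF).
     */
    public static long adjustedDocFreq(long actualDf, long docCount, long minTTF) {
        // Adjustment from #89016: docFreq must not exceed the field's docCount.
        long clampedToDocCount = Math.min(docCount, actualDf);
        // Adjustment described in the follow-up: docFreq must not exceed minTTF,
        // since a term cannot appear in more documents than it has occurrences.
        return Math.min(clampedToDocCount, minTTF);
    }

    public static void main(String[] args) {
        // Hypothetical statistics: blended docFreq 10, field docCount 5, minTTF 3.
        long docFreq = adjustedDocFreq(10, 5, 3);
        System.out.println("adjusted docFreq: " + docFreq);
    }
}
```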

jtibshirani added >bug :Search Relevance/Ranking Scoring, rescoring, rank evaluation. v8.4.0 v8.5.0 labels Aug 1, 2022

jtibshirani force-pushed the cross-fields-score branch from 40d9164 to 78d81e7 Compare August 4, 2022 17:21

jtibshirani marked this pull request as ready for review August 4, 2022 17:36

elasticsearchmachine added the Team:Search Meta label for search team label Aug 4, 2022

Update docs/changelog/89016.yaml

18d6573

jtibshirani added the v7.17.6 label Aug 4, 2022

jtibshirani commented Aug 4, 2022

romseygeek approved these changes Aug 4, 2022

Fix docs typo

50a2790

jtibshirani requested a review from jpountz August 15, 2022 18:25

jtibshirani added v8.4.1 and removed v8.4.0 labels Aug 15, 2022

mark-vieira added v7.17.7 and removed v7.17.6 labels Aug 24, 2022

Merge remote-tracking branch 'upstream/main' into cross-fields-score

04b1d77

jpountz approved these changes Sep 6, 2022

jtibshirani added 2 commits September 6, 2022 11:33

Merge remote-tracking branch 'upstream/main' into cross-fields-score

4669432

Strengthen the warning around using cross_fields

6853a91

jtibshirani added v8.4.2 and removed v8.4.1 labels Sep 6, 2022

jtibshirani merged commit 3c1b070 into elastic:main Sep 6, 2022

jtibshirani deleted the cross-fields-score branch September 6, 2022 20:02

This was referenced Sep 6, 2022

Cross-field type creates broken scores when not all fields have the same docCount #44700

Closed

combined_fields cannot be used be in query_string query. Can we add this functionality #75909

Closed

jtibshirani mentioned this pull request Sep 7, 2022

Avoid negative scores with cross_fields type #89843

Merged

jtibshirani mentioned this pull request Sep 22, 2022

Ensure cross_fields always uses valid term statistics #90278

Merged

This was referenced Sep 23, 2022

Ensure cross_fields always uses valid term statistics (#90278) #90314

Merged

Ensure cross_fields always uses valid term statistics (#90278) #90316

Merged

snikoyo mentioned this pull request Jun 1, 2023

[BUG] function score query returned an invalid (negative) score with multi match cross fields query opensearch-project/OpenSearch#7860

Closed

This was referenced Aug 24, 2023

[BUG] Cross Fields Query can Generate Negative Score opensearch-project/OpenSearch#9542

Closed

Avoid negative scores in multi_match opensearch-project/OpenSearch#9571

Closed
