Add Expected Reciprocal Rank metric #31891

cbuescher · 2018-07-09T09:28:18Z

This change adds Expected Reciprocal Rank (ERR) as a ranking evaluation metric
as described in:

Chapelle, O., Metzler, D., Zhang, Y., & Grinspan, P. (2009).
Expected reciprocal rank for graded relevance.
Proceedings of the 18th ACM Conference on Information and Knowledge Management.
https://doi.org/10.1145/1645953.1646033

ERR is an extension of the classical reciprocal rank to the graded relevance
case and assumes a cascade browsing model. It quantifies the usefulness of a
document at rank i conditioned on the degree of relevance of the items at ranks
less than i. ERR seems to be gaining traction as an alternative to (n)DCG, so it
seems like a good metric to support. ERR also seems to be the default optimization
metric used for training in RankLib, a widely used learning-to-rank library.
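
For readers unfamiliar with the metric, here is a minimal, self-contained sketch of the cascade computation from the paper. It is not the code in this PR: the class and method names (`ErrSketch`, `satisfactionProbability`, `expectedReciprocalRank`) are made up for illustration, and it assumes integer relevance grades with a known maximum grade. A document with grade g satisfies the user with probability (2^g - 1) / 2^maxGrade, and each rank contributes 1/rank weighted by the probability that the user got that far and stopped there.

```java
public final class ErrSketch {

    /** Probability that a document with the given grade satisfies the user (mapping from the paper). */
    static double satisfactionProbability(int grade, int maxGrade) {
        return (Math.pow(2, grade) - 1) / Math.pow(2, maxGrade);
    }

    /**
     * Expected Reciprocal Rank for a ranked list of relevance grades.
     * grades[0] is the grade of the document at rank 1.
     */
    static double expectedReciprocalRank(int[] grades, int maxGrade) {
        double err = 0.0;
        // probability that no document at an earlier rank satisfied the user
        double notYetSatisfied = 1.0;
        for (int i = 0; i < grades.length; i++) {
            double p = satisfactionProbability(grades[i], maxGrade);
            // user reaches rank i + 1 and stops there, discounted by 1 / rank
            err += notYetSatisfied * p / (i + 1);
            notYetSatisfied *= 1 - p;
        }
        return err;
    }

    public static void main(String[] args) {
        // hypothetical ratings on a 0..3 scale, in ranked order
        System.out.println(expectedReciprocalRank(new int[] {3, 2, 0, 1}, 3));
    }
}
```

Because a highly relevant document near the top absorbs most of the probability mass, documents below it contribute very little, which is the main behavioral difference from (n)DCG.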

Relates to #29653

elasticmachine · 2018-07-09T09:28:19Z

Pinging @elastic/es-search-aggs


mayya-sharipova

Thanks @cbuescher! Nice implementation and tests. Just very small editing comments.

I was wondering if we also want to add documentation on this metric to rank-eval.asciidoc?

mayya-sharipova · 2018-07-10T18:17:56Z

modules/rank-eval/src/main/java/org/elasticsearch/index/rankeval/ExpectedReciprocalRank.java

+import static org.elasticsearch.index.rankeval.EvaluationMetric.joinHitsWithRatings;
+
+/**
+ * Implemention of the Expected Reciprocal Rank metric described in:<p>


implementation?

mayya-sharipova · 2018-07-10T18:55:43Z

modules/rank-eval/src/main/java/org/elasticsearch/index/rankeval/ExpectedReciprocalRank.java

+
+    private final double two_pow_maxRelevance;
+
+    public static final String NAME = "err";


Doesn't "err" sound like "error"? Have you thought of using a different name here and in the code? It is fine to keep this name as well if you think so.

cbuescher · 2018-07-11

I can see the problem. On the other hand, the metric is abbreviated like this in the paper itself and in most places citing it. I'm on the fence here. So far we have:

"precision" -> precision at K

"dcg" -> discounted cumulative gain

"mean_reciprocal_rank" -> well... mean reciprocal rank

I kind of like the shorter names; they are only ids in the end. But I can see that this might be a source of confusion. I would go with the full "expected_reciprocal_rank" then. It is a bit long, but one is not supposed to type it all the time. Will give it some additional thought though before changing.

cbuescher · 2018-07-12

Decided to go with "expected_reciprocal_rank" now. It's long, but only four chars more than "mean_reciprocal_rank", so what the heck...

mayya-sharipova · 2018-07-10T18:56:49Z

...es/rank-eval/src/test/java/org/elasticsearch/index/rankeval/ExpectedReciprocalRankTests.java

+        for (int i = 0; i < relevanceRatings.length; i++) {
+            rated.add(new RatedDocument("index", Integer.toString(i), relevanceRatings[i]));
+            hits[i] = new SearchHit(i, Integer.toString(i), new Text("type"), Collections.emptyMap());
+            hits[i].shard(new SearchShardTarget("testnode", new Index("index", "uuid"), 0, null));


If SearchShardTarget is the same for all hits, would it be reasonable to create it outside of the for loop, here and in the following test?



cbuescher added >enhancement review :Search Relevance/Ranking Scoring, rescoring, rank evaluation. v7.0.0 v6.4.0 labels Jul 9, 2018

cbuescher mentioned this pull request Jul 9, 2018

Investigate other ranking evaluation metrics #29653

Closed

cbuescher requested a review from mayya-sharipova July 9, 2018 09:33

cbuescher force-pushed the add-ERR branch from 71e1638 to e7d2f8f Compare July 9, 2018 09:38

Christoph Büscher added 2 commits July 10, 2018 10:58

iter

4e8101b

cbuescher force-pushed the add-ERR branch from e7d2f8f to 4e8101b Compare July 10, 2018 09:57

mayya-sharipova approved these changes Jul 10, 2018


Christoph Büscher added 3 commits July 11, 2018 17:54

Merge branch 'master' into add-ERR

e3eb461

iter

8ee4b0f

Change metric id

cd348a3

cbuescher merged commit 4ae4ac0 into elastic:master Jul 12, 2018

Mpdreamz mentioned this pull request Sep 25, 2018

[meta] 6.4.0 release elastic/elasticsearch-net#3397

Closed

89 tasks

Mpdreamz mentioned this pull request Oct 22, 2018

[meta] 6.5.0 Release elastic/elasticsearch-net#3457

Closed

codebrain mentioned this pull request Jan 28, 2019

[meta] 6.6.0 Release elastic/elasticsearch-net#3552

Closed

48 tasks

jimczi added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
