Rename ranking evaluation `quality_level` to `metric_score` #32168
Conversation
The notion of "quality" seems to be a somewhat overloaded term in the search ranking evaluation context. It's usually used to describe certain levels of "good" vs. "bad" of a search result with respect to the user's needs. We currently report the result of the evaluation metric calculation as `quality_level`, which I find a bit misleading now. This changes it to something more neutral like `metric_score`.
Pinging @elastic/es-search-aggs
@@ -122,7 +122,7 @@ public XContentBuilder toXContent(XContentBuilder builder, Params params) throws
         return builder;
     }

-    private static final ParseField QUALITY_LEVEL_FIELD = new ParseField("quality_level");
+    static final ParseField METRIC_SCORE_FIELD = new ParseField("metric_score");
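For readers less familiar with the XContent layer, here is a minimal, hypothetical sketch (the class and field names below are simplified assumptions, not the actual rank-eval class touched by this PR) of how a ParseField rename changes the key that ends up in the JSON response:

```java
import java.io.IOException;

import org.elasticsearch.common.ParseField;
import org.elasticsearch.common.xcontent.ToXContent;
import org.elasticsearch.common.xcontent.XContentBuilder;

// Simplified, hypothetical example; the real code lives in the rank-eval module.
class EvalQualitySketch {

    // The renamed field: responses previously used the key "quality_level".
    static final ParseField METRIC_SCORE_FIELD = new ParseField("metric_score");

    private final double metricScore;

    EvalQualitySketch(double metricScore) {
        this.metricScore = metricScore;
    }

    // The preferred name of the ParseField is the key written to the response body,
    // so after this change clients see "metric_score" instead of "quality_level".
    XContentBuilder toXContent(XContentBuilder builder, ToXContent.Params params) throws IOException {
        builder.startObject();
        builder.field(METRIC_SCORE_FIELD.getPreferredName(), metricScore);
        builder.endObject();
        return builder;
    }
}
```

Note that ParseField can also carry deprecated alternative names, but the diff above replaces the field outright, so clients that read `quality_level` from the response need to switch to the new key.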
An alternative name I thought about briefly was "evaluation_score", not sure if that sounds any better.
Changes LGTM! +1 for removing `quality_level`. `metric_score` sounds very good, as all of these are metrics. Wikipedia also calls them "measure" or "evaluation measure", so another alternative could be `measure_value`.
@mayya-sharipova thanks for the review. So would you prefer "evaluation_measure" or "measure_value" to "metric_score"? I just looked up some usages of "evaluation measure" and it seems to be used more to describe the method used to evaluate something rather than the value or score of a particular thing (as in "Accuracy is a commonly used evaluation measure"). On the other hand, we already use "metric" in the request and refer to all the evaluation method implementations as "metrics" at the moment. I'm on the fence, will give it another thought before merging.
@cbuescher Thanks for thoroughly considering possible options. I think your suggestion of `metric_score` works best.
The notion of "quality" is an overloaded term in the search ranking evaluation context. It's usually used to describe certain levels of "good" vs. "bad" of a search result with respect to the user's information need. We currently report the result of the ranking evaluation as `quality_level`, which is a bit misleading. This changes the response parameter name to `metric_score`, which fits better.
* 6.x:
  Security: revert to old way of merging automata (#32254)
  Fix a test bug in RangeQueryBuilderTests introduced in the field aliases backport.
  Introduce Application Privileges with support for Kibana RBAC (#32309)
  Undo a debugging change that snuck in during the field aliases merge.
  [test] port linux package packaging tests (#31943)
  Painless: Update More Methods to New Naming Scheme (#32305)
  Tribe: Add error with secure settings copied to tribe (#32298)
  Add V_6_3_3 version constant
  Add ERR to ranking evaluation documentation (#32314)
  [DOCS] Added link to 6.3.2 RNs
  [DOCS] Updates 6.3.2 release notes with PRs from ml-cpp repo (#32334)
  [Kerberos] Add Kerberos authentication support (#32263)
  [ML] Extract persistent task methods from MlMetadata (#32319)
  Backport - Add Snapshots Status API to High Level Rest Client (#32295)
  Make release notes ignore the `>test-failure` label. (#31309)
  [DOCS] Adds release highlights for search for 6.4 (#32095)
  Allow Integ Tests to run in a FIPS-140 JVM (#32316)
  Add support for field aliases to 6.x. (#32184)
  Register ERR metric with NamedXContentRegistry (#32320)
  fixes broken build for third-party-tests (#32315) Relates #31918 / Closes infra/issues/6085
  [DOCS] Rollup Caps API incorrectly mentions GET Jobs API (#32280)
  Rest HL client: Add put watch action (#32026) (#32191)
  Add WeightedAvg metric aggregation (#31037)
  Consistent encoder names (#29492)
  Switch monitoring to new style Requests (#32255)
  specify subdirs of lib, bin, modules in package (#32253)
  Rename ranking evaluation `quality_level` to `metric_score` (#32168)
  Add new permission for JDK11 to load JAAS libraries (#32132)
  Switch x-pack:core to new style Requests (#32252)
  Watcher: Store username on watch execution (#31873)
  Silence SSL reload test that fails on JDK 11
  Painless: Clean up add methods in PainlessLookup (#32258)
  CCE when re-throwing "shard not available" exception in TransportShardMultiGetAction (#32185)
  Fail shard if IndexShard#storeStats runs into an IOException (#32241)
  Fix `range` queries on `_type` field for singe type indices (#31756) (#32161)
  AwaitsFix RecoveryIT#testHistoryUUIDIsGenerated
  Add new fields to monitoring template for Beats state (#32085) (#32273)
  [TEST] improve REST high-level client naming conventions check (#32244)
  Check that client methods match API defined in the REST spec (#31825)
* master:
  Security: revert to old way of merging automata (#32254)
  Networking: Fix test leaking buffer (#32296)
  Undo a debugging change that snuck in during the field aliases merge.
  Painless: Update More Methods to New Naming Scheme (#32305)
  [TEST] Fix assumeFalse -> assumeTrue in SSLReloadIntegTests
  Ingest: Support integer and long hex values in convert (#32213)
  Introduce fips_mode setting and associated checks (#32326)
  Add V_6_3_3 version constant
  [DOCS] Removed extraneous callout number.
  Rest HL client: Add put license action (#32214)
  Add ERR to ranking evaluation documentation (#32314)
  Introduce Application Privileges with support for Kibana RBAC (#32309)
  Build: Shadow x-pack:protocol into x-pack:plugin:core (#32240)
  [Kerberos] Add Kerberos authentication support (#32263)
  [ML] Extract persistent task methods from MlMetadata (#32319)
  Add Restore Snapshot High Level REST API
  Register ERR metric with NamedXContentRegistry (#32320)
  fixes broken build for third-party-tests (#32315)
  Allow Integ Tests to run in a FIPS-140 JVM (#31989)
  [DOCS] Rollup Caps API incorrectly mentions GET Jobs API (#32280)
  awaitsfix testRandomClusterStateUpdates
  [TEST] add version skip to weighted_avg tests
  Consistent encoder names (#29492)
  Add WeightedAvg metric aggregation (#31037)
  Switch monitoring to new style Requests (#32255)
  Rename ranking evaluation `quality_level` to `metric_score` (#32168)
  Fix a test bug around nested aggregations and field aliases. (#32287)
  Add new permission for JDK11 to load JAAS libraries (#32132)
  Silence SSL reload test that fails on JDK 11
  [test] package pre-install java check (#32259)
  specify subdirs of lib, bin, modules in package (#32253)
  Switch x-pack:core to new style Requests (#32252)
  awaitsfix SSLConfigurationReloaderTests
  Painless: Clean up add methods in PainlessLookup (#32258)
  Fail shard if IndexShard#storeStats runs into an IOException (#32241)
  AwaitsFix RecoveryIT#testHistoryUUIDIsGenerated
  Remove unnecessary warning supressions (#32250)
  CCE when re-throwing "shard not available" exception in TransportShardMultiGetAction (#32185)
  Add new fields to monitoring template for Beats state (#32085)