Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename ranking evaluation quality_level to metric_score #32168

Merged
merged 5 commits into from
Jul 23, 2018

Conversation

cbuescher
Copy link
Member

The notion of "quality" seems to be a somewhat overloaded term in the search
ranking evaluation context. Its usually used to decribe certain levels of "good"
vs. "bad" of a seach result wrt. the user needs. We currently report the result
of the evaluation metric calculation as quality_level which I find a bit
missleading now. This changes it to something more neutral like metric_score.

The notion of "quality" seems to be a somewhat overloaded term in the search
ranking evaluation context. Its usually used to decribe certain levels of "good"
vs. "bad" of a seach result wrt. the user needs. We currently report the result
of the evaluation metric calculation as `quality_level` which I find a bit
missleading now. This changes it to something more neutral like `metric_score`.
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search-aggs

@@ -122,7 +122,7 @@ public XContentBuilder toXContent(XContentBuilder builder, Params params) throws
return builder;
}

private static final ParseField QUALITY_LEVEL_FIELD = new ParseField("quality_level");
static final ParseField METRIC_SCORE_FIELD = new ParseField("metric_score");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative name I though about briefly was "evaluation_score", not sure if that sounds any better.

@mayya-sharipova mayya-sharipova self-requested a review July 18, 2018 16:57
Copy link
Contributor

@mayya-sharipova mayya-sharipova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM! +1 for removing quality_level.
metric_score sounds very good, as all of these are metrics.
Wikipedia also calls them "measure" or "evaluation measure", so another alternative can be measure_value.

@cbuescher
Copy link
Member Author

@mayya-sharipova thanks for the review. So would you prefer "evaluation_measure" or "measure_value" to "metric_score"? I just looked up some usages of "evaluation measure" and it seems to be used more to decribe the method used to evaluate something rather than the value or score of a particular thing (as in "Accuracy is a commonly used
evaluation measure in machine learning classification"). I think in this case we are naming the value/score. Manning/Raghavan/Schütze seem to mostly use "measure" in their "Information Retrieval" book, but I also found some places where e.g. "R-precsision" was calles a "metric". I'll give "measure_value" a second thought, but maybe you have preferences as well?

On the other hand we already use "metric" in the request, and refer to all the evaluation method implementations as "metrics" at the moment. I'm on the fence, will give it another though before merging.

@mayya-sharipova
Copy link
Contributor

@cbuescher Thanks for thoroughly considering possible options. I think your suggestion of metric_score is the best, especially if we also use it in a request.

@cbuescher cbuescher merged commit fe6bb75 into elastic:master Jul 23, 2018
cbuescher pushed a commit that referenced this pull request Jul 23, 2018
The notion of "quality" is an overloaded term in the search ranking evaluation 
context. Its usually used to decribe certain levels of "good" vs. "bad" of a 
seach result with respect to the users information need. We currently report the 
result of the ranking evaluation as `quality_level` which is a bit missleading.
This changes the response parameter name to `metric_score` which fits better.
dnhatn added a commit that referenced this pull request Jul 25, 2018
* 6.x:
  Security: revert to old way of merging automata (#32254)
  Fix a test bug in RangeQueryBuilderTests introduced in the field aliases backport.
  Introduce Application Privileges with support for Kibana RBAC (#32309)
  Undo a debugging change that snuck in during the field aliases merge.
  [test] port linux package packaging tests (#31943)
  Painless: Update More Methods to New Naming Scheme (#32305)
  Tribe: Add error with secure settings copied to tribe (#32298)
  Add V_6_3_3 version constant
  Add ERR to ranking evaluation documentation (#32314)
  [DOCS] Added link to 6.3.2 RNs
  [DOCS] Updates 6.3.2 release notes with PRs from ml-cpp repo (#32334)
  [Kerberos] Add Kerberos authentication support (#32263)
  [ML] Extract persistent task methods from MlMetadata (#32319)
  Backport - Add Snapshots Status API to High Level Rest Client (#32295)
  Make release notes ignore the `>test-failure` label. (#31309)
  [DOCS] Adds release highlights for search for 6.4 (#32095)
  Allow Integ Tests to run in a FIPS-140 JVM (#32316)
  Add support for field aliases to 6.x. (#32184)
  Register ERR metric with NamedXContentRegistry (#32320)
  fixes broken build for third-party-tests (#32315) Relates #31918 / Closes infra/issues/6085
  [DOCS] Rollup Caps API incorrectly mentions GET Jobs API (#32280)
  Rest HL client: Add put watch action (#32026) (#32191)
  Add WeightedAvg metric aggregation (#31037)
  Consistent encoder names (#29492)
  Switch monitoring to new style Requests (#32255)
  specify subdirs of lib, bin, modules in package (#32253)
  Rename ranking evaluation `quality_level` to `metric_score` (#32168)
  Add new permission for JDK11 to load JAAS libraries (#32132)
  Switch x-pack:core to new style Requests (#32252)
  Watcher: Store username on watch execution (#31873)
  Silence SSL reload test that fails on JDK 11
  Painless: Clean up add methods in PainlessLookup (#32258)
  CCE when re-throwing "shard not available" exception in TransportShardMultiGetAction (#32185)
  Fail shard if IndexShard#storeStats runs into an IOException (#32241)
  Fix `range` queries on `_type` field for singe type indices (#31756) (#32161)
  AwaitsFix RecoveryIT#testHistoryUUIDIsGenerated
  Add new fields to monitoring template for Beats state (#32085) (#32273)
  [TEST] improve REST high-level client naming conventions check (#32244)
  Check that client methods match API defined in the REST spec (#31825)
dnhatn added a commit that referenced this pull request Jul 25, 2018
* master:
  Security: revert to old way of merging automata (#32254)
  Networking: Fix test leaking buffer (#32296)
  Undo a debugging change that snuck in during the field aliases merge.
  Painless: Update More Methods to New Naming Scheme (#32305)
  [TEST] Fix assumeFalse -> assumeTrue in SSLReloadIntegTests
  Ingest: Support integer and long hex values in convert (#32213)
  Introduce fips_mode setting and associated checks (#32326)
  Add V_6_3_3 version constant
  [DOCS] Removed extraneous callout number.
  Rest HL client: Add put license action (#32214)
  Add ERR to ranking evaluation documentation (#32314)
  Introduce Application Privileges with support for Kibana RBAC (#32309)
  Build: Shadow x-pack:protocol into x-pack:plugin:core (#32240)
  [Kerberos] Add Kerberos authentication support (#32263)
  [ML] Extract persistent task methods from MlMetadata (#32319)
  Add Restore Snapshot High Level REST API
  Register ERR metric with NamedXContentRegistry (#32320)
  fixes broken build for third-party-tests (#32315)
  Allow Integ Tests to run in a FIPS-140 JVM (#31989)
  [DOCS] Rollup Caps API incorrectly mentions GET Jobs API (#32280)
  awaitsfix testRandomClusterStateUpdates
  [TEST] add version skip to weighted_avg tests
  Consistent encoder names (#29492)
  Add WeightedAvg metric aggregation (#31037)
  Switch monitoring to new style Requests (#32255)
  Rename ranking evaluation `quality_level` to `metric_score` (#32168)
  Fix a test bug around nested aggregations and field aliases. (#32287)
  Add new permission for JDK11 to load JAAS libraries (#32132)
  Silence SSL reload test that fails on JDK 11
  [test] package pre-install java check (#32259)
  specify subdirs of lib, bin, modules in package (#32253)
  Switch x-pack:core to new style Requests (#32252)
  awaitsfix SSLConfigurationReloaderTests
  Painless: Clean up add methods in PainlessLookup (#32258)
  Fail shard if IndexShard#storeStats runs into an IOException (#32241)
  AwaitsFix RecoveryIT#testHistoryUUIDIsGenerated
  Remove unnecessary warning supressions (#32250)
  CCE when re-throwing "shard not available" exception in TransportShardMultiGetAction (#32185)
  Add new fields to monitoring template for Beats state (#32085)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants