Rename ranking evaluation `quality_level` to `metric_score` #32168
Conversation
The notion of "quality" seems to be a somewhat overloaded term in the search ranking evaluation context. It's usually used to describe certain levels of "good" vs. "bad" of a search result with respect to the user's needs. We currently report the result of the evaluation metric calculation as `quality_level`, which I find a bit misleading now. This changes it to something more neutral like `metric_score`.
Pinging @elastic/es-search-aggs
@@ -122,7 +122,7 @@ public XContentBuilder toXContent(XContentBuilder builder, Params params) throws
         return builder;
     }

-    private static final ParseField QUALITY_LEVEL_FIELD = new ParseField("quality_level");
+    static final ParseField METRIC_SCORE_FIELD = new ParseField("metric_score");
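For readers less familiar with the XContent layer, here is a minimal, hypothetical sketch (the class and field names below are simplified assumptions, not the actual rank-eval class touched by this PR) of how a ParseField rename changes the key that ends up in the JSON response:

```java
import java.io.IOException;

import org.elasticsearch.common.ParseField;
import org.elasticsearch.common.xcontent.ToXContent;
import org.elasticsearch.common.xcontent.XContentBuilder;

// Simplified, hypothetical example; the real code lives in the rank-eval module.
class EvalQualitySketch {

    // The renamed field: responses previously used the key "quality_level".
    static final ParseField METRIC_SCORE_FIELD = new ParseField("metric_score");

    private final double metricScore;

    EvalQualitySketch(double metricScore) {
        this.metricScore = metricScore;
    }

    // The preferred name of the ParseField is the key written to the response body,
    // so after this change clients see "metric_score" instead of "quality_level".
    XContentBuilder toXContent(XContentBuilder builder, ToXContent.Params params) throws IOException {
        builder.startObject();
        builder.field(METRIC_SCORE_FIELD.getPreferredName(), metricScore);
        builder.endObject();
        return builder;
    }
}
```

Note that ParseField can also carry deprecated alternative names, but the diff above replaces the field outright, so clients that read `quality_level` from the response need to switch to the new key.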
An alternative name I thought about briefly was "evaluation_score", not sure if that sounds any better.
Changes LGTM! +1 for removing `quality_level`. `metric_score` sounds very good, as all of these are metrics. Wikipedia also calls them "measure" or "evaluation measure", so another alternative could be `measure_value`.
@mayya-sharipova thanks for the review. So would you prefer "evaluation_measure" or "measure_value" to "metric_score"? I just looked up some usages of "evaluation measure" and it seems to be used more to describe the method used to evaluate something rather than the value or score of a particular thing (as in "Accuracy is a commonly used evaluation measure"). On the other hand, we already use "metric" in the request and refer to all the evaluation method implementations as "metrics" at the moment. I'm on the fence, will give it another thought before merging.
@cbuescher Thanks for thoroughly considering possible options. I think your suggestion of `metric_score` works best.
The notion of "quality" is an overloaded term in the search ranking evaluation context. It's usually used to describe certain levels of "good" vs. "bad" of a search result with respect to the user's information need. We currently report the result of the ranking evaluation as `quality_level`, which is a bit misleading. This changes the response parameter name to `metric_score`, which fits better.
* 6.x:
  Security: revert to old way of merging automata (#32254)
  Fix a test bug in RangeQueryBuilderTests introduced in the field aliases backport.
  Introduce Application Privileges with support for Kibana RBAC (#32309)
  Undo a debugging change that snuck in during the field aliases merge.
  [test] port linux package packaging tests (#31943)
  Painless: Update More Methods to New Naming Scheme (#32305)
  Tribe: Add error with secure settings copied to tribe (#32298)
  Add V_6_3_3 version constant
  Add ERR to ranking evaluation documentation (#32314)
  [DOCS] Added link to 6.3.2 RNs
  [DOCS] Updates 6.3.2 release notes with PRs from ml-cpp repo (#32334)
  [Kerberos] Add Kerberos authentication support (#32263)
  [ML] Extract persistent task methods from MlMetadata (#32319)
  Backport - Add Snapshots Status API to High Level Rest Client (#32295)
  Make release notes ignore the `>test-failure` label. (#31309)
  [DOCS] Adds release highlights for search for 6.4 (#32095)
  Allow Integ Tests to run in a FIPS-140 JVM (#32316)
  Add support for field aliases to 6.x. (#32184)
  Register ERR metric with NamedXContentRegistry (#32320)
  fixes broken build for third-party-tests (#32315) Relates #31918 / Closes infra/issues/6085
  [DOCS] Rollup Caps API incorrectly mentions GET Jobs API (#32280)
  Rest HL client: Add put watch action (#32026) (#32191)
  Add WeightedAvg metric aggregation (#31037)
  Consistent encoder names (#29492)
  Switch monitoring to new style Requests (#32255)
  specify subdirs of lib, bin, modules in package (#32253)
  Rename ranking evaluation `quality_level` to `metric_score` (#32168)
  Add new permission for JDK11 to load JAAS libraries (#32132)
  Switch x-pack:core to new style Requests (#32252)
  Watcher: Store username on watch execution (#31873)
  Silence SSL reload test that fails on JDK 11
  Painless: Clean up add methods in PainlessLookup (#32258)
  CCE when re-throwing "shard not available" exception in TransportShardMultiGetAction (#32185)
  Fail shard if IndexShard#storeStats runs into an IOException (#32241)
  Fix `range` queries on `_type` field for singe type indices (#31756) (#32161)
  AwaitsFix RecoveryIT#testHistoryUUIDIsGenerated
  Add new fields to monitoring template for Beats state (#32085) (#32273)
  [TEST] improve REST high-level client naming conventions check (#32244)
  Check that client methods match API defined in the REST spec (#31825)
* master:
  Security: revert to old way of merging automata (#32254)
  Networking: Fix test leaking buffer (#32296)
  Undo a debugging change that snuck in during the field aliases merge.
  Painless: Update More Methods to New Naming Scheme (#32305)
  [TEST] Fix assumeFalse -> assumeTrue in SSLReloadIntegTests
  Ingest: Support integer and long hex values in convert (#32213)
  Introduce fips_mode setting and associated checks (#32326)
  Add V_6_3_3 version constant
  [DOCS] Removed extraneous callout number.
  Rest HL client: Add put license action (#32214)
  Add ERR to ranking evaluation documentation (#32314)
  Introduce Application Privileges with support for Kibana RBAC (#32309)
  Build: Shadow x-pack:protocol into x-pack:plugin:core (#32240)
  [Kerberos] Add Kerberos authentication support (#32263)
  [ML] Extract persistent task methods from MlMetadata (#32319)
  Add Restore Snapshot High Level REST API
  Register ERR metric with NamedXContentRegistry (#32320)
  fixes broken build for third-party-tests (#32315)
  Allow Integ Tests to run in a FIPS-140 JVM (#31989)
  [DOCS] Rollup Caps API incorrectly mentions GET Jobs API (#32280)
  awaitsfix testRandomClusterStateUpdates
  [TEST] add version skip to weighted_avg tests
  Consistent encoder names (#29492)
  Add WeightedAvg metric aggregation (#31037)
  Switch monitoring to new style Requests (#32255)
  Rename ranking evaluation `quality_level` to `metric_score` (#32168)
  Fix a test bug around nested aggregations and field aliases. (#32287)
  Add new permission for JDK11 to load JAAS libraries (#32132)
  Silence SSL reload test that fails on JDK 11
  [test] package pre-install java check (#32259)
  specify subdirs of lib, bin, modules in package (#32253)
  Switch x-pack:core to new style Requests (#32252)
  awaitsfix SSLConfigurationReloaderTests
  Painless: Clean up add methods in PainlessLookup (#32258)
  Fail shard if IndexShard#storeStats runs into an IOException (#32241)
  AwaitsFix RecoveryIT#testHistoryUUIDIsGenerated
  Remove unnecessary warning supressions (#32250)
  CCE when re-throwing "shard not available" exception in TransportShardMultiGetAction (#32185)
  Add new fields to monitoring template for Beats state (#32085)