Add details section for dcg ranking metric #31177
Conversation
While the other two ranking evaluation metrics (precision and reciprocal rank) already provide a more detailed output for how their score is calculated, the discounted cumulative gain metric (dcg) and its normalized variant have been lacking this until now. It's not entirely clear which level of detail might be useful for debugging and understanding the final metric calculation, but this change adds a `metric_details` section to the REST output that contains some information about the evaluation details.
Pinging @elastic/es-search-aggs
    }

    @Override
    public XContentBuilder innerToXContent(XContentBuilder builder, Params params) throws IOException {
The REST output this adds to the response for each rated request is e.g.:
{
"dcg": 0.55,
"unrated_docs": 3
}
for non-normalized dcg and something like this for the normalized variant:
{
"dcg": 0.69,
"ideal_dcg": 0.60,
"normalized_dcg": 1.14,
"unrated_docs": 2
}
While this isn't super helpful for plain dcg (the metric value is already reported elsewhere), the number of unrated documents might be interesting to users or for display in a UI, and the IDCG and normalization are somewhat interesting, I believe.
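Concretely, here is a minimal sketch of how a details object could write these fields. The `innerToXContent` signature is taken from the diff above, and the field names come from the JSON examples; the class layout and variable names are hypothetical and not necessarily what this PR actually does:

```java
import java.io.IOException;

import org.elasticsearch.common.xcontent.ToXContent.Params;
import org.elasticsearch.common.xcontent.XContentBuilder;

// Hypothetical sketch: field names match the JSON examples above,
// but the actual class in this PR may be structured differently.
public class DcgDetailsSketch {
    private final double dcg;
    private final double idealDcg;   // 0 when the non-normalized variant is used
    private final int unratedDocs;

    public DcgDetailsSketch(double dcg, double idealDcg, int unratedDocs) {
        this.dcg = dcg;
        this.idealDcg = idealDcg;
        this.unratedDocs = unratedDocs;
    }

    public XContentBuilder innerToXContent(XContentBuilder builder, Params params) throws IOException {
        builder.field("dcg", dcg);
        if (idealDcg != 0) {
            // only the normalized variant reports the ideal DCG and the ratio
            builder.field("ideal_dcg", idealDcg);
            builder.field("normalized_dcg", dcg / idealDcg);
        }
        builder.field("unrated_docs", unratedDocs);
        return builder;
    }
}
```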
Thanks @cbuescher, the details are useful. I am just wondering if we should report duplicate details (`dcg` in the non-normalized version, and `normalized_dcg` in the normalized version), since they are already reported. But I will leave the decision to you as the main architect of ranking evaluation.
> I am just wondering if we should report duplicate details (`dcg` in the non-normalized version, and `normalized_dcg` in the normalized version)
I was wondering the same actually, but then things like parsing the metric details on the client side suddenly get much more complex: in order to re-create the details object we would have to somehow detect which variant we are currently parsing, and if e.g. the `dcg` value was left out here, we'd have to reach out to the metric's score field, which is parsed on another level, so that gets kind of ugly. I also don't like the redundancy very much, but this way the object stays kind of self-contained. That said, I'll give it another thought...
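To illustrate the parsing concern: with the redundant but self-contained output, a single parser covers both variants without consulting anything parsed elsewhere in the response. A rough sketch, assuming the usual `ConstructingObjectParser` pattern; the class and constant names here are made up for illustration, not the PR's actual code:

```java
import org.elasticsearch.common.ParseField;
import org.elasticsearch.common.xcontent.ConstructingObjectParser;
import org.elasticsearch.common.xcontent.XContentParser;

import static org.elasticsearch.common.xcontent.ConstructingObjectParser.constructorArg;
import static org.elasticsearch.common.xcontent.ConstructingObjectParser.optionalConstructorArg;

// Hypothetical parser sketch; not the PR's actual code.
public class DcgDetailsParserSketch {
    final double dcg;
    final Double idealDcg;       // null for the non-normalized variant
    final Double normalizedDcg;  // null for the non-normalized variant
    final int unratedDocs;

    DcgDetailsParserSketch(double dcg, Double idealDcg, Double normalizedDcg, int unratedDocs) {
        this.dcg = dcg;
        this.idealDcg = idealDcg;
        this.normalizedDcg = normalizedDcg;
        this.unratedDocs = unratedDocs;
    }

    // Because "dcg" is always present, the parser never has to fall back to the
    // metric score that lives on another level of the response.
    private static final ConstructingObjectParser<DcgDetailsParserSketch, Void> PARSER =
        new ConstructingObjectParser<>("dcg_details", args -> new DcgDetailsParserSketch(
            (double) args[0], (Double) args[1], (Double) args[2], (int) args[3]));

    static {
        PARSER.declareDouble(constructorArg(), new ParseField("dcg"));
        PARSER.declareDouble(optionalConstructorArg(), new ParseField("ideal_dcg"));
        PARSER.declareDouble(optionalConstructorArg(), new ParseField("normalized_dcg"));
        PARSER.declareInt(constructorArg(), new ParseField("unrated_docs"));
    }

    static DcgDetailsParserSketch fromXContent(XContentParser parser) {
        return PARSER.apply(parser, null);
    }
}
```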
For reference: https://en.wikipedia.org/wiki/Discounted_cumulative_gain explains some of the calculations that appear in this metric.
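For a quick worked example of those calculations, here is a standalone sketch, assuming the exponential gain form `(2^rel - 1) / log2(rank + 1)`; the sample ratings are made up and the numbers won't match the JSON examples above:

```java
// Standalone illustration of DCG / ideal DCG / normalized DCG; not Elasticsearch code.
public class DcgCalculationExample {

    // dcg = sum over ranks of (2^rel - 1) / log2(rank + 1)
    static double dcg(int[] ratings) {
        double result = 0;
        for (int rank = 1; rank <= ratings.length; rank++) {
            result += (Math.pow(2, ratings[rank - 1]) - 1) / (Math.log(rank + 1) / Math.log(2));
        }
        return result;
    }

    public static void main(String[] args) {
        int[] searchOrder = {3, 2, 3, 0, 1, 2};  // ratings in the order the search returned the docs
        int[] idealOrder = {3, 3, 2, 2, 1, 0};   // the same ratings sorted descending
        double dcg = dcg(searchOrder);            // ~13.85
        double idealDcg = dcg(idealOrder);        // ~14.60
        System.out.println("dcg: " + dcg);
        System.out.println("ideal_dcg: " + idealDcg);
        System.out.println("normalized_dcg: " + (dcg / idealDcg));  // ~0.95
    }
}
```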
* master:
  - Upgrade to Lucene-7.4.0-snapshot-518d303506 (#31360)
  - Rankeval: Fold template test project into main module (#31203)
  - Add QA project and fixture based test for discovery-ec2 plugin (#31107)
  - [Docs] Remove reference to repository-s3 plugin creating an S3 bucket (#31359)
  - REST Client: NodeSelector for node attributes (#31296)
  - LLClient: Fix assertion on windows
  - Add details section for dcg ranking metric (#31177)
  - [ML] Re-enable tests muted in #30982

* 6.x:
  - Upgrade to Lucene-7.4.0-snapshot-518d303506 (#31360)
  - [ML] Implement new rules design (#31110) (#31294)
  - Remove RestGetAllAliasesAction (#31308)
  - CCS: don't proxy requests for already connected node (#31273)
  - Rankeval: Fold template test project into main module (#31203)
  - [Docs] Remove reference to repository-s3 plugin creating an S3 bucket (#31359)
  - More detailed tracing when writing metadata (#31319)
  - Add details section for dcg ranking metric (#31177)
While the other two ranking evaluation metrics (precision and reciprocal rank) already provide a more detailed output for how their score is calculated, the discounted cumulative gain metric (dcg) and its normalized variant have been lacking this until now. It's not totally clear which level of detail might be useful for debugging and understanding the final metric calculation, but this change adds a `metric_details` section to the REST output that contains some information about the evaluation details (like the number of unlabeled docs, the normalization factor etc.).