Add mean row aggregation to HELM summarize #2997

Merged
farzaank merged 25 commits into main from farzaan/aggregate-avg on Sep 25, 2024

Conversation

@farzaank farzaank commented Sep 17, 2024

Adds an option to helm summarize to aggregate each row by the mean of its metric values, as an alternative to the mean win rate.

In addition, it allows multiple types of aggregation to be computed at once (as opposed to just one).
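
To illustrate the difference between the two aggregations, here is a minimal sketch on a toy table. The function names and data are hypothetical and are not taken from summarize.py; the sketch assumes every column is "higher is better", has no missing cells, and that the table has at least two rows.

```python
from statistics import mean
from typing import Dict, List

# Toy leaderboard (hypothetical data): one row per model, one column per
# scenario, higher is better in every column.
TABLE: Dict[str, List[float]] = {
    "model_a": [1.0, 0.25],
    "model_b": [0.5, 0.5],
    "model_c": [0.0, 0.375],
}


def mean_win_rate(table: Dict[str, List[float]]) -> Dict[str, float]:
    """Per row: average over columns of the fraction of other rows it beats."""
    num_columns = len(next(iter(table.values())))
    result = {}
    for name, row in table.items():
        rates = []
        for col in range(num_columns):
            others = [r[col] for other, r in table.items() if other != name]
            wins = sum(1 for value in others if row[col] > value)
            rates.append(wins / len(others))
        result[name] = mean(rates)
    return result


def mean_score(table: Dict[str, List[float]]) -> Dict[str, float]:
    """Per row: plain mean of the row's own values."""
    return {name: mean(row) for name, row in table.items()}


win_rates = mean_win_rate(TABLE)
scores = mean_score(TABLE)
print(win_rates)
print(scores)
```

On this toy data the two aggregations rank the models differently: model_a has the highest mean score, while model_b has the highest mean win rate. That is one reason to show both columns rather than only one.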

@farzaank farzaank marked this pull request as ready for review September 24, 2024 05:46
@farzaank farzaank requested a review from yifanmai September 24, 2024 17:09
src/helm/benchmark/presentation/summarize.py (9 review threads, all outdated and resolved)
@@ -119,6 +119,9 @@ class MetricGroup(Field):
    hide_win_rates: Optional[bool] = None
    """If set to true, do not compute win rates."""

    aggregation_strategies: Optional[List[str]] = None
@percyliang could you review this change to the schema?
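
The hunk above does not show how the new optional field interacts with hide_win_rates. One backward-compatible reading is sketched below, purely as an assumption: an explicit list takes precedence, and an unset field falls back to the previous win-rate behavior. The class and function names are hypothetical, and "mean" as a strategy value is assumed (only "win_rate" appears in the quoted code).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MetricGroupSketch:
    """Hypothetical stand-in for the two MetricGroup fields shown in the diff."""

    hide_win_rates: Optional[bool] = None
    aggregation_strategies: Optional[List[str]] = None


def resolve_strategies(group: MetricGroupSketch) -> List[str]:
    """Assumed fallback rule, not confirmed by the PR text."""
    if group.aggregation_strategies is not None:
        # An explicit list in the schema takes precedence.
        return list(group.aggregation_strategies)
    # Unset field: keep the old behavior driven by hide_win_rates.
    return [] if group.hide_win_rates else ["win_rate"]


print(resolve_strategies(MetricGroupSketch()))
print(resolve_strategies(MetricGroupSketch(hide_win_rates=True)))
print(resolve_strategies(MetricGroupSketch(aggregation_strategies=["win_rate", "mean"])))
```

Under this reading, existing schemas that never set aggregation_strategies would keep producing the same tables as before.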

@yifanmai yifanmai requested a review from percyliang September 24, 2024 20:35
@farzaank farzaank requested a review from yifanmai September 24, 2024 22:03
@yifanmai yifanmai left a comment

Awesome!

if strategy == "win_rate":
    WIN_RATE_AGGREGATION = "mean"
    win_rates = compute_aggregate_row_win_rates(table, aggregation=WIN_RATE_AGGREGATION)
    description = "How many models this model outperform on average (over columns)."
nit: outperform -> outperforms (fix pre-existing typo)
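
The branch quoted above handles a single strategy. A loop that produces one aggregate column per requested strategy could be sketched as follows. This is a simplified illustration, not the code in summarize.py: the registry, the helper functions, and the "mean" description string are assumptions, and rows are plain lists of floats where higher is better.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

Rows = Dict[str, List[float]]
Aggregator = Callable[[Rows], Dict[str, float]]


def row_means(rows: Rows) -> Dict[str, float]:
    """Mean of each row's own values."""
    return {name: mean(values) for name, values in rows.items()}


def row_win_rates(rows: Rows) -> Dict[str, float]:
    """Mean over columns of the fraction of other rows that this row beats."""
    result = {}
    for name, values in rows.items():
        rates = []
        for col, value in enumerate(values):
            others = [v[col] for n, v in rows.items() if n != name]
            rates.append(sum(1 for o in others if value > o) / len(others))
        result[name] = mean(rates)
    return result


# Hypothetical registry: strategy name -> (column description, aggregator).
STRATEGIES: Dict[str, Tuple[str, Aggregator]] = {
    "win_rate": ("How many models this model outperforms on average (over columns).", row_win_rates),
    "mean": ("Mean of this model's values (over columns).", row_means),  # assumed wording
}


def aggregate(rows: Rows, strategies: List[str]) -> Dict[str, Tuple[str, Dict[str, float]]]:
    """Return one (description, per-row values) pair per requested strategy."""
    columns = {}
    for strategy in strategies:
        if strategy not in STRATEGIES:
            raise ValueError(f"Unknown aggregation strategy: {strategy}")
        description, aggregator = STRATEGIES[strategy]
        columns[strategy] = (description, aggregator(rows))
    return columns


example = aggregate({"model_a": [1.0, 0.5], "model_b": [0.0, 0.25]}, ["win_rate", "mean"])
for strategy, (description, values) in example.items():
    print(strategy, description, values)
```

Rejecting unknown strategy names early keeps a typo in the schema from silently dropping a column.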

@yifanmai

Let's merge this first and have @percyliang review post-merge.

@farzaank farzaank merged commit fce1e5f into main Sep 25, 2024
6 checks passed
@farzaank farzaank deleted the farzaan/aggregate-avg branch September 25, 2024 00:31