
Add multilabel classification metrics #1408

Merged

Conversation

JoelNiklaus (Contributor)

This is a WIP for adding population-level classification metrics for multilabel text classification tasks. Any feedback is welcome, especially on how to test it.
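For readers skimming the thread: "population-level" here means aggregate scores over the whole evaluation set rather than per-instance scores. A minimal sketch with scikit-learn (the label names and the use of `MultiLabelBinarizer` are illustrative assumptions, not this PR's actual implementation):

```python
from sklearn.metrics import f1_score
from sklearn.preprocessing import MultiLabelBinarizer

# Illustrative gold and predicted label sets (hypothetical data).
y_true = [{"law", "tax"}, {"health"}, {"law"}]
y_pred = [{"law"}, {"health", "tax"}, {"law"}]

# Binarize the label sets into indicator matrices over the full label space.
mlb = MultiLabelBinarizer()
mlb.fit(y_true + y_pred)
true_matrix = mlb.transform(y_true)
pred_matrix = mlb.transform(y_pred)

# Population-level (aggregate) scores rather than per-instance scores.
print("micro F1:", f1_score(true_matrix, pred_matrix, average="micro"))
print("macro F1:", f1_score(true_matrix, pred_matrix, average="macro"))
```

Micro-F1 pools every label decision into a single contingency table; macro-F1 averages the per-label F1 scores, weighting rare labels equally.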

src/helm/benchmark/metrics/classification_metrics.py (outdated review thread, resolved)
@@ -218,10 +218,15 @@ def construct_example_prompt(self, instance: Instance, include_output: bool, ref

# References (optionally) and output
output: str

delimiter = ","
Collaborator

Might want to try with and without space
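To illustrate the suggestion (a toy comparison, not PR code): the delimiter choice affects both how reference texts are joined into the prompt and what the model is likely to echo back:

```python
correct = ["law", "tax"]

print(",".join(correct))   # "law,tax"
print(", ".join(correct))  # "law, tax" -- closer to natural model output

# Stripping each piece when parsing tolerates either variant:
print([label.strip() for label in "law, tax".split(",")])  # ['law', 'tax']
```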

if not correct_references:
output = "n/a"
else:
output = delimiter.join([correct_reference.output.text for correct_reference in correct_references])
Collaborator

Note to self: This might change instances if we have scenarios with multiple correct references somehow (which should not happen).
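A hedged sketch of how that invariant could be made explicit (`join_references`, its arguments, and the guard are hypothetical, not part of this PR):

```python
from typing import List

def join_references(reference_texts: List[str], delimiter: str = ",",
                    multilabel: bool = False) -> str:
    # Hypothetical guard: fail fast instead of silently changing the
    # instance text when a single-label scenario carries several
    # correct references.
    if not multilabel and len(reference_texts) > 1:
        raise ValueError("Multiple correct references in a single-label scenario")
    # Mirrors the diff above: join the texts, or fall back to "n/a".
    return delimiter.join(reference_texts) if reference_texts else "n/a"

print(join_references(["tax"]))                          # tax
print(join_references(["tax", "law"], multilabel=True))  # tax,law
```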

src/helm/benchmark/metrics/classification_metrics.py (outdated review thread, resolved)
src/helm/benchmark/run_specs.py (outdated review thread, resolved)
@yifanmai (Collaborator) left a comment

Also modify test_classification_metrics.py; the existing tests should check that the single-label case continues to work.
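A sketch of the kind of regression test meant here (pytest-style; `split_prediction` is a hypothetical stand-in for however the metric tokenizes predictions, with `delimiter=None` modeling the original single-label behavior):

```python
from typing import List, Optional

def split_prediction(prediction: str, delimiter: Optional[str]) -> List[str]:
    # Hypothetical stand-in: None means single-label mode, never split.
    if delimiter is None:
        return [prediction]
    return [label.strip() for label in prediction.split(delimiter)]

def test_single_label_case_unchanged():
    # Pre-existing behavior: a prediction containing a comma
    # must remain a single label.
    assert split_prediction("yes, definitely", None) == ["yes, definitely"]

def test_multilabel_case_splits():
    assert split_prediction("law, tax", ",") == ["law", "tax"]
```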

@JoelNiklaus (Contributor, Author)

I updated the PR. @yifanmai, would you mind taking a look?

@yifanmai (Collaborator) left a comment

Looks mostly good; we just need to make sure we don't accidentally split predictions in the single-label classification case.
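To make the risk concrete (a toy illustration, not HELM code): unconditionally splitting on the delimiter corrupts single-label predictions that legitimately contain commas:

```python
# A single-label prediction whose class name happens to contain commas.
prediction = "1,000-5,000 employees"

print(prediction.split(","))  # ['1', '000-5', '000 employees'] -- corrupted

# Gating the split on a multilabel flag keeps the single-label path
# intact (`multilabel` is a hypothetical configuration knob):
multilabel = False
labels = prediction.split(",") if multilabel else [prediction]
print(labels)                 # ['1,000-5,000 employees']
```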

src/helm/benchmark/metrics/classification_metrics.py (four outdated review threads, resolved)
@JoelNiklaus (Contributor, Author)

Thank you so much for the review! I addressed the requested changes.

@yifanmai (Collaborator)

Looks good. Thanks!

@yifanmai yifanmai merged commit eecea63 into stanford-crfm:main Mar 22, 2023