Add ImageClassificationEvaluator
#173
Conversation
The documentation is not available anymore as the PR was closed or merged.
Thanks a lot for looking into it @fxmarty, some minor comments inline :)
metric = self.prepare_metric(metric)

references = data[label_column]
predictions = self._compute_predictions(pipe, data[input_column], label_mapping=label_mapping)
Given the current refactoring, it might be more readable to do _compute_predictions inline (same for image_classification) - splitting it into a separate function may have been a suboptimal idea from the get-go.
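As a sketch of the inlined form being suggested here: the function name, signature, and surrounding Evaluator API below are assumptions based on the quoted snippet, not code from the PR.

```python
# Hypothetical sketch of inlining the prediction step instead of keeping a
# separate _compute_predictions helper. The names (pipe, label_mapping,
# metric) mirror the snippet above; everything else is illustrative.
def compute(pipe, data, input_column, label_column, label_mapping, metric):
    references = data[label_column]
    # Previously hidden inside _compute_predictions; inlined for readability.
    raw_predictions = pipe(data[input_column])
    predictions = [
        label_mapping[pred["label"]] if label_mapping is not None else pred["label"]
        for pred in raw_predictions
    ]
    return metric.compute(predictions=predictions, references=references)
```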
That's a good point, let me know if the modifications I made in fbf66a8 are fine.
Thanks @fxmarty, looks great! Accepting the PR; however, it looks like some tests are failing (problems with imports?). Let's make sure these are resolved before merging.
Force-pushed from 722a7ec to c88815a
@ola13 Should be fine now, we needed a rebase :)
Looks good, just a question: why do we need a bigger machine for the tests? Thanks for adding this ❤️
@@ -8,7 +8,7 @@ jobs:
     working_directory: ~/evaluate
     docker:
       - image: cimg/python:3.7
-    resource_class: medium
+    resource_class: large
Why is the large class needed? Because of the evaluator/trainer test?
Yes, it is because of the evaluator/trainer test. I don't understand why; running the tests locally, I was using at most 800 MB of RAM for the evaluator/trainer test.
With medium: https://app.circleci.com/pipelines/github/huggingface/evaluate/468/workflows/5b57ecc8-abd1-4fdb-a1fb-db655510fc60/jobs/1409/resources
With large: https://app.circleci.com/pipelines/github/huggingface/evaluate/479/workflows/7f837ed5-bb94-4c60-a26d-9ba0bfae3c93/jobs/1442/resources
The models in the parity tests are 45 MB and 18 MB. The "beans" dataset looks to be under 200 MB ( https://huggingface.co/datasets/beans/tree/main/data ), and sst2 should be under 10 MB. So I am not sure what the issue is.
The test can be run locally to check memory usage: pytest tests/test_trainer_evaluator_parity.py
@@ -37,7 +37,7 @@ def setUp(self):
         )

     def tearDown(self):
-        shutil.rmtree(self.dir_path, onerror=onerror)
+        shutil.rmtree(self.dir_path, ignore_errors=True)
Does that mean that on Windows the folder will just not be removed in that case?
On Windows, read-only files will not be removed, typically the .git content. I remember I still had issues with the onerror handler from https://stackoverflow.com/a/2656405 . Would you prefer to roll back to an error handler that makes sure read-only files are deleted? I agree it is cleaner if anybody runs the tests on Windows.
try:
    from transformers import FeatureExtractionMixin, Pipeline, PreTrainedModel, TFPreTrainedModel
I think FeatureExtractionMixin was only added in transformers==4.17.0. I think we need to update setup.py to require the right version (the tests were failing for me locally even when installing the evaluator extra): huggingface/transformers@b5c6fde
Could you confirm @fxmarty?
Let's stall #167, which requires some more work, and try to merge this one first instead.
Image classification is a much simpler task, and the pipeline for evaluation works out of the box.
I did some refactoring to avoid copying code, following a suggestion from @ola13.