Add ImageClassificationEvaluator #173

Merged: 12 commits merged into huggingface:main on Jul 5, 2022

Conversation

@fxmarty (Contributor) commented Jul 4, 2022

Let's put #167, which requires some more work, on hold and try to merge this one first instead.

Image classification is a much simpler task, and the pipeline for evaluation works out of the box.

I did some refactoring to avoid copying code, following a suggestion from @ola13.

@HuggingFaceDocBuilderDev commented Jul 4, 2022

The documentation is not available anymore as the PR was closed or merged.

@ola13 (Contributor) left a comment

Thanks a lot for looking into it @fxmarty, some minor comments inline :)

src/evaluate/evaluator/__init__.py (outdated; resolved)
src/evaluate/evaluator/__init__.py (resolved)
metric = self.prepare_metric(metric)

references = data[label_column]
predictions = self._compute_predictions(pipe, data[input_column], label_mapping=label_mapping)
@ola13 (Contributor) commented Jul 5, 2022

Given the current refactoring, it might be more readable to do _compute_predictions inline (same for image_classification; splitting it into a separate function may have been a suboptimal idea from the get-go).
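(Illustrative sketch, not code from this PR: one way such inlining could look, reusing the names from the snippet above and assuming the pipeline returns one dict with a "label" key per example.)

```python
# Hypothetical inlined prediction step; assumes each pipeline output is a dict
# with a "label" key and that label_mapping maps pipeline labels to dataset ids.
pipe_output = pipe(data[input_column])
predictions = [
    label_mapping[element["label"]] if label_mapping is not None else element["label"]
    for element in pipe_output
]
```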

@fxmarty (Contributor, Author) commented Jul 5, 2022

That's a good point; let me know if the changes I made in fbf66a8 are fine.

@ola13 (Contributor) left a comment

Thanks @fxmarty, looks great! Accepting the PR; however, it looks like some tests are failing (problems with imports?). Let's make sure these are resolved before merging.

@fxmarty force-pushed the add-image-classification-evaluator branch from 722a7ec to c88815a on July 5, 2022, 10:06
@fxmarty (Contributor, Author) commented Jul 5, 2022

@ola13 Should be fine now, we needed a rebase :)

@ola13 merged commit 27ea232 into huggingface:main on Jul 5, 2022
@lvwerra (Member) left a comment

Looks good, just one question: why do we need a bigger machine for the tests? Thanks for adding this ❤️

@@ -8,7 +8,7 @@ jobs:
    working_directory: ~/evaluate
    docker:
      - image: cimg/python:3.7
-   resource_class: medium
+   resource_class: large
Member commented:

Why is the large class needed? Because of the evaluator/trainer test?

@fxmarty (Contributor, Author) commented Jul 6, 2022

Yes, it is because of the evaluator/trainer test. I don't understand why; running the tests locally, I was using at most 800 MB of RAM for the evaluator/trainer test.

With medium: https://app.circleci.com/pipelines/github/huggingface/evaluate/468/workflows/5b57ecc8-abd1-4fdb-a1fb-db655510fc60/jobs/1409/resources
With large: https://app.circleci.com/pipelines/github/huggingface/evaluate/479/workflows/7f837ed5-bb94-4c60-a26d-9ba0bfae3c93/jobs/1442/resources

The models in the parity tests are 45 MB and 18 MB. The "beans" dataset looks to be <200 MB ( https://huggingface.co/datasets/beans/tree/main/data ). sst2 should be < 10 MB. So I am not sure what the issue is.

The test can be run locally to check memory usage: `pytest tests/test_trainer_evaluator_parity.py`
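(Side note, not part of the PR: one way to capture the peak memory of that local run using only the standard library on a Unix-like system; ru_maxrss units differ by platform.)

```python
# Rough sketch: run the parity test in a child process and report its peak RSS.
import resource
import subprocess

subprocess.run(["pytest", "tests/test_trainer_evaluator_parity.py"], check=True)
peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
print(f"peak RSS of child processes: {peak}")  # kilobytes on Linux, bytes on macOS
```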

@@ -37,7 +37,7 @@ def setUp(self):
         )

     def tearDown(self):
-        shutil.rmtree(self.dir_path, onerror=onerror)
+        shutil.rmtree(self.dir_path, ignore_errors=True)
Member commented:

Does that mean that on Windows the folder will just not be removed in that case?

Contributor (Author) commented:

On Windows, read-only files (typically the .git contents) will not be removed. I remember I still had issues with the onerror handler from https://stackoverflow.com/a/2656405 . Would you prefer to roll back to an error handler that makes sure read-only files are deleted? I agree it is cleaner in case anybody runs the tests on Windows.
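(For reference, a minimal sketch of the error-handler approach from the linked StackOverflow answer: clear the read-only flag and retry, so .git contents also get removed on Windows.)

```python
import os
import stat
import shutil


def onerror(func, path, exc_info):
    """rmtree error handler: if removal failed because the file is read-only
    (typical for .git contents on Windows), make it writable and retry."""
    if not os.access(path, os.W_OK):
        os.chmod(path, stat.S_IWRITE)
        func(path)
    else:
        raise


# usage, as in the line the diff above replaced:
# shutil.rmtree(self.dir_path, onerror=onerror)
```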



try:
    from transformers import FeatureExtractionMixin, Pipeline, PreTrainedModel, TFPreTrainedModel
@lvwerra (Member) commented Jul 7, 2022

I think FeatureExtractionMixin was only added in transformers==4.17.0. I think we need to update setup.py to install the right version (the tests were failing for me locally even when installing the evaluator extra).

huggingface/transformers@b5c6fde

Could you confirm @fxmarty?
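(Two hedged sketches, not taken from the PR, of how this could be addressed: keep the import guarded so evaluate still imports without a recent transformers, and pin the minimum version in setup.py. The flag name and the extra name below are assumptions.)

```python
# Sketch of a guarded import: if transformers is missing, or too old to expose
# FeatureExtractionMixin (added in 4.17.0), fall back gracefully instead of
# crashing at import time.
try:
    from transformers import FeatureExtractionMixin, Pipeline, PreTrainedModel, TFPreTrainedModel

    TRANSFORMERS_AVAILABLE = True
except ImportError:
    TRANSFORMERS_AVAILABLE = False

# Hypothetical setup.py pin for the evaluator extra:
# extras["evaluator"] = ["transformers>=4.17.0"]
```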
