Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No longer needlessly deepcopy the original model state #201

Merged
merged 2 commits into from
Dec 12, 2022

Conversation

tomaarsen
Copy link
Member

Resolves #199

Hello!

Pull request overview

Details

As mentioned by @Yongtae723 in #199, the model state is copied and then never used again, with exception of scripts/setfit/run_fewshot_multilabel.py. This is an unnecessary waste of memory, and hence we believe that these lines should be removed.

Furthermore, I modified scripts/setfit/run_fewshot_multilabel.py to perform the deepcopying in that script itself. Note that this is somewhat different than the original situation, as now these two lines are executed before the deepcopying is performed, rather than the deepcopy occurring first:

if add_normalization_layer:
self.model._modules["2"] = models.Normalize()

I suspect that this won't matter too much, but I'll leave that up to your judgement.
The script still works like before, e.g. using:

python scripts/setfit/run_fewshot_multilabel.py --sample_sizes 64 --datasets=go_emotions
  • Tom Aarsen

@Yongtae723
Copy link
Contributor

@tomaarsen
Hi Tom!

thank you for making excellent modifications and PR!

@tomaarsen
Copy link
Member Author

And thanks to you for spotting this issue!

Copy link
Contributor

@blakechi blakechi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Nice work!

Copy link
Member

@lewtun lewtun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix @tomaarsen - this looks great 🔥 !

I agree this is a "bug" we should have removed before the release, but great to finally solve it now :)

@lewtun lewtun merged commit 46fcd9f into huggingface:main Dec 12, 2022
@tomaarsen tomaarsen deleted the perf/no_copy branch December 12, 2022 20:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Question : why is model_original_state necessary?
4 participants