Support Use of Joint and GDumb with Pre-Trained Models #362

Merged
wistuba merged 6 commits into dev from mw-joint-no-reset
Aug 4, 2023

Conversation

Contributor

@wistuba wistuba commented Aug 4, 2023

Joint and GDumb reset the model in on_model_update_start. This is undesired behavior when working with pre-trained models.
There is also no option to skip the reset entirely.

This change introduces a new flag, reset, which controls whether the model is reset. Furthermore, when a reset does happen, the pre-trained model is reloaded instead of an untrained model being used.
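For readers skimming the thread, the intended behavior of the flag can be sketched roughly as follows. This is a minimal, framework-free illustration and not the code from this PR: TinyModel, JointLikeLearner, and the snapshot attribute are hypothetical names, and the weights are a plain dict standing in for a real network's state.

```python
import copy


class TinyModel:
    """Hypothetical stand-in for a network; weights live in a plain dict."""

    def __init__(self, weights):
        self.weights = dict(weights)

    def state_dict(self):
        return copy.deepcopy(self.weights)

    def load_state_dict(self, state):
        self.weights = copy.deepcopy(state)


class JointLikeLearner:
    """Hypothetical learner illustrating a reset flag (not Renate's actual code)."""

    def __init__(self, model, reset=True):
        self.model = model
        self.reset = reset
        # Snapshot the weights the model was created with (e.g. pre-trained ones).
        self._initial_state = model.state_dict()

    def on_model_update_start(self):
        if self.reset:
            # Restore the snapshot instead of re-initializing from scratch.
            self.model.load_state_dict(self._initial_state)


pretrained = {"w": 0.5}

resetting = JointLikeLearner(TinyModel(pretrained), reset=True)
resetting.model.weights["w"] = 9.0  # pretend an earlier update changed the weights
resetting.on_model_update_start()   # snapshot is restored

keeping = JointLikeLearner(TinyModel(pretrained), reset=False)
keeping.model.weights["w"] = 9.0
keeping.on_model_update_start()     # weights are left untouched
```

With reset=True the model returns to its pre-trained weights at the start of each update; with reset=False training continues from wherever the previous update left off.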

The integration tests are expected to break for Joint and GDumb because the model is now initialized differently. Before, the workflow created the model and then reset it. Now, it only creates the model and does not reset it.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Contributor

@prabhuteja12 prabhuteja12 left a comment

Documenting an alternative to this approach:

Resetting is specific to both the method and the network. An alternative is therefore to define a reset_parameters method on RenateBenchmarkingModule and override it only for specific modules (ViT, text transformers, etc.). The current on_model_training_start would then invoke this reset_parameters to reset the model.
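A rough sketch of what this alternative might look like, assuming a base class with a default reset_parameters that specific modules override. All class names, the hook function, and the dict-based weights below are hypothetical placeholders, not Renate's API:

```python
class BenchmarkingModuleSketch:
    """Hypothetical base class standing in for RenateBenchmarkingModule."""

    def __init__(self):
        self.weights = {"head": 0.0}

    def reset_parameters(self):
        # Default behavior: re-initialize every weight.
        self.weights = {name: 0.0 for name in self.weights}


class PretrainedBackboneSketch(BenchmarkingModuleSketch):
    """Hypothetical module with a pre-trained backbone (think ViT)."""

    def __init__(self, pretrained_backbone):
        super().__init__()
        self._pretrained = pretrained_backbone
        self.weights["backbone"] = pretrained_backbone

    def reset_parameters(self):
        # Override: reload the pre-trained backbone, re-initialize only the head.
        self.weights = {"backbone": self._pretrained, "head": 0.0}


def start_update_hook(model):
    """Hypothetical learner hook; it delegates the reset to the module itself."""
    model.reset_parameters()


vit_like = PretrainedBackboneSketch(pretrained_backbone=1.5)
vit_like.weights.update({"backbone": 7.0, "head": 3.0})  # simulate training
start_update_hook(vit_like)

plain = BenchmarkingModuleSketch()
plain.weights["head"] = 3.0
start_update_hook(plain)
```

The design trade-off is that the learner no longer needs to know how a given network should be reset; each module encodes that itself, at the cost of one override per special case.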


github-actions bot commented Aug 4, 2023

Coverage report

The coverage rate went from 85.68% to 84.95% ⬇️

0% of new lines are covered.

Diff coverage details

src/renate/cli/parsing_functions.py

0% of new lines are covered (79.2% of the complete file).
Missing lines: 457, 462

@wistuba wistuba merged commit c0612c0 into dev Aug 4, 2023
@wistuba wistuba deleted the mw-joint-no-reset branch August 4, 2023 17:19