Fix graphgym
activation function dictionary: store constructors instead of instances
#4978
Conversation
This reverts commit f48bc2c.
Interestingly, now the test
Thanks @fjulian, let's fix/adjust the test.
My pleasure. Just fixed them.
Codecov Report
@@ Coverage Diff @@
## master #4978 +/- ##
==========================================
- Coverage 82.74% 82.73% -0.02%
==========================================
Files 330 330
Lines 17924 17938 +14
==========================================
+ Hits 14832 14841 +9
- Misses 3092 3097 +5
Continue to review full report at Codecov.
LGTM.
Until now, the act_dict used by the graphgym part of this library contained instances of activation functions, which were used directly to build models. This is fine for activation functions that have no parameters. However, PReLU has a learnable parameter. As a result, every PReLU built through the graphgym infrastructure shared a single parameter, e.g. when PReLUs are used in multiple layers, or when PReLUs are used in several models that are trained alternately. The shared parameter caused unexpected changes to a network that was thought to be frozen while another network was trained. The solution proposed in this PR is to store activation function constructors in act_dict instead of instances, and to create a new instance every time an activation function is used in a model.
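The bug and the fix can be sketched as follows. This is a minimal, dependency-free illustration, not the library's actual code: `PReLUStub` is a hypothetical stand-in for `torch.nn.PReLU`, and the dictionary names mirror the registry pattern described above.

```python
class PReLUStub:
    """Stand-in for torch.nn.PReLU: holds one learnable slope parameter."""
    def __init__(self, init=0.25):
        self.weight = init

# Buggy registry: maps the name to a single shared instance.
act_dict_buggy = {"prelu": PReLUStub()}
layer_a = act_dict_buggy["prelu"]
layer_b = act_dict_buggy["prelu"]
layer_a.weight = 0.9          # "training" updates layer_a's parameter...
assert layer_b.weight == 0.9  # ...and silently changes layer_b as well

# Fixed registry: maps the name to the constructor; instantiate per use.
act_dict = {"prelu": PReLUStub}
layer_a = act_dict["prelu"]()
layer_b = act_dict["prelu"]()
layer_a.weight = 0.9
assert layer_b.weight == 0.25  # layer_b keeps its own independent parameter
```

With the fixed registry, each model (or layer) gets its own parameter, so freezing one network can no longer be undone by training another.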