
Fix graphgym activation function dictionary so it no longer contains instances #4978

Merged
merged 6 commits into pyg-team:master from fjulian:fix_act_dict on Jul 13, 2022

Conversation

@fjulian (Contributor) commented on Jul 13, 2022

Until now, the act_dict used by the graphgym part of this library contained instances of activation functions, and these instances were used to build models. This is fine for activation functions without parameters, but PReLU has a learnable parameter. As a result, every PReLU built via the graphgym infrastructure shared a single parameter, e.g. when PReLUs appear in multiple layers of one model, or in several models that are trained alternately. This shared parameter caused unexpected changes to a network that was assumed to be frozen while another network was being trained.
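For illustration, here is a minimal plain-PyTorch sketch of the failure mode (not code from this PR): reusing one PReLU instance means every place it appears holds the very same parameter object.

```python
import torch.nn as nn

act = nn.PReLU()  # a single instance with one learnable parameter

# Two "independent" models that both reuse the same activation instance:
model_a = nn.Sequential(nn.Linear(4, 4), act)
model_b = nn.Sequential(nn.Linear(4, 4), act)

# The PReLU weight is literally the same tensor in both models, so training
# model_a silently updates the activation that model_b uses as well.
assert model_a[1].weight is model_b[1].weight
```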

The solution proposed in this PR is to store activation function constructors in act_dict instead of instances, and to create a fresh instance every time an activation function is used in a model.
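A minimal sketch of the proposed pattern (the helper new_act below is hypothetical, for illustration only): the dictionary maps names to classes, and each lookup site instantiates a new module.

```python
import torch.nn as nn

# Store constructors (classes), not instances:
act_dict = {
    'relu': nn.ReLU,
    'prelu': nn.PReLU,
}

def new_act(name: str) -> nn.Module:
    # Each call returns a fresh module, so every PReLU owns its own parameter.
    return act_dict[name]()

a1, a2 = new_act('prelu'), new_act('prelu')
assert a1.weight is not a2.weight  # separate parameters now
```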

@fjulian fjulian marked this pull request as draft July 13, 2022 15:03
@fjulian (Contributor, Author) commented on Jul 13, 2022

Interestingly, the test test_run_single_graphgym in test/graphgym/test_graphgym.py now fails because the number of parameters of a specified model (23883) no longer matches the number expected by the test (23880). I assume this is because the model contains 4 PReLUs, which now have separate parameters with my proposed changes; going from 1 shared parameter to 4 separate ones accounts exactly for the difference of 3.
So either it was intended that the model's PReLUs share their weights, in which case we would need a more flexible solution than what I'm proposing here, or it was not intended and my changes fix the bug. In the latter case, the test can be fixed by updating the magic number mentioned above. In either case, I wonder how this number was obtained in the first place.
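A quick illustrative check of that arithmetic (not the test itself): Module.parameters() deduplicates shared parameters, so four PReLUs sharing one instance contribute 1 parameter, while four independent PReLUs contribute 4.

```python
import torch.nn as nn

shared = nn.PReLU()
shared_count = sum(p.numel() for p in nn.ModuleList([shared] * 4).parameters())
separate_count = sum(p.numel() for p in nn.ModuleList(
    [nn.PReLU() for _ in range(4)]).parameters())
print(shared_count, separate_count)  # 1 vs. 4 -> the difference of 3
```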

@fjulian fjulian marked this pull request as ready for review July 13, 2022 15:54
@rusty1s (Member) commented on Jul 13, 2022

Thanks @fjulian, let's fix/adjust the test.

@fjulian (Contributor, Author) commented on Jul 13, 2022

My pleasure. Just fixed them.

@codecov codecov bot commented on Jul 13, 2022

Codecov Report

Merging #4978 (022edec) into master (9b129b8) will decrease coverage by 0.01%.
The diff coverage is 77.27%.

@@            Coverage Diff             @@
##           master    #4978      +/-   ##
==========================================
- Coverage   82.74%   82.73%   -0.02%     
==========================================
  Files         330      330              
  Lines       17924    17938      +14     
==========================================
+ Hits        14832    14841       +9     
- Misses       3092     3097       +5     
Impacted Files                                 Coverage            Δ
torch_geometric/graphgym/models/act.py        80.00% <76.19%>     -20.00% ⬇️
torch_geometric/graphgym/models/layer.py      73.11% <100.00%>    (ø)


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@rusty1s (Member) left a comment


LGTM.

@rusty1s rusty1s merged commit 9eaa8c0 into pyg-team:master Jul 13, 2022
@fjulian fjulian deleted the fix_act_dict branch August 15, 2022 12:17