Implementing graphGPS as a global attention wrapper around Hydra MPNNs #315
Conversation
# Test only models
@pytest.mark.parametrize(
    "global_attn_engine",
    ["GPS"],
@ArCho48
We also need to try "None" as the global attention mechanism.
You are right. :) My apologies
Cool. No probs :)
@ArCho48
The tests need to check the MPNNs both with and without global attention mechanisms.
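For illustration, a minimal sketch of what that parametrization could look like (the test name, MPNN list, and `run_test` helper are placeholders, not the repo's actual API):

```python
import pytest

# Cover both the GPS global attention engine and the plain MPNN path (None).
@pytest.mark.parametrize("global_attn_engine", ["GPS", None])
@pytest.mark.parametrize("mpnn_type", ["SAGE", "GIN", "GAT", "CGCNN", "SchNet"])
def test_global_attention_models(global_attn_engine, mpnn_type):
    # run_test is a stand-in for whatever helper builds the config,
    # trains briefly, and checks the loss/accuracy thresholds.
    run_test(mpnn_type=mpnn_type, global_attn_engine=global_attn_engine)
```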
@@ -101,11 +102,19 @@ def __init__(self, dirpath, config, dist=False, sampling=None):

        rx = list(nsplit((dirfiles), self.world_size))[self.rank]

        # LPE
        self.transform = AddLaplacianEigenvectorPE(
The Laplacian matrix used to calculate the eigenvectors is D - A, right? Where D is the degree matrix and A is the adjacency matrix? So, although this is a positional encoding, it is based solely on connectivity, correct?
Indeed, LapPE is a purely topological encoding.
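To make the point concrete, here is a small standalone sketch (toy graph, not HydraGNN code) showing that the encoding comes from L = D - A, so only connectivity matters; PyG's transform computes a normalized variant internally:

```python
import torch
from torch_geometric.data import Data
from torch_geometric.transforms import AddLaplacianEigenvectorPE
from torch_geometric.utils import to_dense_adj

# Toy undirected triangle graph: no node positions or features at all.
edge_index = torch.tensor([[0, 1, 1, 2, 2, 0],
                           [1, 0, 2, 1, 0, 2]])
data = Data(edge_index=edge_index, num_nodes=3)

A = to_dense_adj(edge_index)[0]           # adjacency matrix
D = torch.diag(A.sum(dim=1))              # degree matrix
L = D - A                                 # combinatorial graph Laplacian
eigvals, eigvecs = torch.linalg.eigh(L)   # eigenvectors usable as positional encodings

# PyG's transform produces the same kind of connectivity-only encoding.
data = AddLaplacianEigenvectorPE(k=2, attr_name="pe")(data)
print(data.pe.shape)  # torch.Size([3, 2])
```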
Btw, @RylieWeaver we couldn't get GPS to run on LennardJones, as PyTorch throws an error when backpropagating manually computed gradients (which I think is done for LJ) through the attention layers.
> Btw, @RylieWeaver we couldn't get GPS to run on LennardJones, as PyTorch throws an error when backpropagating manually computed gradients (which I think is done for LJ) through the attention layers.
Yes, LJ uses compute_grad_energy to compute forces as the gradient of energy with respect to position. It is also a general functionality that could be used for other force/energy prediction examples.
As a first guess, I believe the error comes from a lack of gradient flow to the position from the Laplacian eigenvectors.
If this is the case, a potentially easy fix might be to do the Laplacian computation with torch.no_grad(), like this:
with torch.no_grad():
    data = self.transform(data)
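For context, a hedged standalone sketch of the force-from-energy pattern under discussion (this is not HydraGNN's actual compute_grad_energy; the model and transform arguments are placeholders). It shows where the suggested no_grad wrapper would sit, so the positional-encoding computation stays out of the autograd graph while forces are still taken as the gradient of energy with respect to position:

```python
import torch

def energy_and_forces(model, data, transform):
    # Positions must track gradients so that forces can be taken as -dE/dpos.
    data.pos.requires_grad_(True)
    # Suggested fix: keep the Laplacian-PE computation out of the autograd graph.
    with torch.no_grad():
        data = transform(data)
    energy = model(data)  # scalar (or per-graph) energy prediction
    forces = -torch.autograd.grad(
        energy.sum(), data.pos, create_graph=True
    )[0]
    return energy, forces
```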
Well, that could be a problem too, but the error is mainly due to the incompatibility of gradient flow with the torch MultiheadAttention layer. I will send you the exact error shortly.
Hmm, yes, please let me know what comes up.
@ArCho48 Is this error still coming up? Have you had the opportunity to compute the Laplacian with torch.no_grad()?
Hi Rylie, I think we decided to skip integrating LJ with GPS in this PR. I can work on this (and add your fix if it works) in the next PR on GPS. @allaffa, do you have any suggestions?
        # encode node positional embeddings
        x = self.pos_emb(data.pe)
        # if node features are available, generate embeddings, concatenate with positional embeddings, and map to hidden dim
        if self.input_dim:
Do we expect input_dim to ever not be defined? Is this a new type of functionality in Hydra where there are solely graph features and no node features?
Graph features and/or even edge features but no node features can be a possibility.
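A rough standalone sketch of that branching (the layer names here are illustrative, not HydraGNN's exact attributes): the positional encoding is always embedded, while node features are folded in only when input_dim is nonzero:

```python
import torch
import torch.nn as nn

class NodeEncoder(nn.Module):
    def __init__(self, input_dim, pe_dim, hidden_dim):
        super().__init__()
        self.input_dim = input_dim
        self.pos_emb = nn.Linear(pe_dim, hidden_dim)
        if input_dim:
            self.node_emb = nn.Linear(input_dim, hidden_dim)
            self.lin = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, x, pe):
        h = self.pos_emb(pe)        # always available: topological positional encoding
        if self.input_dim:          # node features may be absent entirely
            h = self.lin(torch.cat([self.node_emb(x), h], dim=-1))
        return h
```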
hydragnn/models/CGCNNStack.py
@@ -53,10 +58,12 @@ def __init__(
        assert self.input_args == "inv_node_feat, equiv_node_feat, edge_index"
        assert self.conv_args == "inv_node_feat, edge_index"

-    def get_conv(self, input_dim, _):
+    def get_conv(self, input_dim, _, edge_dim=None):
+        if not edge_dim:
I am not sure, but it may be better to handle this inside __init__() instead, by adding a line in the if self.global_attn_engine: section that sets self.edge_dim with:
self.another_dim = self.embed_dim = self.edge_embed_dim = hidden_dim  # ensure that all inputs to GPS have the same dimensionality
or something like it. This would get rid of the if/else in every stack's get_conv function.
I cannot change self.edge_dim between lines 134-152 of base because I need the original edge_dim in line 167. But I think I can get rid of the if/else in get_conv, since edge_dim is always passed to get_conv through the edge_embed_dim variable. Thanks for the suggestion.
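A hedged sketch of what the simplified get_conv could then look like for CGCNN once edge_embed_dim is always passed in (the exact wrapping and return structure in CGCNNStack.py may differ):

```python
from torch_geometric.nn import CGConv

def get_conv(self, input_dim, _, edge_dim):
    # edge_dim is always supplied by the caller (edge_embed_dim when the global
    # attention engine is active, the raw edge feature dimension otherwise),
    # so no if/else fallback is needed inside each stack.
    return CGConv(channels=input_dim, dim=edge_dim, batch_norm=True)
```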
Hmm, good point
Fix SchNet's use of edge_attr
Implementing graphGPS as a global attention wrapper around Hydra MPNNs (ORNL#315)
* gps full
* black formatted
* updated tests
* black formatted
* Fix SchNet
* additional tests added and pytest input fixed
* deepspeed test added
* removed edge_dim check
* changed copyright year
* SchNet threshold increased for conv_head

Co-authored-by: Rylie Weaver <rylieweaver9@gmail.com>
Co-authored-by: allaffa <massimiliano.lupo.pasini@gmail.com>
This PR incorporates graphGPS within the HydraGNN framework.
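For reference, a minimal sketch of the GPS pattern this PR wraps around the Hydra MPNNs, i.e. a local message-passing convolution interleaved with global multi-head attention. It uses PyG's GPSConv purely for illustration and is not the exact HydraGNN integration:

```python
import torch
from torch.nn import Linear, ReLU, Sequential
from torch_geometric.nn import GINConv, GPSConv

hidden_dim = 64
# Local MPNN that GPS wraps (any message-passing layer works here).
local_mpnn = GINConv(
    Sequential(Linear(hidden_dim, hidden_dim), ReLU(), Linear(hidden_dim, hidden_dim))
)
# GPSConv combines the local convolution with global multi-head attention over all nodes.
gps_layer = GPSConv(channels=hidden_dim, conv=local_mpnn, heads=4)

x = torch.randn(10, hidden_dim)             # 10 nodes with hidden features
edge_index = torch.randint(0, 10, (2, 30))  # random edges, illustration only
out = gps_layer(x, edge_index)
print(out.shape)  # torch.Size([10, 64])
```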