Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added the conversion of scipy sparse to torch sparse. Wait until pyg … #276

Merged
merged 17 commits into from
Apr 4, 2023

Conversation

DomInvivo
Copy link
Collaborator

@DomInvivo DomInvivo commented Apr 1, 2023

Wait for Pyg >= 2.4.0 being released released.

Checklist:

  • Simplified the logic of the featurizer. Instead of matching the ordering between the adjacency matrix and the bonds, the adjacency is not generated directly from the bond orders.
  • Improved unit-tests to verify that the ordering of the node, edge features and bond ordering is correct.
  • Don't wait for the PR on torch_geometric since torch-sparse is too slow anyways, with limited memory benefits.
  • Added a news entry: copy news/TEMPLATE.rst to news/my-feature-or-branch.rst) and edit it.

@DomInvivo DomInvivo marked this pull request as ready for review April 3, 2023 19:33
@DomInvivo
Copy link
Collaborator Author

DomInvivo commented Apr 3, 2023

@callumm-graphcore @joao-alex-cunha this PR changes the featurizer to return torch.Tensor instead of scipy.sparse.coo_matrix. I tried using torch.sparse_coo_tensor, but it made the featurization 5x slower on 32 CPUs.

This might help the dataloader, since it won't have to convert any numpy -> torch, nor any sparse -> dense.
Let me know if you see any speed/memory difference.

@DomInvivo
Copy link
Collaborator Author

The new version uses torch.Tensor instead of coo_matrix to store the features. The dataloading is ~10% faster since there's less conversion to do. I also tried torch.sparse_coo_tensor, but it was extremely slow.

In the image below, we see the difference between Tensor, coo_matrix, and Tensors while batching everything in a single graph with from_data_list.
image

@DomInvivo DomInvivo merged commit 6654778 into master Apr 4, 2023
@hadim hadim deleted the torch_sparse branch June 6, 2023 18:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant