Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sampling according to max_sample within AddMetaPaths #4750

Merged
merged 19 commits into from
Jun 3, 2022
Merged

Conversation

RexYing
Copy link
Contributor

@RexYing RexYing commented Jun 1, 2022

This functionality is used in cases where certain metapath edge types result in very dense adjacency.
If max_sample is set, in expectation this number of samples will be sampled per node for each edge index.

@RexYing RexYing requested review from rusty1s and zechengz June 1, 2022 23:22
@codecov
Copy link

codecov bot commented Jun 1, 2022

Codecov Report

Merging #4750 (b6546fd) into master (09f25e9) will decrease coverage by 1.89%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #4750      +/-   ##
==========================================
- Coverage   84.37%   82.48%   -1.90%     
==========================================
  Files         324      324              
  Lines       17304    17313       +9     
==========================================
- Hits        14601    14281     -320     
- Misses       2703     3032     +329     
Impacted Files Coverage Δ
torch_geometric/transforms/add_metapaths.py 100.00% <100.00%> (ø)
torch_geometric/nn/models/dimenet_utils.py 0.00% <0.00%> (-75.52%) ⬇️
torch_geometric/nn/models/dimenet.py 14.51% <0.00%> (-53.00%) ⬇️
torch_geometric/nn/conv/utils/typing.py 81.25% <0.00%> (-17.50%) ⬇️
torch_geometric/nn/resolver.py 86.36% <0.00%> (-9.10%) ⬇️
torch_geometric/nn/inits.py 67.85% <0.00%> (-7.15%) ⬇️
torch_geometric/io/tu.py 93.90% <0.00%> (-2.44%) ⬇️
torch_geometric/nn/models/mlp.py 98.52% <0.00%> (-1.48%) ⬇️
torch_geometric/transforms/gdc.py 78.17% <0.00%> (-1.02%) ⬇️
torch_geometric/data/dataset.py 96.80% <0.00%> (-0.80%) ⬇️
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 09f25e9...b6546fd. Read the comment docs.

@rusty1s rusty1s changed the title Sampling according to max_sample for adding metapaths Sampling according to max_sample within AddMetaPaths Jun 2, 2022
Copy link
Member

@rusty1s rusty1s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @RexYing. This mostly looks good. I think edge_index_sampling should leverage SparseTensor to make this a little bit faster and to ensure that we are always operating on sorted inputs.

torch_geometric/transforms/add_metapaths.py Outdated Show resolved Hide resolved
torch_geometric/transforms/add_metapaths.py Outdated Show resolved Hide resolved
torch_geometric/transforms/add_metapaths.py Outdated Show resolved Hide resolved
torch_geometric/transforms/add_metapaths.py Outdated Show resolved Hide resolved
torch_geometric/transforms/add_metapaths.py Outdated Show resolved Hide resolved
torch_geometric/transforms/add_metapaths.py Outdated Show resolved Hide resolved
@rusty1s
Copy link
Member

rusty1s commented Jun 3, 2022

I think I fixed the runtime issues. One reason for this was that it is quite expensive to sample in both directions. I don't think this is really needed, and I moved to a solution where we only sample adj1 after performing the matrix multiplication. Merging this now but happy to extend/improve/make faster in later PRs.

@rusty1s rusty1s merged commit 8bd9ae4 into master Jun 3, 2022
@rusty1s rusty1s deleted the metapath_sampling branch June 3, 2022 09:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants