Skip to content
This repository has been archived by the owner on Nov 25, 2024. It is now read-only.

Add gather/scatter support 1D tensor #229

Merged
merged 3 commits into from
Nov 22, 2024

Conversation

chang-l
Copy link
Contributor

@chang-l chang-l commented Oct 17, 2024

This PR is to add gather/scatter support 1D tensor on python level, as WholeGraph should support basic indexing operations for both 1D (array) and 2D (matrix) wholememory tensors. Without this PR, if with 1D wholememory tensor, gather/scatter op does not work, e.g.,

To test, run

pytest --cache-clear  --import-mode=append  tests/wholegraph_torch/ops/test_wholegraph_gather_scatter.py -s

Remaining issue:

On my local test with single GPU, the test can pass.
For multiGPU setup, gather op works fine, but 1D scatter seems not working as it would crash at:

with incorrect scatter outputs: Indices where allclose fails: tensor([0., 0., 0., ..., 0., 0., 0.]) tensor([ 1435., 1439., 1443., ..., 257703., 257707., 257711.])

@linhu-nv Can you please take a look? Does scatter_op suppose to work with 1D wholememory tensor?

Copy link

copy-pr-bot bot commented Oct 17, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@linhu-nv
Copy link
Contributor

Thanks for bring this up. Yes, scatter_op supposes to work with 1D wholememory tensor. I will try to find out why it does't work.

Copy link
Contributor

@linhu-nv linhu-nv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the scatter_test issue has been addressed, this PR looks good to me.

@BradReesWork BradReesWork added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels Nov 21, 2024
@BradReesWork
Copy link
Member

/okay to test

@BradReesWork
Copy link
Member

/okay to test

rapids-bot bot pushed a commit that referenced this pull request Nov 21, 2024
This is to address the issue found by: #229

Authors:
  - Chang Liu (https://github.com/chang-l)

Approvers:
  - https://github.com/linhu-nv
  - Brad Rees (https://github.com/BradReesWork)

URL: #235
@BradReesWork
Copy link
Member

/merge

@rapids-bot rapids-bot bot merged commit 9a2bb57 into rapidsai:branch-24.12 Nov 22, 2024
46 checks passed
rapids-bot bot pushed a commit to rapidsai/cugraph-gnn that referenced this pull request Nov 22, 2024
Migrated from:  rapidsai/wholegraph#229 



This PR is to add gather/scatter support 1D tensor on python level, as WholeGraph should support basic indexing operations for both 1D (array) and 2D (matrix) wholememory tensors.   Without this PR, if with 1D wholememory tensor, gather/scatter op does not work, e.g., https://github.com/rapidsai/wholegraph/blob/0efba33835d6e4e104b5d7101a91e0ea55a6ca53/python/pylibwholegraph/pylibwholegraph/torch/tensor.py#L89



To test, run 
```
pytest --cache-clear  --import-mode=append  tests/wholegraph_torch/ops/test_wholegraph_gather_scatter.py -s
```

**Remaining issue:**

On my local test with single GPU, the test can pass.   
For multiGPU setup, gather op works fine, but 1D scatter seems not working as it would crash at:
https://github.com/rapidsai/wholegraph/blob/2e963b98aa6027c300d60e839010d3dd8ca422eb/python/pylibwholegraph/pylibwholegraph/tests/wholegraph_torch/ops/test_wholegraph_gather_scatter.py#L108 with incorrect scatter outputs: `Indices where allclose fails:  tensor([0., 0., 0.,  ..., 0., 0., 0.]) tensor([  1435.,   1439.,   1443.,  ..., 257703., 257707., 257711.]) `

This would work if this bugfix is merged: #73

cc. @linhu-nv

Authors:
  - Chang Liu (https://github.com/chang-l)

Approvers:
  - https://github.com/linhu-nv
  - Alex Barghi (https://github.com/alexbarghi-nv)

URL: #74
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
improvement Improves an existing functionality non-breaking Introduces a non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants