Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Heterogeneous renumbering implementation #4602

Merged

Conversation

seunghwak
Copy link
Contributor

@seunghwak seunghwak commented Aug 8, 2024

This PR implements heterogeneous renumbering for GNN.

In addition,

  • Update the existing (homogeneous) sampling post processing function test file extension from .cu to .cpp.
  • Remove the unused renumber_sampled_edgelist function (breaking because this function is removed)
  • Add a stride_fill utility function (thrust wrapper)
  • Add test utility functions to generate edge types & IDs.

Closes #4412

@seunghwak seunghwak requested review from a team as code owners August 8, 2024 00:38
@seunghwak seunghwak added feature request New feature or request non-breaking Non-breaking change and removed cuGraph CMake labels Aug 8, 2024
@seunghwak seunghwak added this to the 24.10 milestone Aug 8, 2024
@github-actions github-actions bot added the CMake label Aug 8, 2024
@seunghwak seunghwak added breaking Breaking change and removed non-breaking Non-breaking change labels Aug 8, 2024
Copy link
Collaborator

@ChuckHastings ChuckHastings left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor things. Looks good.

@@ -576,7 +582,8 @@ if(BUILD_CUGRAPH_MG_TESTS)
###############################################################################################
# - MG BETWEENNESS CENTRALITY tests -----------------------------------------------------------
ConfigureTestMG(MG_BETWEENNESS_CENTRALITY_TEST centrality/mg_betweenness_centrality_test.cpp)
ConfigureTestMG(MG_EDGE_BETWEENNESS_CENTRALITY_TEST centrality/mg_edge_betweenness_centrality_test.cpp)
ConfigureTestMG(MG_EDGE_BETWEENNESS_CENTRALITY_TEST
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an odd edit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This exceeded the line length limit :-) Should we allow long lines in CMakeLists.txt?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize we had one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we don't have one officially, but I implicitly consider the end of ############################################################################################### as the line limit.

@@ -0,0 +1,828 @@
/*
* Copyright (c) 2022-2024, NVIDIA CORPORATION.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just 2024?

@@ -510,6 +514,11 @@ renumber_and_sort_sampled_edgelist(
* true.
* 2. Edges in each label are sorted independently if @p edgelist_label_offsets.has_value() is true.
*
* This function assumes that there is a single edge source vertex type and a single edge
* destination vertex type for each edge. If @p edgelist_edge_types.has_value() is false (i.e. there
* is only one edge type), there shouldb be only one edge source vertex type and only one edge
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo here

Suggested change
* is only one edge type), there shouldb be only one edge source vertex type and only one edge
* is only one edge type), there should be only one edge source vertex type and only one edge

Copy link
Member

@alexbarghi-nv alexbarghi-nv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I assume you're delegating updating the C API sampling function to properly handle heterogeneous renumbering to me.

"Invalid input arguments: num_labels should be 0 if the input edge list is empty.");
CUGRAPH_EXPECTS(
"num_hops == 0",
num_hops == 0,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixed a bug in the C++ layer (the old check always passed because of the quotes), but adding this check causes a failure in a python test.

Is there a reason to fail with an exception if num_hops is 0? If the input edge list is empty, couldn't we just return an empty result? I'm unfamiliar with the logic.

This is failing in CI, we should either modify the python code or remove this check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah... I guess you're right. We can check only for nun_labels == 0 for an empty edge list and allow num_hops & num_vertex_types to have a non zero value.

@ChuckHastings
Copy link
Collaborator

Looks good. I assume you're delegating updating the C API sampling function to properly handle heterogeneous renumbering to me.

Do you have time to do that @alexbarghi-nv ? We can do that as a separate PR if you are busy with other things. Would you want to always use this new heterogeneous renumbering logic, add an option for it, or detect it in some other way?

@alexbarghi-nv
Copy link
Member

Looks good. I assume you're delegating updating the C API sampling function to properly handle heterogeneous renumbering to me.

Do you have time to do that @alexbarghi-nv ? We can do that as a separate PR if you are busy with other things. Would you want to always use this new heterogeneous renumbering logic, add an option for it, or detect it in some other way?

We only want to use it for heterogeneous graphs. I think we can just add an optional parameter for the edge type mappings, and if they are present, then the C API sampling function will call the heterogeneous renumbering with the mappings instead of the homogeneous renumbering.

@alexbarghi-nv
Copy link
Member

@seunghwak it looks like your PR changed the way we're handling empty start lists which is causing CI to fail:

FAILED tests/sampling/test_uniform_neighbor_sample.py::test_uniform_neighbor_sample_simple[graph_file:email_Eu_core-directed:True-with_replacement:False-indices_type:int32] - RuntimeError: non-success value returned from cugraph_uniform_neighbor_sample: CUGRAPH_UNKNOWN_ERROR cuGraph failure at file=/opt/conda/conda-bld/work/cpp/src/sampling/sampling_post_processing_impl.cuh line=235: Invalid input arguments: num_labels should be 0 if the input edge list is empty.

@seunghwak
Copy link
Contributor Author

@seunghwak it looks like your PR changed the way we're handling empty start lists which is causing CI to fail:

FAILED tests/sampling/test_uniform_neighbor_sample.py::test_uniform_neighbor_sample_simple[graph_file:email_Eu_core-directed:True-with_replacement:False-indices_type:int32] - RuntimeError: non-success value returned from cugraph_uniform_neighbor_sample: CUGRAPH_UNKNOWN_ERROR cuGraph failure at file=/opt/conda/conda-bld/work/cpp/src/sampling/sampling_post_processing_impl.cuh line=235: Invalid input arguments: num_labels should be 0 if the input edge list is empty.

@ChuckHastings I assume this is also due to the fix in "num_hops == 0," right?

@ChuckHastings
Copy link
Collaborator

/merge

@rapids-bot rapids-bot bot merged commit 338e5e0 into rapidsai:branch-24.10 Sep 3, 2024
104 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Breaking change CMake cuGraph feature request New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support Heterogeneous Renumbering in the C++ API
3 participants