Updates to make nx_cugraph.Graph a drop-in replacement for nx.Graph, adds attrs for auto-dispatch for generators #4558

Draft
wants to merge 7 commits into
base: branch-24.10

@rlratzel (Contributor) commented Jul 27, 2024

TODO:

  • Unit tests
  • Improve graph update methods (add_node(), etc.)
  • Update remaining graph classes

This updates the nx-cugraph Graph and DiGraph classes to inherit from nx.Graph, and adds cached_property attributes that lazily convert to a NetworkX Graph, cache the result, and expose the corresponding dictionaries. These changes make an nx_cugraph.Graph instance drop-in compatible with networkx functions that nx_cugraph does not yet support.
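
The lazy convert-and-cache idea can be sketched in isolation. This is a toy illustration only, not the nx-cugraph implementation: `EdgeListGraph`, its `conversions` counter, and the cache invalidation in `add_edge` are hypothetical names and choices made for the example.

```python
from functools import cached_property


class EdgeListGraph:
    """Toy stand-in for a backend graph that stores edges compactly."""

    def __init__(self, src, dst):
        self.src = list(src)
        self.dst = list(dst)
        self.conversions = 0  # counts how often the dict view is rebuilt

    @cached_property
    def adj(self):
        # Expensive step: build a dict-of-dicts adjacency view on first access.
        self.conversions += 1
        adj = {}
        for u, v in zip(self.src, self.dst):
            adj.setdefault(u, {})[v] = {}
            adj.setdefault(v, {})
        return adj

    def add_edge(self, u, v):
        self.src.append(u)
        self.dst.append(v)
        # Drop the cached view so the next access converts again.
        self.__dict__.pop("adj", None)


G = EdgeListGraph([0, 1], [1, 2])
G.adj  # first access converts
G.adj  # second access reuses the cached dict
print(G.conversions)
```

The same trade-off is visible in the timings in this description: the first unsupported call pays for the conversion, and later calls reuse it until the graph is mutated.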

Combined with the changes to NetworkX in this PR, which auto-dispatch generators that return compatible backend types and allow compatible backend types to fall back to networkx, users can maximize e2e acceleration for their workflows without code changes.
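
The fallback rule can be pictured with a toy dispatcher. This is not NetworkX's dispatch code: `choose_backend`, `FakeCuGraph`, and the `SUPPORTED` table are hypothetical, and the function names in the table are only the ones this description shows as accelerated or not. The `__networkx_backend__` attribute is how a backend graph identifies its backend to NetworkX.

```python
# Hypothetical table of functions each backend implements (illustration only).
SUPPORTED = {
    "cugraph": {"pagerank", "complete_graph", "relabel_nodes"},
}


class FakeCuGraph:
    """Stand-in for a backend graph object."""

    __networkx_backend__ = "cugraph"


def choose_backend(func_name, graph, default="networkx"):
    """Pick the graph's own backend if it supports the call, else fall back."""
    backend = getattr(graph, "__networkx_backend__", default)
    if func_name in SUPPORTED.get(backend, set()):
        return backend
    return default


G = FakeCuGraph()
print(choose_backend("pagerank", G))
print(choose_backend("pad_graph", G))
```

In this sketch a plain object with no `__networkx_backend__` attribute is always handled by the default backend.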

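import networkx as nx
import pandas as pd

# Timer is a timing context manager that is not shown in this excerpt.
edgelist_csv = "/datasets/cugraph/csv/directed/cit-Patents.csv"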
edgelist_df = pd.read_csv(edgelist_csv, sep=" ", names=["src", "dst"], dtype="int32")

with Timer("from_pandas_edgelist"):
    G = nx.from_pandas_edgelist(
        edgelist_df, source="src", target="dst", create_using=nx.DiGraph)

print(type(G))

with Timer("number of nodes and edges"):
    print(f"{G.number_of_nodes()=}, {G.number_of_edges()=}")

with Timer("pagerank"):
    pr = nx.pagerank(G)

with Timer("coloring"):
    c1 = nx.coloring.greedy_color(G)

with Timer("coloring (again)"):
    c2 = nx.coloring.greedy_color(G)

with Timer("adding a node"):
    G.add_edge(0, (3.14159, "string_in_tuple"))

print(type(G))
print(f"{G.number_of_nodes()=}, {G.number_of_edges()=}")

with Timer("re-running pagerank"):
    pr2 = nx.pagerank(G)

print(f"new vs. orig nodes: {pr2.keys() - pr.keys()}")

with Timer("pad_graph (this mutates the input graph)"):
    cc = nx.coloring.equitable_coloring.pad_graph(G, 11)

print(type(G))
print(f"{G.number_of_nodes()=}, {G.number_of_edges()=}")

with Timer("re-running pagerank"):
    pr3 = nx.pagerank(G)

print(f"new vs. orig nodes: {pr3.keys() - pr.keys()}")

Timer.print_total()
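
The script relies on a `Timer` helper that is not shown above. A minimal sketch that would produce output in the same shape as the runs below, assuming it is a context manager with a class-level running total (this is a guess at its behavior, not the helper from the attached script):

```python
import time
from datetime import timedelta


class Timer:
    """Print a label, run the block, then print the elapsed wall time."""

    total = timedelta(0)  # accumulated across all Timer blocks

    def __init__(self, label):
        self.label = label

    def __enter__(self):
        print(f"\n{self.label}...")
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        elapsed = timedelta(seconds=time.perf_counter() - self._start)
        Timer.total += elapsed
        print(f"Done in: {elapsed}")
        return False  # never swallow exceptions from the timed block

    @classmethod
    def print_total(cls):
        print(f"Total time: {cls.total}")
```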

No backends used:

(nx) root@8546eec3d49d:~# python zcc_demo.py

from_pandas_edgelist...
Done in: 0:00:50.219987
<class 'networkx.classes.digraph.DiGraph'>

number of nodes and edges...
G.number_of_nodes()=3774768, G.number_of_edges()=16518948
Done in: 0:00:01.851362

pagerank...
Done in: 0:01:10.388206

coloring...
Done in: 0:00:13.802888

coloring (again)...
Done in: 0:00:13.793485

adding a node...
Done in: 0:00:00.000018
<class 'networkx.classes.digraph.DiGraph'>
G.number_of_nodes()=3774769, G.number_of_edges()=16518949

re-running pagerank...
Done in: 0:01:03.532062
new vs. orig nodes: {(3.14159, 'string_in_tuple')}

pad_graph (this mutates the input graph)...
Done in: 0:00:00.000764
<class 'networkx.classes.digraph.DiGraph'>
G.number_of_nodes()=3774771, G.number_of_edges()=16518950

re-running pagerank...
Done in: 0:01:16.790938
new vs. orig nodes: {(3.14159, 'string_in_tuple'), 3774769, 3774770}
Total time: 0:04:50.379710

nx-cugraph backend used. nx-cugraph does not yet support nx.coloring.greedy_color() or nx.coloring.equitable_coloring.pad_graph(). Note that the first call to coloring includes the conversion to a networkx Graph, while the second uses the cached conversion:

(nx) root@8546eec3d49d:~# NETWORKX_BACKEND_PRIORITY=cugraph python zcc_demo.py

from_pandas_edgelist...
Done in: 0:00:00.664462
<class 'nx_cugraph.classes.digraph.DiGraph'>

number of nodes and edges...
G.number_of_nodes()=3774768, G.number_of_edges()=16518948
Done in: 0:00:00.000008

pagerank...
Done in: 0:00:03.741143

coloring...
Done in: 0:01:11.706015

coloring (again)...
Done in: 0:00:11.752219

adding a node...
Done in: 0:00:13.415563
<class 'nx_cugraph.classes.digraph.DiGraph'>
G.number_of_nodes()=3774769, G.number_of_edges()=16518949

re-running pagerank...
Done in: 0:00:00.878451
new vs. orig nodes: {(3.14159, 'string_in_tuple')}

pad_graph (this mutates the input graph)...
Done in: 0:00:13.069187
<class 'nx_cugraph.classes.digraph.DiGraph'>
G.number_of_nodes()=3774771, G.number_of_edges()=16518950

re-running pagerank...
Done in: 0:00:00.896314
new vs. orig nodes: {3774769, 3774770, (3.14159, 'string_in_tuple')}
Total time: 0:01:56.123361
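
To compare the two runs step by step, the reported timings can be turned into speedup ratios. The values below are copied from the two outputs above; `parse_elapsed` is a helper written for this example.

```python
from datetime import timedelta


def parse_elapsed(text):
    """Parse an 'H:MM:SS.ffffff' string as printed in the runs above."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


# (no backend, nx-cugraph backend), copied from the outputs above
timings = {
    "from_pandas_edgelist": ("0:00:50.219987", "0:00:00.664462"),
    "pagerank": ("0:01:10.388206", "0:00:03.741143"),
    "coloring": ("0:00:13.802888", "0:01:11.706015"),
    "coloring (again)": ("0:00:13.793485", "0:00:11.752219"),
    "adding a node": ("0:00:00.000018", "0:00:13.415563"),
    "Total time": ("0:04:50.379710", "0:01:56.123361"),
}

# A ratio above 1 means the nx-cugraph run was faster for that step.
speedups = {
    label: parse_elapsed(base) / parse_elapsed(accel)
    for label, (base, accel) in timings.items()
}

for label, ratio in speedups.items():
    print(f"{label}: {ratio:.2f}x")
```

The total comes out at roughly 2.5x faster with the backend. The steps that fall back to networkx (the first coloring call and the mutation) have ratios below 1 and are where the remaining time goes.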

Also note, when debug logging is enabled, you can see calls made from within networkx functions being dispatched appropriately:

pad_graph (this mutates the input graph)...
DEBUG:networkx.utils.backends:no backends are available to handle the call to `pad_graph` with graph types {'cugraph'}
DEBUG:networkx.utils.backends:falling back to backend 'networkx' for call to `pad_graph' with args: (<nx_cugraph.classes.digraph.DiGraph object at 0x7efb84138d60>, 11), kwargs: {}
DEBUG:networkx.utils.backends:using backend 'cugraph' for call to `complete_graph' with args: (2, None), kwargs: {}
DEBUG:networkx.utils.backends:using backend 'cugraph' for call to `relabel_nodes' with args: (<nx_cugraph.classes.graph.Graph object at 0x7efb84139c60>, {0: 3774769, 1: 3774770}, True), kwargs: {}
Done in: 0:00:13.226258
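
One way to enable this logging with the standard library is sketched below. The logger name is taken from the log lines above, and the default `basicConfig` format matches their `LEVEL:name:message` shape. Whether the demo script enables logging this way is an assumption.

```python
import logging

# Root logger stays at WARNING so other libraries remain quiet.
logging.basicConfig(level=logging.WARNING)

# Turn on DEBUG only for the NetworkX dispatch logger seen in the output above.
backend_logger = logging.getLogger("networkx.utils.backends")
backend_logger.setLevel(logging.DEBUG)
```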

zcc_demo.py.txt

@rlratzel added the "improvement" (Improvement / enhancement to an existing function) and "non-breaking" (Non-breaking change) labels Jul 27, 2024
@github-actions bot removed the "conda" label Jul 31, 2024
@rlratzel changed the title from "Updates to make nx_cugraph.Graph a subclass of nx.Graph, adds attrs for auto-dispatch for generators" to "Updates to make nx_cugraph.Graph a drop-in replacement for nx.Graph, adds attrs for auto-dispatch for generators" Jul 31, 2024
@rlratzel added this to the 24.10 milestone Jul 31, 2024
Labels: improvement (Improvement / enhancement to an existing function), non-breaking (Non-breaking change), python