Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Fix auto-merger #2771

Conversation

galipremsagar
Copy link
Contributor

@galipremsagar galipremsagar commented Oct 4, 2022

This PR fixes auto-merger: #2744

cjnolet and others added 24 commits September 27, 2022 16:38
Closes rapidsai#2678 

raft updated some k-means logic and dramatically increased the compile time of our spectral clustering implementation.  This PR adds an include that will suppress the expansion of templates that we are not using.

Authors:
  - Corey J. Nolet (https://github.com/cjnolet)
  - Chuck Hastings (https://github.com/ChuckHastings)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Ray Douglass (https://github.com/raydouglass)
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#2739
Add a primitive to support randomly selecting (without biases) vertex neighbors.

Also, refactored primitive tests to avoid duplicating code for vertex/edge source/edge destination property value generation.

Authors:
  - Seunghwa Kang (https://github.com/seunghwak)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Naim (https://github.com/naimnv)

URL: rapidsai#2703
This PR enables the PLC `algos` to leverage the `PLC `graph created prior to running the `algos`. This will enable the exclusion of the graph creation time from the algo's run time for more accurate benchmarks. This PR also updates the tests accordingly (by skipping the `cython.cu `renumbering) along with the docstrings.

closes rapidsai#2375
closes rapidsai#2374

Authors:
  - Joseph Nke (https://github.com/jnke2016)

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#2682
…ling algos instead of copying result data (rapidsai#2684)

## Summary
* (partially) closes rapidsai/GaaS#23
* Added `SamplingResult` cython extension class that corresponds to the C API's `cugraph_sample_result_t` type, which removes a copy operation in sampling algos.
* Added new testing APIs and unit tests to validate correctness and that memory is cleaned up as expected when all refs to results are removed (see `test_neighborhood_sampling.py::test_sample_result`)
* Updated `build.sh` to use `ninja` by default.

## Background
As part of a larger effort to provide [GPU-GPU data transfer in `cugraph_service`](rapidsai/GaaS#23) to improve performance, the need to minimize redundant data copy operations was also something to address.

Prior to this change, PLC algos would copy their results from the C API device data structures into one or more cupy ndarrays, which allowed the algo to then delete the result data owned by the C API and leave the lifecycle management of the results returned to python to the python interpreter/GC.  The copy operation not only slowed down sampling algo calls (slightly) but also caused spikes in GPU memory usage where two copies of results existed in GPU memory for a short time immediately after the copy but before the original was deleted.  

This PR adds a cython extension class that corresponds to the C API's `cugraph_sample_result_t` type, which is used to pass ownership of the sampling results to python.  The extension class calls `cugraph_sample_result_free()` when all references to the result data in python are removed and the garbage collector runs.  The owning `SamplingResult` object allows PLC sampling algos to instead return cupy ndarray "views" for results instead of copying them to new cupy ndarrays.

New test APIs to create sampling result objects in C and PLC were added and used in the new tests to verify that the ownership of the result data was properly transferred to Python, and that the lifecycle management of results was done properly when clients keep references to data (eg. pass results to other parts of code for use long after the algo completes).

A benchmark was also added to verify the (minor) speedup. Runs `0001-2` were before the change, runs `0003-5` (red box) were after:
![image](https://user-images.githubusercontent.com/3039903/189554855-6d24733e-1fe2-4ef5-841b-95085a6a609c.png)

Authors:
  - Rick Ratzel (https://github.com/rlratzel)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Jordan Jacobelli (https://github.com/Ethyling)
  - Brad Rees (https://github.com/BradReesWork)
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: rapidsai#2684
This PR fixes rapidsai/graph_dl#27

This PR fixes rapidsai/graph_dl#43

This PR fixes rapidsai/graph_dl#39



**Tests Added:** 

Single GPU:
- [x]  APIs like num_nodes, num_edges
- [x]  test_sampling_basic
- [x]  test_sampling_homogeneous_gs_in_dir
- [x]  test_sampling_homogeneous_gs_out_dir
- [x]  test_sampling_gs_homogeneous_neg_one_fanout
- [x]  test_sampling_gs_heterogeneous_in_dir 
- [x]  test_sampling_gs_heterogeneous_out_dir
- [x]  test_sampling_gs_heterogeneous_neg_one_fanout 



Multi GPU:
- [x] APIs like num_nodes, num_edges
- [x] test_sampling_basic
- [x]  test_sampling_homogeneous_gs_in_dir
- [x]  test_sampling_homogeneous_gs_out_dir 
- [x] test_sampling_gs_homogeneous_neg_one_fanout 
- [x] test_sampling_gs_heterogeneous_in_dir 
- [x] test_sampling_gs_heterogeneous_out_dir
- [x] test_sampling_gs_heterogeneous_neg_one_fanout 


Bugs to reproduce:
- [ ] Repro heterogeneous single gpu hang outside pytest
- [x] Repro hetrogenous multi gpu incorrect results for with_replace=False 
rapidsai#2760
- [x] Repro hetrogenous incorrect results for different amount of GPUs
rapidsai#2761

Tests that depend upon rapidsai#2523 
- [x] Add minimal example to PR to ensure it gets fixed
Added comment here: rapidsai#2523 (review)
- [x]  test_get_node_storage_gs  (Failing cause of a PG bug) 
- [x]  test_get_edge_storage_gs  (Failing cause of a PG bug) 
- [x] test_get_node_storage_gs (Failing cause of a PG bug) 
- [x] test_get_edge_storage_gs (Failing cause of a PG bug)

Authors:
  - Vibhu Jawa (https://github.com/VibhuJawa)

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)
  - Brad Rees (https://github.com/BradReesWork)

URL: rapidsai#2592
…mplementation to follow. (rapidsai#2154)

#2109 describes additional use cases that PropertyGraph needs to support.  This PR adds tests to ensure those are supported.

Authors:
  - Rick Ratzel (https://github.com/rlratzel)

Approvers:
  - Brad Rees (https://github.com/BradReesWork)
  - Alex Barghi (https://github.com/alexbarghi-nv)
  - Vibhu Jawa (https://github.com/VibhuJawa)

URL: rapidsai#2154
~Currently, this only does SG version for rapidsai#2401.  MG is still TODO.~

Closes rapidsai#2401

This also doesn't change anything user-facing (yet).

Authors:
  - Erik Welch (https://github.com/eriknw)
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#2523
Fix warnings due to recent raft deprecation and more.

Authors:
  - Seunghwa Kang (https://github.com/seunghwak)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Joseph Nke (https://github.com/jnke2016)

URL: rapidsai#2755
This PR refactors Louvain by leveraging the C API and does minor updates to the python tests
closes rapidsai#2493

Authors:
  - Joseph Nke (https://github.com/jnke2016)
  - Chuck Hastings (https://github.com/ChuckHastings)

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)
  - Chuck Hastings (https://github.com/ChuckHastings)

URL: rapidsai#2705
- [x] C API Changes
- [x] Pylibcugraph API Changes

Closes rapidsai#2577 
Closes rapidsai#2576 
Closes rapidsai#2579 

<s>Note: this adds a UINT8 type which is the preferred data type for edge types.  The upcoming updates to the C++ code should only specialize `uint8_t` for `edge_type_t` in order to avoid unnecessary compilation.</s>

- [x] Update to work with `int32_t`

Authors:
  - Alex Barghi (https://github.com/alexbarghi-nv)
  - Seunghwa Kang (https://github.com/seunghwak)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#2629
…pidsai#2765)

Resolves a critical bug where the whole start list was sent to every vertex resulting in duplicate samples.
Closes rapidsai#2760

Authors:
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - Vibhu Jawa (https://github.com/VibhuJawa)
  - Brad Rees (https://github.com/BradReesWork)

URL: rapidsai#2765
This PR adds `raft-dask` & `pylibraft` to `update-version.sh`

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Ray Douglass (https://github.com/raydouglass)

URL: rapidsai#2763
Closes rapidsai#2399

This is ~not yet~ *now* implemented for MG.

Authors:
  - Erik Welch (https://github.com/eriknw)

Approvers:
  - Vibhu Jawa (https://github.com/VibhuJawa)
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#2510
…2725)

With rapids-cmake now requiring CMake 3.23.1 update consumers to correctly express this requirement

Authors:
  - Robert Maynard (https://github.com/robertmaynard)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Jordan Jacobelli (https://github.com/Ethyling)
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#2725
Closes rapidsai#2581 
Closes rapidsai#2582
Closes rapidsai#2665

Update the neighborhood sampling algorithm to use the new neighborhood sampling primitive defined in rapidsai#2703

Authors:
  - Chuck Hastings (https://github.com/ChuckHastings)

Approvers:
  - Seunghwa Kang (https://github.com/seunghwak)

URL: rapidsai#2751
This PR pins `dask` and `distributed` to `2022.9.2` for `22.10` release.

xref: rapidsai/cudf#11822

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Ray Douglass (https://github.com/raydouglass)
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#2758
Certain algorithms (Katz, HITS, PageRank, Eigenvector centrality) require the flag `store_transposed` to be set to `True` for optimal performance. Although the CAPI internally `transposed` the graph at the algo's call if it wasn't at the graph creation, this adds extra overheads. This PR raises an exception if the user doesn't set the flag to `True` at the graph creation

closes rapidsai#2742

Authors:
  - Joseph Nke (https://github.com/jnke2016)

Approvers:
  - Alex Barghi (https://github.com/alexbarghi-nv)
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#2756
…2750)

Closes rapidsai#2543 

Implements unweighted similarity algorithms using the new primitive defined in rapidsai#2728.

Weighted implementations will be tracked by issue rapidsai#2749

Authors:
  - Chuck Hastings (https://github.com/ChuckHastings)

Approvers:
  - Naim (https://github.com/naimnv)
  - Seunghwa Kang (https://github.com/seunghwak)

URL: rapidsai#2750
…2757)

Closes rapidsai#2565

Should we try to avoid creating conflicting edge IDs (for example, when user doesn't provides _all_ edge IDs).  Should we expose `last_edge_id`.

Authors:
  - Erik Welch (https://github.com/eriknw)

Approvers:
  - Vibhu Jawa (https://github.com/VibhuJawa)
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#2757
updating api docs for new classes

Authors:
  - Don Acosta (https://github.com/acostadon)

Approvers:
  - Brad Rees (https://github.com/BradReesWork)

URL: rapidsai#2754
@galipremsagar galipremsagar requested review from a team as code owners October 4, 2022 17:28
@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@galipremsagar galipremsagar added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Oct 4, 2022
@raydouglass raydouglass merged commit de0a714 into rapidsai:branch-22.12 Oct 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.