Performance tuning the sampling primitive for multi-node multi-GPU systems. #3169
seunghwak commented on Jan 23, 2023 (edited)
- Update the groupby code in multi-GPU communication to use atomics-based partitioning instead of sort-based partitioning. With the atomics performance improvements in recent NVIDIA GPUs, the atomics-based approach is now significantly faster than the sort-based approach, provided the number of groups is not excessive.
- In random index generation, add a code path to handle high-degree vertices when with_replacement = false.
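
As a rough illustration of the two ideas above, here is a CPU-only Python sketch; it is not the code in this PR. `partition_by_counting` groups values by a destination key using per-group counters (the increments mark where a GPU kernel would use atomic adds), and `partition_by_sorting` is the sort-based equivalent for comparison. `sample_without_replacement` uses Floyd's algorithm, which is one well-known way to draw k distinct indices from a large range without materializing it; the PR does not say which method it uses for high-degree vertices, so that choice is an assumption. All function and parameter names are hypothetical.

```python
import random
from itertools import accumulate


def partition_by_counting(keys, values, num_groups):
    """Group values by key without sorting: count, scan, scatter."""
    counts = [0] * num_groups
    for k in keys:
        counts[k] += 1  # a GPU kernel would do an atomic add here
    # Exclusive scan: offsets[g] is where group g starts in the output.
    offsets = [0] + list(accumulate(counts))
    cursors = offsets[:-1].copy()  # next free slot in each group
    out = [None] * len(values)
    for k, v in zip(keys, values):
        out[cursors[k]] = v  # GPU analogue: slot = atomicAdd(&cursors[k], 1)
        cursors[k] += 1
    return out, offsets


def partition_by_sorting(keys, values):
    """Sort-based equivalent: stable sort of element indices by key."""
    order = sorted(range(len(keys)), key=keys.__getitem__)
    return [values[i] for i in order]


def sample_without_replacement(degree, k, rng=random):
    """Floyd's algorithm: k distinct indices in [0, degree) using O(k) memory.

    Assumption: this stands in for the high-degree, with_replacement = false
    case; it is not taken from the PR.
    """
    if k > degree:
        raise ValueError("cannot draw more distinct indices than the degree")
    selected = set()
    for j in range(degree - k, degree):
        t = rng.randint(0, j)  # inclusive on both ends
        # If t was already picked, j itself is guaranteed to be new.
        selected.add(j if t in selected else t)
    return sorted(selected)
```

The counting version does a constant amount of work per element plus a scan over the groups, whereas the sort-based version pays for a full sort. Its per-group counters and offsets grow with the number of groups, which is consistent with the caveat above about the group count not being excessive. On a GPU the order of elements within a group would depend on the order in which the atomics complete; the sequential sketch happens to preserve input order.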
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##          branch-23.04    #3169   +/- ##
===============================================
  Coverage             ?   56.26%
===============================================
  Files                ?      153
  Lines                ?     9658
  Branches             ?        0
===============================================
  Hits                 ?     5434
  Misses               ?     4224
  Partials             ?        0
…ample_prim_perf_draco
Removing |
(Approving for python-codeowners to unblock merging, but I didn't review any code here. I'm assuming python-codeowners was added because of a file change that is no longer in this PR.)
Yes, this is due to branch re-targeting.
/merge