[REVIEW] Add CMake option for per-thread default stream #4995
Can one of the admins verify this patch?
The code looks fine to enable/disable the feature. I'm mostly concerned about synchronization with cnmem.
CNMeM does extra synchronization in PTDS mode: https://github.com/NVIDIA/cnmem/blob/master/src/cnmem.cpp#L386. So far in my testing I haven't found any problems. Once this is merged we can probably switch it on in our CI and test more thoroughly.
Add to whitelist
Please update the changelog in order to start CI tests. View the gpuCI docs here.
Codecov Report
@@             Coverage Diff              @@
##           branch-0.14    #4995   +/-   ##
============================================
  Coverage        88.44%   88.44%
============================================
  Files               54       54
  Lines            10267    10267
============================================
  Hits              9081     9081
  Misses            1186     1186
Continue to review full report at Codecov.
@rongou This needs a Java codeowners review.
Apologies for the delay in reviewing. The changes here look OK, but they are disconnected from the cpp build. I don't think we ever want to build the cpp with per-thread default stream and the JNI code without it, or vice versa. If there's no good way to ensure the Java CMake settings are linked to the cpp build automatically, then at a minimum we need to update java/README.md with a section on per-thread default stream. It should say that if PER_THREAD_DEFAULT_STREAM was specified during the native build, then it also needs to be specified for the Java build, and show how to do so.
@jlowe I couldn't figure out a way to link the CMake options, so I added a section to the README. Hopefully this is only temporary; we may want to make PTDS the default.
This change adds the option to build cuDF with `--default-stream per-thread` enabled. For code not built with `nvcc`, we pass `-DCUDA_API_PER_THREAD_DEFAULT_STREAM` to gcc.
This is the alternative solution to rapidsai/rmm#352.
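As a rough illustration of what the `-DCUDA_API_PER_THREAD_DEFAULT_STREAM` define means for host code compiled by gcc, the sketch below checks at compile time whether the macro was set for a translation unit. The names `kPerThreadDefaultStream` and `default_stream_mode` are hypothetical helpers invented for this example; they are not part of cuDF or of this PR.

```cpp
#include <string>

// Hypothetical helper, not part of cuDF: true when this translation unit was
// compiled with -DCUDA_API_PER_THREAD_DEFAULT_STREAM, false otherwise.
#ifdef CUDA_API_PER_THREAD_DEFAULT_STREAM
constexpr bool kPerThreadDefaultStream = true;
#else
constexpr bool kPerThreadDefaultStream = false;
#endif

// Returns a human-readable name for the default-stream mode compiled in.
inline std::string default_stream_mode() {
  return kPerThreadDefaultStream ? "per-thread" : "legacy";
}
```

Logging the result of such a check from both the native library and the JNI code at startup could be one way to notice a build where only one side was compiled with the option.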
By default the option is disabled. To enable it:
I've tested this manually on some Spark jobs. For TPC-H Q4, there is about a 25% speedup in my setup (YMMV).
The corresponding change in RMM is rapidsai/rmm#354.
@harrism @jrhemstad @revans2 @jlowe