-
Notifications
You must be signed in to change notification settings - Fork 751
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][PI][CUDA] Implements get_native interoperability #1332
[SYCL][PI][CUDA] Implements get_native interoperability #1332
Conversation
0ecfea1
to
e4e4441
Compare
e4e4441
to
5fb6161
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see how to merge with #1244
5fb6161
to
3406b3e
Compare
3406b3e
to
87e82b4
Compare
The API here only returns native handle from the Pi handle. To support interoperability, we should support APIs for conversion of native_handle->Pi handle. |
Me preference would be for separate API for creating PI objects from native handles, since there is not really anything shared between them. And if we stick with combining conversions to/from native handles in single API (as done in #1244) then we are up to adding remaining pi*Convert interfaces. |
For OpenCL, isn't it just a cast in either case, i.e. if we want the native handle the PI object is simply cast to it and the other way around. For CUDA I'm inclined to agree with you, but the two main issues I see with it is:
|
That is correct, as we've decided to base PI API on OpenCL API as much as possible, meaning the OpenCL PI plugin is really thin.
Not so much a bloat. Just 1 new API for every SYCL class. With program and device are already covered in #1244, there are really just several more.
While this is true, I don't see why this would affect our decision on the questions at hands. BEs are free to implement only subset of PI API. If CUDA BE, for example, is not going to support some of the general interoperability, then it is equally doable with a single or multiple "convert" interfaces. |
On that note, is there a point in not having the convert be two separate functions? If we want to split somewhat unrelated operations, why have the convert function be two-way?
My point here was mostly that each backend will have to implement all the convert functions, even when new are added, whereas having it as one will allow BEs to implement all that they currently support and default to some error if not. Having separate functions and not having them implemented in each backend will likely cause segfaults. |
My 5 cents: I agree with @steffenlarsen , having two APIs proposed by the patch look cleaner. |
How about a single API piGetNativeHandle to return the native handle as introduced in this patch, but separate APIs for constructing PI objects from native handles, i.e. piProgramCreateFromNative/piDeviceCreateFromNative/etc ? |
Why is it better than proposed solutions? To me it has worst from both. |
I think I could argue either way. Having separate functions allows using specific PI api types directly, and its more clear what is happening. However, it adds much more surface to the API for no particular reason. |
Separate APIs would follow the spirit of what PI API is currently doing: separate API for PI objects creation -- pi*Create. There are also talks to make PI API out of SYCL internals and serve the purpose of integrating other run-times with the low-level drivers. I'd rather not "pollute" the API for no good reasons (maintenance cost difference is negligible between the options we are discussing). |
7933926
to
8c05f76
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Last comment from me.
Please, align with @smaslov-intel on the number of PI APIs.
e05eaeb
to
11c2ce6
Compare
Great thanks for doing this!
I totally agree that the created PI objects should not be "hanging", so a reference-counter should be increased (or started where it didn't exist before). But please clarify that in the API description, that it is not just +1 always.
This is a worrying one. We can equally choose not taking the ownership of native objects, and just release PI object after it's reference-count gets to 0, while leaving it up to the user who created/provided the native handle to release its handles. Not sure how they would know that PI no longer uses his handles, though. Anyways, we can think/re-define this later, but in this PR please clarify this important aspect in the PI API descriptions. |
8fcb3c4
to
2cb153c
Compare
Thank you for the feedback. I've adjusted it accordingly.
You're absolutely right. Either way will require user-knowledge, but since this is interoperability I think we can rely on the user not using it completely blindly. The decision should however probably be documented when taken, but as you also suggest this is best left for when the functions are actually implemented in the CUDA backend. |
f2a0879
to
8103c1e
Compare
Implements get_native for CUDA allowing queries for native handles on SYCL objects; queue, event, context, and device. Signed-off-by: Steffen Larsen <steffen.larsen@codeplay.com>
8103c1e
to
2a1d463
Compare
This commit makes get_mem and get_queue of interop_handler return types based on a specified backend. The backend defaults to OpenCL to avoid breakages. Signed-off-by: Steffen Larsen <steffen.larsen@codeplay.com>
Signed-off-by: Steffen Larsen <steffen.larsen@codeplay.com>
PI objects created with the associated piextCreate*WithNativeHandle function takes ownership of the native handle passed to it. For OpenCL this means that the native OpenCL object is not retained, so the PI object effectively owns the reference of the caller. The OpenCL interoperability have been changed to retain the native OpenCL handle as the implementation no longer does it. This is required by the SYCL 1.2.1 specification (Rev6, section 4.3.1.) Signed-off-by: Steffen Larsen <steffen.larsen@codeplay.com>
2a1d463
to
9a67949
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but I would like @smaslov-intel to approve as well before merge
…duler_docs * origin/sycl: [SYCL][PI][CUDA] Implements get_native interoperability (intel#1332) [SYCL] Fix check-sycl test suite on systems w/o OpenCL (intel#1503) [SYCL][Doc] Update ExtendedAtomics documentation (intel#1487) [SYCL][CUDA] Expose context extended deleters on PI API (intel#1483) [SYCL][NFC] Remove a dropped environment variable from a test (intel#1506) [SYCL] Add opencl-aot to sycl-toolchain target (intel#1504) [SYCL] Allow to run deploy LIT tests from particular directory [SYCL][CUDA] Fix LIT testing with CUDA devices (intel#1300) [SYCL] Remove operator name keywords (intel#1501) [Driver][SYCL] Consider .lo files as static archives (intel#1500) [SYCL-PTX] Update the compiler design to describe the CUDA target (intel#1408) [SYCL] Fix library build on Windows (intel#1499) [SYCL][NFC] Refactor lit.cfg.py (intel#1452) [SYCL] Fixed sub-buffer memory allocation update (intel#1486) [SYCL] Ensure proper definition of spirv builtins for SYCL (intel#1393) [SYCL][CUDA] LIT XFAIL/UNSUPPORTED (intel#1303) [SYCL][Doc] Function-type kernel attribute extension (intel#1494)
…c_abi_checks * origin/sycl: (32 commits) [SYCL] Do not force LLVM_INCLUDE_TESTS variable (intel#1505) [SYCL][NFC] Align nd_item members with constructor initialization list (intel#1521) [SYCL] Move get_info_host implementation to header (intel#1514) [SYCL] Always use dynamic CRT for Unit tests (intel#1515) [SYCL][NFC] Temporarily disable sporadically failing test (intel#1526) [SYCL] Fix inline namespaces (intel#1525) [SYCL] Release notes for March'20 DPCPP implementation update (intel#1511) [SYCL][PI][CUDA] Implements get_native interoperability (intel#1332) [SYCL] Fix check-sycl test suite on systems w/o OpenCL (intel#1503) [SYCL][Doc] Update ExtendedAtomics documentation (intel#1487) [SYCL][CUDA] Expose context extended deleters on PI API (intel#1483) [SYCL][NFC] Remove a dropped environment variable from a test (intel#1506) [SYCL] Add opencl-aot to sycl-toolchain target (intel#1504) [SYCL] Allow to run deploy LIT tests from particular directory [SYCL][CUDA] Fix LIT testing with CUDA devices (intel#1300) [SYCL] Remove operator name keywords (intel#1501) [Driver][SYCL] Consider .lo files as static archives (intel#1500) [SYCL-PTX] Update the compiler design to describe the CUDA target (intel#1408) [SYCL] Fix library build on Windows (intel#1499) [SYCL][NFC] Refactor lit.cfg.py (intel#1452) ...
Implements get_native for CUDA allowing queries for native handles on
SYCL objects; queue, event, context, and device.
This is based on the SYCL generalization proposal.
Signed-off-by: Steffen Larsen steffen.larsen@codeplay.com