-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support UR program creation from multiple device binaries #2147
base: main
Are you sure you want to change the base?
Support UR program creation from multiple device binaries #2147
Conversation
89e8d9d
to
47fafe3
Compare
There's an ongoing effort to remove this extension by making it the default behaviour for the affected entry points #1195 as it was a workaround for an issue discovered late in the last release cycle. Since it seems like there's a compelling use case I think this should be a change to |
2b1cf6e
to
584869e
Compare
Thank you, I've changed the PR to modify the core |
550df8a
to
44a2f02
Compare
source/adapters/cuda/program.cpp
Outdated
|
||
UR_CHECK_ERROR( | ||
createProgram(hContext, hDevice, size, pBinary, pProperties, phProgram)); | ||
if (numDevices == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should add this to the yml so it gets generated and used by all adapters, I think it should be an INVALID_SIZE
return:
diff --git a/scripts/core/program.yml b/scripts/core/program.yml
index 3fe2632c..2a0bc14a 100644
--- a/scripts/core/program.yml
+++ b/scripts/core/program.yml
@@ -158,6 +158,7 @@ returns:
- "`NULL != pProperties && pProperties->count > 0 && NULL == pProperties->pMetadatas`"
- $X_RESULT_ERROR_INVALID_SIZE:
- "`NULL != pProperties && NULL != pProperties->pMetadatas && pProperties->count == 0`"
+ - "`numDevices == 0`"
- $X_RESULT_ERROR_INVALID_NATIVE_BINARY:
- "If any binary in `ppBinaries` isn't a valid binary for the corresponding device in `phDevices.`"
--- #--------------------------------------------------------------------------
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed.
@againull looks like there are conformance tests which still need updated. Some conformance tests are only enabled when the |
Oh, I see, thanks, I will check and fix that. |
Fixed remaining failures. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cuda/hip ok.
Currenlty we have multiple maps of device handle to some per-device data like the module handle, logs, binary etc. Instead of having separate map for each, just put that data into a separate structure and have one map.
b44bc43
to
82beeee
Compare
@nrspruit @pbalcer I have enabled existing "program" conformance test (f3fa464), as well as added new multi-device test (88fb72a) to verify |
@oneapi-src/unified-runtime-level-zero-write @oneapi-src/unified-runtime-native-cpu-write @oneapi-src/unified-runtime-command-buffer-write Could you please take a look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you for adding the UR tests!
To support multi-device AOT scenario in SYCL we need an ability to create UR program from multiple device binaries.
Changes in this PR:
urProgramCreateWithBinary
to support program creation from multiple device binaries.ur_program
to get/set per-device data like L0 module handle, build log etc. Otherwise any change in the structure of the class requires multiple changes in all UR functions which work withur_program
, in addition to this it allows to handle interop case (which implementation is an exception right now) in a single place. Also make State and some other info per-device because it is possible that UR program is associated with multiple devices andurProgramBuildExp
is called multiple times for subsets, and we have to know the state per-device.Corresponding intel/llvm PR with E2E tests: intel/llvm#15546
Also outlined minimal changes just to uplift UR tag: intel/llvm#15710