[SYCL] Support sycl::kernel_bundle for multi-device scenario #15546

Open
wants to merge 3 commits into
base: sycl


@againull againull commented Sep 28, 2024

This PR includes:

  • Changes to the program manager methods so that a UR program is correctly created and built for multiple devices. Until now, we mostly used only the first device in the vector to create and build the UR program, which made the program unusable on the other devices.

  • The UR tag update brings in the version of urProgramCreateWithBinary that allows
    creating a UR program from multiple device binaries.

  • Our program cache key allowed only a single device; I have changed it to contain a set of devices. If a UR program is created and built for a set of devices, the same UR program is usable for any subset of that set. That is why, when a program is built for a set of devices, we add all subsets of that set to the cache. Previously we added one record to the cache for each device in the set, which is incorrect. For example, if someone requests a UR program for {dev2, dev3} from the cache, the returned UR program must be usable to submit a kernel to dev3. But with per-device records we could get a program for {dev1, dev2} from the cache, which is unusable on dev3.
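
The cache-key change in the last bullet can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `ProgramCache`, `insertBuilt`, `lookup`, and the integer `DeviceId`/`ProgramHandle`/`ImageId` aliases are hypothetical stand-ins for the runtime's real types, and the real cache key may well contain more than an image and a device set. The sketch only shows the idea of registering a built program under every non-empty subset of its device set.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; the real runtime uses
// UR handles rather than plain integers.
using DeviceId = int;
using ProgramHandle = int;
using ImageId = int;
using DeviceSet = std::set<DeviceId>;
using CacheKey = std::pair<ImageId, DeviceSet>;

class ProgramCache {
public:
  // Record a program built for `devices` under every non-empty subset of
  // that device set, since the program is usable on any such subset.
  void insertBuilt(ImageId image, const std::vector<DeviceId> &devices,
                   ProgramHandle program) {
    const std::size_t n = devices.size();
    assert(n < 32 && "sketch enumerates subsets with a 32-bit mask");
    for (std::uint32_t mask = 1; mask < (1u << n); ++mask) {
      DeviceSet subset;
      for (std::size_t i = 0; i < n; ++i)
        if (mask & (1u << i))
          subset.insert(devices[i]);
      // emplace keeps an existing entry, so an earlier program that already
      // covers this subset is not overwritten.
      cache_.emplace(CacheKey{image, std::move(subset)}, program);
    }
  }

  // Exact-match lookup on the requested device set.
  // Returns nullptr when no cached program covers that set.
  const ProgramHandle *lookup(ImageId image, const DeviceSet &devices) const {
    auto it = cache_.find(CacheKey{image, devices});
    return it == cache_.end() ? nullptr : &it->second;
  }

  std::size_t size() const { return cache_.size(); }

private:
  std::map<CacheKey, ProgramHandle> cache_;
};
```

With this layout, inserting a program built for {dev1, dev2} creates entries for {dev1}, {dev2}, and {dev1, dev2}, and a later request for {dev2, dev3} misses instead of returning a program that cannot run on dev3. In this sketch the number of entries grows as 2^n - 1 with the size of the device set, which is assumed to be small.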

@againull againull force-pushed the sycl_bundle_multi_device branch 2 times, most recently from 3cfe9b7 to 5e4b86e Compare September 30, 2024 20:18
@againull againull marked this pull request as ready for review October 1, 2024 08:06
@againull againull requested review from a team as code owners October 1, 2024 08:06