Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] convert sycl::device to sycl::device* for better handling #504

Merged
merged 6 commits into from
Nov 5, 2021

Conversation

abagusetty
Copy link
Contributor

No description provided.

// single-tile GPUs
else {
if (local_nDevices == device_id) {
hypre_HandleDevice(hypre_handle_) = &(gpu_devices[i]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you set hypre_HandleDevice(hypre_handle_) here, you need to make sure it is not overwritten below (line 143). Also, if you want to use hyrpe_SetDevice(), you need to do corresponding implementation for hypre_GetDevice(), which still has no implementation for sycl.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, hypre_GetDevice() method is still undefined for SYCL. I guess #if defined(HYPRE_USING_GPU) && !defined(HYPRE_USING_SYCL) fixes the resetting the value.

auto subDevicesDomainNuma = gpu_devices[i].create_sub_devices<sycl::info::partition_property::partition_by_affinity_domain>(sycl::info::partition_affinity_domain::numa);
for (auto &tile : subDevicesDomainNuma) {
if (local_nDevices == device_id) {
hypre_HandleDevice(hypre_handle_) = &tile;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So does this consider each tile to be a separate device? What is the effect of this vs. setting the environment variable that Brian had mentioned: export ZE_AFFINITY_MASK=0.0? Does this do the same thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does the same but just generalizes. ZE_AFFINITY_MASK=0.0 is designed to be set for device ID 0 and tile 0 and not to be a scalable solution.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good. Is there ever a case where you would want to treat all tiles as a single device? Either way, I think I'm happy using this for now at least.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Treating a device as 2 separate tiles (explicit scaling) and device as a whole (implicit scaling) are two options. Since majority of the Apps interacting with Hypre are exploring explicit mode, hence the separate tiles. But I want to add implicit scaling too. I will add macros like the following under HYPRE_USING_SYCL for a user to choose either explicit or implicit.

For e.g., HYPRE_SYCL_EXPLICIT and HYPRE_SYCL_IMPLICIT

else {
nDevices++;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code above looks like it replicates the new implementation in hypre_GetDeviceCount(). Can we just call that here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@abagusetty
Copy link
Contributor Author

@waynemitchell Are there any changes for this PR

@waynemitchell
Copy link
Contributor

@abagusetty Looks good. I'll go ahead and bring this in.

@waynemitchell waynemitchell merged commit 3254e31 into hypre-space:sycl Nov 5, 2021
@abagusetty abagusetty deleted the wayne_sycl branch November 5, 2021 00:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants