
Fix regression of temporary_allocator with non-CUDA back ends #1103

Merged
merged 1 commit into NVIDIA:master on Apr 16, 2020

Conversation

dkolsen-pgi
Collaborator

A recent cleanup of functions that have different code for host and device
accidentally broke temporary_allocator for non-CUDA back ends. An #if
condition that protects a piece of code that only works with the CUDA back
end was incorrectly removed. This fix adds that condition back in.

Fix issue #1101

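For context, the restored guard has roughly the following shape. This is a minimal sketch built from the macros discussed in this thread, not the literal diff: report_allocation_failure() and cuda_only_device_abort() are hypothetical stand-ins for the failure path and the CUDA-specific device code in temporary_allocator, and THRUST_IS_HOST_CODE / THRUST_INCLUDE_HOST_CODE are the host-side counterparts of THRUST_INCLUDE_DEVICE_CODE from the same cleanup.

```cpp
// Minimal sketch, not the literal patch. The device-side branch calls into
// CUDA-specific machinery, so it must be compiled out whenever a non-CUDA
// device system (CPP, OMP, TBB) is selected; that extra check is the #if
// condition this PR adds back.
#include <thrust/detail/config.h>

#include <cstdio>
#include <cstdlib>

// Hypothetical CUDA-only helper standing in for the real device-side code.
__host__ __device__ void cuda_only_device_abort();

__host__ __device__ inline void report_allocation_failure()
{
  if (THRUST_IS_HOST_CODE) {
    #if THRUST_INCLUDE_HOST_CODE
      std::fprintf(stderr, "temporary_allocator: allocation failed\n");
      std::abort();
    #endif
  } else {
    // Without the THRUST_DEVICE_SYSTEM check, this CUDA-specific call could
    // end up in builds that use a non-CUDA back end, which is what broke them.
    #if THRUST_INCLUDE_DEVICE_CODE && THRUST_DEVICE_SYSTEM == THRUST_DEVICE_SYSTEM_CUDA
      cuda_only_device_abort();
    #endif
  }
}
```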
@alliepiper
Collaborator

Should THRUST_INCLUDE_DEVICE_CODE only be set when using the CUDA device system? Maybe we should fix this at the macro definition and rename it to THRUST_INCLUDE_CUDA_DEVICE_CODE. That would disambiguate between Thrust's device system and CUDA device code.

Thoughts?

@dkolsen-pgi
Collaborator Author

Tweaking the definition of THRUST_INCLUDE_DEVICE_CODE was my first thought. But this spot in temporary_allocator is the only place in Thrust that has a special check for THRUST_DEVICE_SYSTEM_CUDA; everywhere else, device code is generated normally even in THRUST_DEVICE_SYSTEM_CPP mode. So after looking into it further, I didn't want to change the general macro and instead restored the special-case code that had been in temporary_allocator.
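For readers who haven't built Thrust without CUDA: the device back end is selected at compile time through THRUST_DEVICE_SYSTEM, and with the CPP or OMP systems the whole program is compiled by an ordinary host compiler. The sketch below shows roughly the kind of configuration the regression broke; the flags, paths, and test program are illustrative, not taken from issue #1101.

```cpp
// repro.cpp -- illustrative only. Build with a plain host compiler and a
// non-CUDA device system, for example:
//   g++ -O2 -I/path/to/thrust -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_CPP repro.cpp
//   g++ -O2 -fopenmp -I/path/to/thrust -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_OMP repro.cpp
// With the regression, translation units like this one could fail to compile
// because CUDA-only device code in temporary_allocator was no longer guarded out.
#include <thrust/device_vector.h>
#include <thrust/sequence.h>
#include <thrust/sort.h>

int main()
{
  thrust::device_vector<int> v(1 << 20);
  thrust::sequence(v.rbegin(), v.rend());  // fill with a descending sequence
  thrust::sort(v.begin(), v.end());        // sorting may allocate temporary storage
  return 0;
}
```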

@alliepiper
Collaborator

Sounds reasonable, LGTM. I'll land this sometime tomorrow 👍

alliepiper self-assigned this on Apr 15, 2020
@griwes
Collaborator

griwes commented Apr 15, 2020

Now that I've seen this change, it seems a bit unfortunate that we are using the word "device" to mean different things in different contexts (in the context of Thrust types, we use it to mean "in the domain of the device backend", while in the context of the macros, we use it to mean "on the GPU" instead). It's probably not something we should be concerned about right now, but we should think about this.

@hwinkler can you please confirm that this patch solves the issue?

@alliepiper
Collaborator

Shelve 28262454.

@hwinkler
Contributor

hwinkler commented Apr 15, 2020

@griwes the patch checks out here. It compiles our codebase for CPP and OMP.

Update: and for CUDA.

Update 2: Sorry, it's not working after all.

@hwinkler
Contributor

Wait, cancel that approval; sorry, I spoke too quickly.
Please stand by: our own tests are failing under CUDA.

@hwinkler
Contributor

Sorry for the scare. All is well; the patch does check out fine.

Our release build tests pass. The cause of the scare was that I ran the tests under our debug switches, and we have always had certain unit tests that fail under that configuration.

@alliepiper
Collaborator

Thanks for confirming, @hwinkler!

alliepiper merged commit 915c270 into NVIDIA:master on Apr 16, 2020