This repository has been archived by the owner on Mar 21, 2024. It is now read-only.
Add libcu++ dependency; initial round of NV_IF_TARGET ports. #448
Merged
alliepiper force-pushed the if_target_prep branch from 3886111 to 9414e43 on March 24, 2022 21:34
alliepiper changed the title from "libcudacxx, if-target prep" to "Add libcu++ dependency; initial round of NV_IF_TARGET ports." on Mar 24, 2022
alliepiper force-pushed the if_target_prep branch from 9414e43 to b1bbe02 on April 4, 2022 13:40
alliepiper added the following labels and removed the "blocked" label (Currently cannot make progress.) on Apr 4, 2022:
type: enhancement (New feature or request.)
P0: must have (Absolutely necessary. Critical issue, major blocker, etc.)
helps: nvc++ (Helps or needed by NVC++.)
release: breaking change (Include in "Breaking Changes" section of release notes.)
gevtushenko suggested changes on Apr 11, 2022:
A lot of code is much cleaner now, thanks! There are a few minor changes that need to be addressed.
alliepiper force-pushed the if_target_prep branch from b1bbe02 to 3efed83 on April 13, 2022 21:09
robertmaynard approved these changes on May 3, 2022
alliepiper force-pushed the if_target_prep branch from 3efed83 to b523fc5 on May 10, 2022 21:20
gevtushenko approved these changes on May 11, 2022:
Is there a plan on how to address dynamic shared memory allocation without PTX_ARCH when we support redux? If so, this can be merged.
We'll need to use |
alliepiper force-pushed the if_target_prep branch from b523fc5 to f037174 on May 16, 2022 21:26
nvc++ will stop defining __NVCOMPILER_CUDA_ARCH__ soon, removing the ability to determine the PTX arch at compile time. This updates agents and collective algorithms to no longer require the PTX_ARCH template parameter, and changes CUB_WARP_SIZE(PTX_ARCH) and similar helpers to ignore their argument. These macros only differed on obsolete architectures, so the change has no effect on currently supported architectures.
This fixes the issue reported in NVIDIA#299. There's no clear reason why this should use `RandomBits` unconditionally.
The merge sort test with pow2 >20 fails on GTX 1650. Detect bad_alloc failures and skip those tests. Tests for smaller problem sizes will still fail if there's a bad_alloc.
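The skip-on-bad_alloc behavior described in the last commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `TestOutcome`, `RunMergeSortCase`, `RunOrSkip`, and the `capacity_items` parameter are hypothetical names, and the out-of-memory condition is simulated on the host rather than triggered by a real device allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical outcome type; not part of the CUB test harness.
enum class TestOutcome { Passed, Skipped };

// Hypothetical test body. It throws std::bad_alloc when the problem does not
// fit into `capacity_items`, simulating a device that runs out of memory.
inline void RunMergeSortCase(std::size_t num_items, std::size_t capacity_items)
{
  if (num_items > capacity_items)
  {
    throw std::bad_alloc();
  }
  // A real test would sort and verify the data here.
}

// Runs a case with 2^pow2 items. A bad_alloc is tolerated (the case is
// skipped) only for pow2 > 20; for smaller sizes it is rethrown so the test
// still fails.
inline TestOutcome RunOrSkip(int pow2, std::size_t capacity_items)
{
  assert(pow2 >= 0 && pow2 < 31);
  constexpr int largest_required_pow2 = 20; // threshold taken from the text above
  try
  {
    RunMergeSortCase(std::size_t{1} << pow2, capacity_items);
    return TestOutcome::Passed;
  }
  catch (const std::bad_alloc &)
  {
    if (pow2 > largest_required_pow2)
    {
      return TestOutcome::Skipped;
    }
    throw; // small problem sizes must not run out of memory
  }
}
```

With a simulated capacity of 2^20 items, `RunOrSkip(21, ...)` reports a skip, while a bad_alloc at `pow2 = 10` still propagates to the caller as a failure.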
alliepiper force-pushed the if_target_prep branch from f037174 to 4de961a on May 16, 2022 22:05
Requires NVIDIA/thrust#1605.
This PR contains an initial set of changes necessary to migrate Thrust and CUB to `NV_IF_TARGET` and remove dependence on `__CUDA_ARCH__`. It does not fully remove all usages of `__CUDA_ARCH__`, but rather focuses on the following: `#ifdef __CUDA_ARCH__` to use `NV_IF_TARGET`.
This also includes various bug fixes for issues exposed by the above.
Future PRs will address the remaining usages of `__CUDA_ARCH__` in the CDP macros and the kernel dispatch infrastructure.
Pre-written Release Notes
Breaking Changes
- #448: Add libcu++ dependency.
- #448: The following macros are no longer defined by default. They can be re-enabled by defining `CUB_PROVIDE_LEGACY_ARCH_MACROS`. These will be completely removed in a future release.
  - `CUB_IS_HOST_CODE`: Replace with `NV_IF_TARGET`.
  - `CUB_IS_DEVICE_CODE`: Replace with `NV_IF_TARGET`.
  - `CUB_INCLUDE_HOST_CODE`: Replace with `NV_IF_TARGET`.
  - `CUB_INCLUDE_DEVICE_CODE`: Replace with `NV_IF_TARGET`.
Other Enhancements
- #448: Removed special case code for unsupported CUDA architectures.
- #448: Replace several usages of `__CUDA_ARCH__` with `<nv/target>` to handle host/device code divergence.
- #448: Mark unused PTX arch parameters as legacy.
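As a rough illustration of the migration the release notes describe, the sketch below selects a host or device code path with `NV_IF_TARGET` instead of testing `__CUDA_ARCH__` or a legacy macro such as `CUB_IS_HOST_CODE`. It is a minimal sketch under stated assumptions, not code from this PR: `IsHostPath` and the `SKETCH_*` macros are hypothetical, and the fallback definitions of `NV_IF_TARGET`, `NV_IS_HOST`, and `NV_IS_DEVICE` are a host-only stand-in so the example also builds where `<nv/target>` is not installed. They are not the real libcu++ implementation.

```cpp
// Prefer the real header when it is available.
#if defined(__has_include)
#  if __has_include(<nv/target>)
#    include <nv/target>
#  endif
#endif

// Hypothetical host-only stand-in, used only when <nv/target> is missing.
// It always selects the NV_IS_HOST branch and discards the other one.
#ifndef NV_IF_TARGET
#  define SKETCH_STRIP(...) __VA_ARGS__
#  define SKETCH_CAT(a, b) a##b
#  define SKETCH_PICK_SketchHost(t, f) { SKETCH_STRIP t }
#  define SKETCH_PICK_SketchDevice(t, f) { SKETCH_STRIP f }
#  define SKETCH_PICK(cond) SKETCH_CAT(SKETCH_PICK_, cond)
#  define NV_IS_HOST SketchHost
#  define NV_IS_DEVICE SketchDevice
#  define NV_IF_TARGET(cond, t, f) SKETCH_PICK(cond)(t, f)
#endif

// Annotate for both execution spaces when compiled as CUDA.
#if defined(__CUDACC__)
#  define SKETCH_HOST_DEVICE __host__ __device__
#else
#  define SKETCH_HOST_DEVICE
#endif

// Legacy style (shown for comparison only):
//   #ifdef __CUDA_ARCH__
//     return false;
//   #else
//     return true;
//   #endif
//
// NV_IF_TARGET style: each branch is a parenthesized block of statements.
SKETCH_HOST_DEVICE inline bool IsHostPath()
{
  NV_IF_TARGET(NV_IS_HOST,
               (return true;),
               (return false;));
}
```

When built with a plain host compiler, `IsHostPath()` returns `true`; in device code compiled by a CUDA compiler the second branch would be selected instead.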