This PR addresses Thrust 1.10 breaking changes. Thrust 1.10 landed in CUDA 11.2 and will land in ROCm 4.2.

As discussed in NVIDIA/thrust#1379 (which internally references NVIDIA/thrust#1176), the behavior of `exclusive_scan` and `inclusive_scan` changed in the case where the input and output types are not the same. The compiler emits no deprecation warning or error. With Thrust 1.10, before this PR, the `exclusive_scan` calls that used `make_transform_iterator` for their input silently generated incorrect results. That means HYPRE is broken on GPUs today with CUDA 11.2 without this PR. There are a couple of ways to fix this; what I did for `exclusive_scan` was to use the API that takes an explicit initial value, and that was enough to fix the issue. It did not appear to me that any `inclusive_scan` calls in HYPRE were affected. My tests with this PR against a ROCm 4.2 release candidate pass.

In addition, Thrust 1.10 deprecated the use of C++ dialects older than C++14, so I've added `-std=c++14` to the `HIPCXXFLAGS` argument in `configure.in` (and bootstrapped).

Thrust 1.12 introduces similar breakages for the `scan_by_key` cousins; see NVIDIA/thrust#1376. The fixes are similar: I dropped in explicit casts to `HYPRE_BigInt` in the (already existing) initial value for `exclusive_scan_by_key` (commit 9a3bb66). I have not tried to address the `inclusive_scan_by_key` cases, but I do believe they will be broken. I strongly recommend adding unit tests for those calls and adding them to the test suite. Thrust 1.12 is supposedly going to land in CUDA 11.4 (at least according to the Thrust release page). I do not know when it will land in a ROCm release.

Thank you.
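
For reviewers who want to see why the type of the initial value matters, here is a minimal host-only sketch. It uses the C++17 `std::exclusive_scan` rather than `thrust::exclusive_scan`, so it can be compiled without a GPU toolchain. It assumes Thrust 1.10 picks its accumulator type in a similar init-driven way, which is not verified here. `BigInt`, `offsets_wide_init`, and `offsets_narrow_init` are hypothetical names for illustration only, not HYPRE code.

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a 64-bit index type such as HYPRE_BigInt.
using BigInt = std::uint64_t;

// Exclusive prefix sum of 32-bit counts into 64-bit offsets.
// The initial value has type BigInt, so the running sum is carried in 64 bits.
std::vector<BigInt> offsets_wide_init(const std::vector<std::uint32_t>& counts) {
    std::vector<BigInt> out(counts.size());
    std::exclusive_scan(counts.begin(), counts.end(), out.begin(), BigInt{0});
    return out;
}

// Same scan, but the initial value has the narrow input type.
// The running sum is then carried in 32 bits and wraps modulo 2^32
// before it is widened on output, with no compiler diagnostic.
std::vector<BigInt> offsets_narrow_init(const std::vector<std::uint32_t>& counts) {
    std::vector<BigInt> out(counts.size());
    std::exclusive_scan(counts.begin(), counts.end(), out.begin(), std::uint32_t{0});
    return out;
}
```

With counts `{3000000000, 3000000000, 1}`, the wide version yields `{0, 3000000000, 6000000000}`, while the narrow version yields `1705032704` in the last slot because the 32-bit running sum wraps. A unit test that compares a device scan against a serial reference of this kind would catch such silent mismatches.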