Use P2322R6 to determine intermediate types for relevant algorithms #428
Comments
Agreed -- this should use either the initial value type or input iterator value type for consistency with https://wg21.link/P0571. We made this change a while ago for the scan algorithms and need to update the others as well.
`cub::DeviceSegmentedReduce::Reduce` with different input and output types
Updated the title to be more general, as we should verify that the other algorithms listed in the proposal are following these conventions, too.
@brycelelbach mentioned that there's an updated version of this proposal under a different name -- check with him to get the latest recommendations.
The alternative approach is described in P2322R6.
There's a newer proposal than P0571 that we'll be using (@senior-zero linked this paper above). The intermediate type is determined by the result of the operator.
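To make the difference concrete, here is a minimal host-only sketch of the two conventions discussed in this thread. It assumes C++17 and is not CUB code: `accum_t`, `reduce_sketch`, `reduce_in_init_type`, and `run_example` are hypothetical names chosen for illustration. `accum_t` follows the P2322R6 `fold_left` rule as I understand it, where the accumulator is the decayed result of calling the operator with the initial value and a dereferenced input iterator.

```cpp
#include <cassert>
#include <functional>
#include <iterator>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical helper (not part of CUB): accumulator type in the style of
// P2322R6 fold_left, i.e. the decayed result of op(init, *it).
template <class Op, class Init, class It>
using accum_t = std::decay_t<std::invoke_result_t<
    Op&, Init, typename std::iterator_traits<It>::reference>>;

// Hypothetical sequential reduction that accumulates in accum_t.
template <class It, class Init, class Op>
accum_t<Op, Init, It> reduce_sketch(It first, It last, Init init, Op op)
{
  using U = accum_t<Op, Init, It>;
  if (first == last) return U(std::move(init));
  U acc = op(std::move(init), *first);
  for (++first; first != last; ++first)
    acc = op(std::move(acc), *first);
  return acc;
}

// For contrast: accumulate in the initial value type instead.
template <class It, class Init, class Op>
Init reduce_in_init_type(It first, It last, Init init, Op op)
{
  Init acc = std::move(init);
  for (; first != last; ++first)
    acc = static_cast<Init>(op(acc, *first));  // converts back on every step
  return acc;
}

// Small usage example: int initial value, double inputs.
inline void run_example()
{
  std::vector<double> v{0.5, 0.5, 0.5, 0.5};
  using It = std::vector<double>::iterator;

  // int + double yields double, so the operator result picks double.
  static_assert(std::is_same_v<accum_t<std::plus<>, int, It>, double>, "");

  assert(reduce_sketch(v.begin(), v.end(), 0, std::plus<>{}) == 2.0);
  // Accumulating in int truncates each partial sum to 0.
  assert(reduce_in_init_type(v.begin(), v.end(), 0, std::plus<>{}) == 0);
}
```

With an `int` initial value and `double` inputs, the operator-result rule keeps the fractional parts, while accumulating in the initial value type discards them at every step.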
Relevant changes were merged in the following PRs:
The `reduction_op` is called with a type derived from the output iterator. Example:
If this snippet is compiled with CUDA 11.4.48, CUB at commit 93f26ab, and Thrust at commit 0b00326becfdd7a78182b36d0752c41b341863b2 (the current state of the default branches of CUB and Thrust), you receive the error:
The compilation also fails with the CUB and Thrust installation that ships with CUDA Toolkit 11.4.48.
Clearly, the reduction operator is invoked with a type based on `result_t`. The documentation currently says:
I think it is intuitive to assume that the reduction operator is called with a type derived from the input iterator and not from the output iterator. So maybe the documentation can state more precisely how type `T` is derived.