This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

In place select #503

Merged


gevtushenko
Collaborator

@gevtushenko gevtushenko commented Jun 3, 2022

Select algorithms are based on the decoupled look-back approach. Therefore, subsequent thread blocks are guaranteed to write data strictly after the input data has been read. Unlike the partition family, select writes data from only one side of the array, so it's safe for the input iterator to be equal to the output iterator. The only issue is `LOAD_LDG`, which is used by default: it reads through the read-only data cache, which is only valid when the data is not modified while the kernel runs. Replacing `LOAD_LDG` with `LOAD_CA` leads to a 50% slowdown on Kepler and about a 30% slowdown on Maxwell. To avoid a performance regression on these architectures, I've forbidden in-place execution for the existing overloads and left `LOAD_LDG` there. Since it'd be unfortunate to lose the in-place option, I introduced an in-place overload that takes exactly one iterator argument, used as both input and output. This also addresses the following issue.
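To illustrate why writing from one side of the array makes in-place selection safe, here is a minimal sequential CPU sketch. It is not CUB's implementation: `select_if_in_place` is a hypothetical helper, and the loop only models the ordering guarantee (every write lands at or before the position that was just read) that the decoupled look-back provides across thread blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only, not part of CUB.
// Compacts the selected items to the front of `data` and returns how many
// items were selected. Items past the returned count are left unspecified.
template <typename T, typename Pred>
std::size_t select_if_in_place(std::vector<T>& data, Pred keep)
{
  std::size_t write = 0; // next output slot; grows from the left side only

  for (std::size_t read = 0; read < data.size(); ++read)
  {
    const T value = data[read]; // the item is read before its slot can be overwritten

    if (keep(value))
    {
      // The output never overtakes the input, so unread items stay intact.
      assert(write <= read);
      data[write++] = value;
    }
  }

  return write;
}
```

Calling it on `{1, 2, 3, 4, 5, 6, 7, 8}` with a predicate that keeps even values returns 4 and leaves `2, 4, 6, 8` in the first four slots. A partition-style algorithm would not satisfy the same invariant, because it also writes rejected items from the opposite end of the array.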

The unique family of algorithms reads data outside of the thread block's tile, so running it in place would lead to data races. It's possible to introduce an in-place version, but it'd require more work (caching pre-tile data in temporary storage).
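The idea of caching pre-tile data could look roughly like the CPU sketch below. This is an assumption about how such a version might be structured, not code from this PR: `unique_in_place_tiled`, `tile_size`, and the `pre_tile` buffer (standing in for temporary storage) are hypothetical names. Tiles are processed in order here, so the snapshot is not strictly required on a CPU; it models what concurrent thread blocks would need, since the element before a tile may already have been overwritten by the time a block reads it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch, not CUB's implementation. Removes consecutive
// duplicates in place and returns the number of items kept.
// T must be default-constructible and equality-comparable.
template <typename T>
std::size_t unique_in_place_tiled(std::vector<T>& data, std::size_t tile_size)
{
  assert(tile_size > 0);
  const std::size_t n         = data.size();
  const std::size_t num_tiles = (n + tile_size - 1) / tile_size;

  // Phase 1: snapshot the element preceding each tile before any write.
  // Tile 0 has no predecessor, so its slot is unused.
  std::vector<T> pre_tile(num_tiles);
  for (std::size_t t = 1; t < num_tiles; ++t)
  {
    pre_tile[t] = data[t * tile_size - 1];
  }

  // Phase 2: compact each tile, comparing its first item against the
  // cached predecessor instead of re-reading it from `data`.
  std::size_t write = 0;
  for (std::size_t t = 0; t < num_tiles; ++t)
  {
    const std::size_t begin = t * tile_size;
    const std::size_t end   = std::min(begin + tile_size, n);

    T prev{}; // previous input item within the current tile
    for (std::size_t i = begin; i < end; ++i)
    {
      const T value = data[i];

      bool is_head = true; // the very first item is always kept
      if (i == begin && t > 0)
      {
        is_head = !(value == pre_tile[t]);
      }
      else if (i > begin)
      {
        is_head = !(value == prev);
      }

      prev = value;
      if (is_head)
      {
        data[write++] = value;
      }
    }
  }

  return write;
}
```

For the input `{1, 1, 2, 2, 2, 3, 1, 1, 4}` with `tile_size = 4`, the second tile starts with a `2` whose duplicate sits in the first tile; the cached predecessor is what lets it be dropped, and the function returns 5 with `1, 2, 3, 1, 4` at the front.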

@alliepiper alliepiper added type: enhancement New feature or request. P1: should have Necessary, but not critical. area: docs Related to documentation. area: tests Related to tests / test infrastructure. labels Jun 3, 2022
@alliepiper alliepiper added this to the 2.0.0 milestone Jun 3, 2022
@alliepiper
Collaborator

LGTM -- run this through gpuCI to validate the new tests before merging.

gevtushenko added a commit to gevtushenko/thrust that referenced this pull request Jun 4, 2022
@gevtushenko gevtushenko added testing: gpuCI in progress Started gpuCI testing. testing: gpuCI passed Passed gpuCI testing. and removed testing: gpuCI in progress Started gpuCI testing. labels Jun 4, 2022
@gevtushenko gevtushenko force-pushed the enh-main/github/in_place_select branch from 10ebe39 to 7896eb4 Compare June 4, 2022 16:49
@gevtushenko gevtushenko merged commit 92b501a into NVIDIA:main Jun 4, 2022
Labels
area: docs Related to documentation. area: tests Related to tests / test infrastructure. P1: should have Necessary, but not critical. testing: gpuCI passed Passed gpuCI testing. type: enhancement New feature or request.