Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace direct cudaMemcpyAsync calls with utility functions (within /src) #17550

Merged
merged 27 commits into from
Dec 16, 2024

Conversation

vuule
Copy link
Contributor

@vuule vuule commented Dec 9, 2024

Description

Replaced the calls to cudaMemcpyAsync with the new cuda_memcpy/cuda_memcpy_async utility, which optionally avoids using the copy engine.

Also took the opportunity to use cudf::detail::host_vector and its factories to enable wider pinned memory use.

Remaining instances are either not viable (e.g. copying h_needs_fallback, interop) or D2D copies.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@vuule vuule added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Dec 9, 2024
@vuule vuule self-assigned this Dec 9, 2024
Copy link

copy-pr-bot bot commented Dec 9, 2024

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Dec 9, 2024
@@ -78,7 +78,7 @@ class device_scalar : public rmm::device_scalar<T> {
[[nodiscard]] T value(rmm::cuda_stream_view stream) const
{
cuda_memcpy<T>(bounce_buffer, device_span<T const>{this->data(), 1}, stream);
return bounce_buffer[0];
return std::move(bounce_buffer[0]);
Copy link
Contributor Author

@vuule vuule Dec 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just in case we want to make a device_scalar of a non-copyable type.

@vuule vuule changed the title Replace direct cudaMemcpyAsync calls with utility functions (limited to /src) Replace direct cudaMemcpyAsync calls with utility functions (within /src) Dec 10, 2024
@vuule vuule marked this pull request as ready for review December 10, 2024 19:21
@vuule vuule requested a review from a team as a code owner December 10, 2024 19:21
cpp/src/bitmask/is_element_valid.cpp Outdated Show resolved Hide resolved
cpp/src/column/column_device_view.cu Outdated Show resolved Hide resolved
cpp/src/scalar/scalar.cpp Show resolved Hide resolved
vuule and others added 6 commits December 10, 2024 14:33
Co-authored-by: David Wendt <45795991+davidwendt@users.noreply.github.com>
Co-authored-by: David Wendt <45795991+davidwendt@users.noreply.github.com>
@vuule vuule requested a review from davidwendt December 10, 2024 22:45
Copy link
Contributor

@davidwendt davidwendt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@vuule vuule requested a review from ttnghia December 14, 2024 00:12
@vuule vuule added the 5 - Ready to Merge Testing and reviews complete, ready to merge label Dec 14, 2024
@vuule
Copy link
Contributor Author

vuule commented Dec 16, 2024

/merge

@rapids-bot rapids-bot bot merged commit a5ac4bf into rapidsai:branch-25.02 Dec 16, 2024
106 checks passed
@vuule vuule deleted the avoid-cudamemcpy-rest branch December 16, 2024 20:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants