Remove eager synchronization with HtoD copies. #2625

Merged: 1 commit, Jan 17, 2025
16 changes: 4 additions & 12 deletions src/array.jl
@@ -527,12 +527,9 @@ Base.copyto!(dest::DenseCuArray{T}, src::DenseCuArray{T}) where {T} =
function Base.unsafe_copyto!(dest::DenseCuArray{T}, doffs,
src::Array{T}, soffs, n) where T
context!(context(dest)) do
- # operations on unpinned memory cannot be executed asynchronously, and synchronize
- # without yielding back to the Julia scheduler. prevent that by eagerly synchronizing.
- if use_nonblocking_synchronization
- is_pinned(pointer(src)) || synchronize()
- end
-
+ # the copy below may block in `libcuda`, so it'd be good to perform a nonblocking
+ # synchronization here, but the exact cases are hard to know and detect (e.g., unpinned
+ # memory normally blocks, but not for all sizes, and not on all memory architectures).
Review comment (Contributor) on lines +530 to +532, suggested change:
- # the copy below may block in `libcuda`, so it'd be good to perform a nonblocking
- # synchronization here, but the exact cases are hard to know and detect (e.g., unpinned
- # memory normally blocks, but not for all sizes, and not on all memory architectures).
+ # the copy below may block in `libcuda`, so it'd be good to perform a nonblocking
+ # synchronization here, but the exact cases are hard to know and detect (e.g., unpinned
+ # memory normally blocks, but not for all sizes, and not on all memory architectures).

GC.@preserve src dest begin
unsafe_copyto!(pointer(dest, doffs), pointer(src, soffs), n; async=true)
if Base.isbitsunion(T)
@@ -546,12 +543,7 @@ end
function Base.unsafe_copyto!(dest::Array{T}, doffs,
src::DenseCuArray{T}, soffs, n) where T
context!(context(src)) do
- # operations on unpinned memory cannot be executed asynchronously, and synchronize
- # without yielding back to the Julia scheduler. prevent that by eagerly synchronizing.
- if use_nonblocking_synchronization
- is_pinned(pointer(dest)) || synchronize()
- end
-
+ # the copy below may block in `libcuda`; see the note above.
Review comment (Contributor), suggested change:
- # the copy below may block in `libcuda`; see the note above.
+ # the copy below may block in `libcuda`; see the note above.

GC.@preserve src dest begin
# semantically, it is not safe for this operation to execute asynchronously, because
# the Array may be collected before the copy starts executing. However, when using