[MIGraphX EP] Fix CopyTensorAsync and add guards for stream sync Copy… #13

TedThemistokleous

…Tensors (microsoft#16787)

Add compile guards that gate the following changes on MIGRAPHX_STREAM_SYNC:

  • Remove the excess hipStreamSynchronize on the null stream from CopyTensor calls
  • Add a proper stream-synchronized CopyTensorAsync call for the DeviceToHost case

Without this change, subsequent CopyTensorAsync() calls fail on cards that don't use pinned memory, which causes hipMemcpy() calls to be issued before certain kernel operations have run.

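![image](https://github.com/microsoft/onnxruntime/assets/107195283/4915c18a-fb2d-40c9-a50e-a7c6613c324b)

becomes

![image](https://github.com/microsoft/onnxruntime/assets/107195283/f661acf4-e2af-4c9a-b26a-30fca339cf1d)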
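
The ordering problem described above can be illustrated with a small host-only model. This is a sketch under stated assumptions, not the code in this PR: `FakeStream`, `CopyDeviceToHost`, and `kStreamSync` are hypothetical names, no HIP calls are made, and the "immediate copy" branch is a simplification of a copy that is not ordered against the execution stream. Only the `MIGRAPHX_STREAM_SYNC` macro name comes from the description.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Mirror the build flag as a constant so both paths stay compilable and testable.
#ifdef MIGRAPHX_STREAM_SYNC
constexpr bool kStreamSync = true;
#else
constexpr bool kStreamSync = false;
#endif

// Hypothetical stand-in for a GPU stream: queued work runs in FIFO order,
// and only when the stream is synchronized.
class FakeStream {
 public:
  void Enqueue(std::function<void()> work) { pending_.push(std::move(work)); }

  void Synchronize() {
    while (!pending_.empty()) {
      pending_.front()();
      pending_.pop();
    }
  }

 private:
  std::queue<std::function<void()>> pending_;
};

// Hypothetical device-to-host copy. With stream_sync the copy is queued behind
// pending kernels and the caller waits on that same stream. Without it the copy
// runs immediately and can read data the queued kernels have not written yet.
void CopyDeviceToHost(FakeStream& stream, const std::vector<float>& device,
                      std::vector<float>& host, bool stream_sync = kStreamSync) {
  assert(device.size() == host.size());
  if (stream_sync) {
    stream.Enqueue([&device, &host] { host = device; });
    stream.Synchronize();
  } else {
    host = device;  // not ordered against work still queued on the stream
  }
}
```

In this model, enqueueing a "kernel" that writes to `device` and then calling `CopyDeviceToHost` with `stream_sync` set to `true` returns the kernel's output, while `false` returns the stale contents because the copy ran ahead of the queued kernel.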



@TedThemistokleous added the bug (Something isn't working) label Jul 25, 2023
@TedThemistokleous self-assigned this Jul 25, 2023
@TedThemistokleous changed the base branch from main to rocm5.6_internal_testing July 25, 2023 16:31
…Tensors (microsoft#16787)

Add compile guards that gate the following changes on
MIGRAPHX_STREAM_SYNC:

- Remove the excess hipStreamSynchronize on the null stream from
CopyTensor calls
- Add a proper stream-synchronized CopyTensorAsync call for the
DeviceToHost case

Without this change, subsequent CopyTensorAsync() calls fail on cards
that don't use pinned memory, which causes hipMemcpy() calls to be
issued before certain kernel operations have run.

![image](https://github.com/microsoft/onnxruntime/assets/107195283/4915c18a-fb2d-40c9-a50e-a7c6613c324b)

becomes

![image](https://github.com/microsoft/onnxruntime/assets/107195283/f661acf4-e2af-4c9a-b26a-30fca339cf1d)

---------

Co-authored-by: Ted Themistokleous <tthemist@amd.com>
@TedThemistokleous force-pushed the migx_gpu_async_copy_fix_cherry_pick branch from e8576fd to d2c309a July 25, 2023 19:17
@jeffdaily merged commit 85ce0c1 into ROCm:rocm5.6_internal_testing Jul 25, 2023
@TedThemistokleous deleted the migx_gpu_async_copy_fix_cherry_pick branch July 26, 2023 16:28