Add experimental support of cuQuantum #1400
If `cuStateVec_enable=True` is configured in `AerSimulator.run()`, `parallel_state_update_` is not set. This will cause a performance regression if an application accidentally sets `cuStateVec_enable` with `device='CPU'`.
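
One way to avoid that regression is to honor `cuStateVec_enable` only when the device is actually a GPU. The following is a minimal Python sketch of such a guard, not the code in this PR: the helper name `effective_custatevec_enable` and the plain-dict representation of the run options are assumptions made for illustration.

```python
import warnings


def effective_custatevec_enable(options: dict) -> bool:
    """Hypothetical helper: decide whether cuStateVec should really be used.

    `options` is assumed to be a plain dict of run options such as
    {"device": "CPU", "cuStateVec_enable": True}.
    """
    requested = bool(options.get("cuStateVec_enable", False))
    device = str(options.get("device", "CPU")).upper()

    if requested and device != "GPU":
        # Ignore the flag on CPU so the usual CPU code path
        # (including its parallel state update) stays in effect.
        warnings.warn(
            "cuStateVec_enable=True is ignored because device is not 'GPU'."
        )
        return False
    return requested
```

With this guard, `{"device": "CPU", "cuStateVec_enable": True}` resolves to `False`, so an accidental setting on CPU falls back to the normal path instead of silently changing parallelization behavior.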
Question: when `enable_batch_multi_shots_=true`, would you create `nShots` copies of the statevector for parallelization? If so, and IIUC, I think a proper "workaround" is to create multiple cuStateVec handles (or just retain and reuse a pool of handles created at init time to reduce overhead) and use them in parallel.

IMHO though it's beyond a "workaround": even after we fix the thread safety issue, generally speaking it is still challenging for library handles to be shared by multiple host threads. For example, although cuBLAS supports this usage pattern, they explicitly recommend not doing so. Thus the handle pool approach is commonly seen in ML/DL frameworks.
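
To make the handle-pool idea concrete, here is a minimal Python sketch of the pattern. It is only an illustration under stated assumptions: `FakeHandle`, `HandlePool`, and `run_shots` are hypothetical names, `FakeHandle` merely stands in for a real cuStateVec handle, and no cuStateVec API is called. The point is that handles are created once up front and each worker thread borrows one exclusively, so no handle is ever shared by two host threads at the same time.

```python
import queue
import threading
from contextlib import contextmanager


class FakeHandle:
    """Hypothetical stand-in for a library handle (e.g. a cuStateVec handle)."""

    def __init__(self, ident):
        self.ident = ident
        self.uses = 0


class HandlePool:
    """Fixed-size pool: handles are created once and reused."""

    def __init__(self, size):
        self.handles = [FakeHandle(i) for i in range(size)]
        self._free = queue.Queue()
        for handle in self.handles:
            self._free.put(handle)

    @contextmanager
    def borrow(self):
        handle = self._free.get()  # blocks until some handle is free
        try:
            yield handle
        finally:
            self._free.put(handle)  # always return the handle to the pool


def run_shots(pool, n_shots, n_threads):
    """Run n_shots independent 'shots', each using one borrowed handle."""
    results = [None] * n_shots

    def worker(shot_ids):
        for shot in shot_ids:
            with pool.borrow() as handle:
                # Exclusive access: no other thread holds this handle now.
                handle.uses += 1
                results[shot] = handle.ident

    threads = [
        threading.Thread(target=worker, args=(range(t, n_shots, n_threads),))
        for t in range(n_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

The pool size bounds how many shots run concurrently, independent of the number of threads, which is one reason this pattern is convenient when each handle also owns scarce resources such as device workspace.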
`enable_batch_multi_shots_=true` is currently not applicable to cuStateVec: in that mode, multiple state vectors are calculated in a single CUDA kernel and each state vector refers to classical registers to handle branch operations, which is not implemented in cuStateVec.

Multiple cuStateVec handles are required when `enable_batch_multi_shots_=false` and shot-level parallelization is required. In this case, state vectors are calculated independently using OpenMP threads. (Currently cuStateVec is not thread safe, so we disable OpenMP parallelization.)
Thanks for the explanation, @doichanj. I understand better now. So once we fix thread safety, we can unblock you for shot-level parallelization.