[NDTensorscuTENSORExt] Temporary fix for block sparse contractions #1485
I think using a single function with if-statements would be better here. This is a temporary workaround, so there isn't a reason to get too fancy with traits, and the offset is runtime information anyway, so it is more natural to deal with it at runtime. So for example:

    arrayR = array(R)
    if !iszero(array(R).offset)
      arrayR = copy(arrayR)
    end
    cuR = CuTensor(arrayR, collect(labelsR))
    # ...
    if !iszero(array(R).offset)
      array(R) .= cuR.data
    end

Then we can remove the if-statements once JuliaGPU/CUDA.jl#2407 is fixed.
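The copy-in/copy-out pattern suggested above can be tried without a GPU. The following is only a minimal CPU sketch under stated assumptions: `view_offset`, `kernel!`, and `contract_block!` are hypothetical names invented for illustration, and a plain `SubArray` stands in for a GPU array with a nonzero `offset`. It is not the NDTensors or cuTENSOR implementation.

```julia
# Hypothetical helper (not part of NDTensors or CUDA.jl): element offset of a
# contiguous 1-D view into its parent array. Plain arrays have offset zero.
view_offset(v::SubArray) = first(parentindices(v)[1]) - 1
view_offset(::Array) = 0

# Hypothetical stand-in for a library call that is assumed to mishandle
# arrays with a nonzero offset. Here it is just an elementwise product.
function kernel!(dest, a, b)
    dest .= a .* b
    return dest
end

# Runtime check, as suggested in the comment above: copy only when the
# destination has a nonzero offset, then write the result back into the view.
function contract_block!(R, a, b)
    has_offset = !iszero(view_offset(R))
    arrayR = has_offset ? copy(R) : R  # copy(::SubArray) returns a plain Array
    kernel!(arrayR, a, b)
    if has_offset
        R .= arrayR                    # write the result back into the view
    end
    return R
end

storage = zeros(6)
R = view(storage, 4:6)                 # a block living at offset 3
contract_block!(R, [1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

Because the check happens at runtime, deleting the two `has_offset` branches is all that would be needed once the underlying issue is resolved.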
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #1485       +/-   ##
===========================================
+ Coverage   43.65%   77.42%   +33.77%
===========================================
  Files         136      140        +4
  Lines        8806     9126      +320
===========================================
+ Hits         3844     7066     +3222
+ Misses       4962     2060     -2902
Looks good, thanks. Is this ready to merge once tests pass?
@mtfishman I timed the 2D Heisenberg system from the ITensorMPS examples and see comparable timings for both CUDA and cuTENSOR.
Yes, ready to merge when tests pass!
Glad to see the timings are comparable. I'll be curious to see what the timings are once we can remove the calls to Have you tried profiling or timing (https://cuda.juliagpu.org/stable/development/profiling) to get an estimate of how much time the calls to
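One rough way to get such an estimate is sketched below. It assumes the calls in question are the extra copies introduced by the workaround, and it runs on the CPU with `time_ns` from Base; `work!` and `time_workaround` are hypothetical names for illustration. On a GPU, wall-clock timing needs synchronization, so the macros described on the linked CUDA.jl profiling page (for example `CUDA.@elapsed` or `CUDA.@profile`) would be the appropriate tools instead.

```julia
# Hypothetical stand-in for the real contraction work.
work!(dest, a) = (dest .= 2 .* a; dest)

# Accumulate the time spent in the copy-in and copy-out steps separately
# from the total time of each iteration.
function time_workaround(n; reps = 100)
    storage = zeros(2n)
    R = view(storage, (n + 1):(2n))   # a block with a nonzero offset
    a = rand(n)
    t_copy = 0.0
    t_total = 0.0
    for _ in 1:reps
        t0 = time_ns()
        tmp = copy(R)                 # copy-in
        t1 = time_ns()
        work!(tmp, a)                 # the actual work
        t2 = time_ns()
        R .= tmp                      # copy-out
        t3 = time_ns()
        t_copy += ((t1 - t0) + (t3 - t2)) / 1e9
        t_total += (t3 - t0) / 1e9
    end
    return (copy_seconds = t_copy, total_seconds = t_total)
end

result = time_workaround(10_000)
println("copies: ", result.copy_seconds, " s of ", result.total_seconds, " s total")
```

The ratio `copy_seconds / total_seconds` gives a first impression of the overhead; the numbers depend heavily on block size and hardware, so they say little about GPU behavior.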
No, I haven't done any in-depth timings of the function yet, just a quick DMRG calculation to see whether it was significantly affecting performance in a known case where the GPU improves on the CPU.
Description
Quick fix for the cuTENSOR problem: copy each viewed block before the contraction, then write the result back into the original block.