-
Notifications
You must be signed in to change notification settings - Fork 6
Conversation
Codecov Report
@@ Coverage Diff @@
## master #30 +/- ##
=========================================
+ Coverage 80.68% 81.1% +0.41%
=========================================
Files 8 8
Lines 497 508 +11
=========================================
+ Hits 401 412 +11
Misses 96 96
Continue to review full report at Codecov.
|
Yes, I can check if this works better. I'm actually trying out my own version as well! |
Ready to merge? |
Benchmark Report for YaoArrayRegisterJob Properties
ResultsA ratio greater than
Benchmark Group ListHere's a list of all the benchmark groups executed by this job:
Julia versioninfoTarget
Baseline
|
@GiggleLiu it seems that there are some regressions, do you have time to investigate them? |
src/instruct.jl
Outdated
end | ||
operator = sort_unitary(operator, locs) | ||
locs_raw, ic = _prepare_instruct(state, operator, locs, control_locs, control_bits) | ||
return _instruct!(state, autostatic(operator), locs_raw, ic) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to transpose the operator and state for general case in order to make memory contiguous for BLAS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
state is transposed, transposing the operator is not nessesary. After all, we don't have BLAS...
it seems there are some performance regression against ["specialized", "single qubit multi control", "(\"T\", 24, 8)"] can you also run the benchmark on your laptop? some of the regression might just be noise. |
batched register benchmarks are not defined yet, I'll do that later, meanwhile you can investigate the regression on single batch. |
On Delta 102, the 1.5 time regresion looks not true # master branch
julia> results = run(SUITE["matrices"]["in-contiguous"]["ordered"][(1, "Complex{Float64}", "PermMatrix{Complex{Float64},Int64,Array{Complex{Float64},1},Array{Int64,1}}")])
BenchmarkTools.Trial:
memory estimate: 288 bytes
allocs estimate: 4
--------------
minimum time: 1.310 μs (0.00% GC)
median time: 1.740 μs (0.00% GC)
mean time: 1.622 μs (0.00% GC)
maximum time: 17.520 μs (0.00% GC)
--------------
samples: 10000
evals/sample: 1
# this branch
julia> results = run(SUITE["matrices"]["in-contiguous"]["ordered"][(1, "Complex{Float64}", "PermMatrix{Complex{Float64},Int64,Array{Complex{Float64},1},Array{Int64,1}}")])
BenchmarkTools.Trial:
memory estimate: 288 bytes
allocs estimate: 4
--------------
minimum time: 1.135 μs (0.00% GC)
median time: 1.188 μs (0.00% GC)
mean time: 13.748 μs (0.00% GC)
maximum time: 125.430 ms (0.00% GC)
--------------
samples: 10000
evals/sample: 1 |
else | ||
raw = zeros(T, nbatch, 1<<total) | ||
raw[:,Int(bit_config)+1] .= 1 | ||
raw = transpose(raw) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should copy here? since transpose
is lazy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and state
will always return a transpose
ed version so state(r) * mat
still works in YaoBlocks? and add a new function storage
to return the raw storage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need Tranpose wrapper here to create a effective row major array.
This PR currently breaks Test Summary: | Pass Error Total
test composite block | 258 7 265
test chain | 62 62
test kron | 44 44
test control | 28 28
test put | 26 7 33
construct | 1 1
apply! | 7 5 12
test swap gate | 17 17
rotation gate | 1 2 3
test repeated | 7 7
test concentrate | 11 11
test tag | 59 59
test pauli string | 15 15
test single block chsubblocks | 4 4
ERROR: LoadError: Some tests did not pass: 258 passed, 0 failed, 7 errored, 0 broken.
in expression starting at /Users/roger/.julia/dev/YaoBlocks/test/runtests.jl:7 @GiggleLiu do you have time to fix it? |
reverted @Roger-luo |
Tranposed storage are more batch friendly, with better performance.
All changes are included in this commit: 86d74bd and 17ec5df
@Roger-luo Could you please do benchmark on this branch, and see how it work?