Refactor API for unconventionally-indexed arrays #17137
Conversation
```julia
_similar(::IndicesBehavior, a::AbstractArray, T::Type) = similar(a, T, indices(a))

to_shape(::Tuple{}) = ()
to_shape(dims::Dims) = dims
to_shape(dims::DimsOrInds) = map(to_shape, dims)
```
Another idea (I don't know how convenient or cumbersome it'd be) would be to have a type `Shape` or `Shp`. This way these could be changed to `convert` methods, so one could write `Shape(dims)` and `convert(Shape, dims)`.
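For concreteness, a hedged sketch of what that suggestion might look like; this is entirely hypothetical (the design was not adopted), and `Shape` here is just the name floated above, written in the 0.5-era syntax used elsewhere in this thread:

```julia
# Hypothetical sketch of the suggested Shape wrapper; not actual Base code.
immutable Shape
    dims::Tuple   # a mix of Ints and AbstractUnitRanges
end
Shape(dims...) = Shape(dims)                            # Shape(3, 4) or Shape(0:2, 1:4)
Base.convert(::Type{Shape}, dims::Tuple) = Shape(dims)

# similar-style methods could then take a single Shape argument instead of
# normalizing tuples through a to_shape call.
```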
`to_shape` is basically a compatibility call, packaging tuples of `AbstractUnitRange` objects as `Dims` tuples when possible (i.e., when the range can be guaranteed, via the type system, to start at 1). In that sense we already have the type (alias).
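To make that packaging concrete, here is a hedged sketch of per-element methods that would let the `map` recursion in the diff above bottom out; these particular definitions are an illustration based on the description, not lines copied from the PR:

```julia
# Illustrative per-element methods (assumed, not quoted from the diff),
# complementing the tuple-level to_shape methods shown above.
to_shape(i::Int) = i
to_shape(r::Base.OneTo) = Int(last(r))   # type guarantees a 1-based range: keep only the length
to_shape(r::AbstractUnitRange) = r       # other ranges may not start at 1, so keep them whole

# Combined with the tuple methods above:
#   to_shape((Base.OneTo(3), Base.OneTo(4)))  yields (3, 4), a Dims tuple
#   to_shape((0:2, Base.OneTo(4)))            yields (0:2, 4), preserving the offset
```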
Force-pushed from 454e43f to 3f7f844.

I seem to have messed up the nanosoldier call, so let's try again after adding quotes: @nanosoldier

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels
Force-pushed from 3f7f844 to e1ef74a.
I've canceled the CI here; while the runtime looks good, 1b684d5 ("eliminate the …") hurts compile time.

Master:

```julia
julia> using Base.Test

shell> cd test/linalg/
/home/tim/src/julia-0.5b/test/linalg

julia> @time include("schur.jl")
 17.479364 seconds (13.13 M allocations: 516.591 MB, 1.47% gc time)

julia> @time include("schur.jl")
  1.011176 seconds (445.60 k allocations: 23.307 MB, 1.33% gc time)
```

This PR:

```julia
julia> using Base.Test

shell> cd test/linalg/
/home/tim/src/julia-0.5/test/linalg

julia> @time include("schur.jl")
 44.836752 seconds (72.31 M allocations: 3.040 GB, 2.34% gc time)

julia> @time include("schur.jl")
  0.902519 seconds (447.55 k allocations: 23.480 MB, 1.02% gc time)
```

I'll see what I can figure out. Meanwhile, it's a nice test case for folks who like to look at compiler performance 😉.
Interesting observation: if I comment out these loops and just manually set the values, then master and this PR have similar compile times. But this change:

```diff
diff --git a/test/linalg/schur.jl b/test/linalg/schur.jl
index 5370afd..02a07dc 100644
--- a/test/linalg/schur.jl
+++ b/test/linalg/schur.jl
@@ -16,11 +16,17 @@ srand(1234321)
 areal = randn(n,n)/2
 aimg  = randn(n,n)/2

-for eltya in (Float32, Float64, Complex64, Complex128, Int)
+# for eltya in (Float32, Float64, Complex64, Complex128, Int)
+eltya = Int
     a = eltya == Int ? rand(1:7, n, n) : convert(Matrix{eltya}, eltya <: Complex ? complex(areal, aimg) : areal)
     asym = a'+a                  # symmetric indefinite
     apd  = a'*a                  # symmetric positive-definite

-    for atype in ("Array", "SubArray")
+    indx = 1
+    atypes = ["Array", "SubArray"]
+    while indx <= length(atypes)
+#    for atype in ("Array", "SubArray")
+        atype = atypes[indx]
+        indx += 1
         if atype == "Array"
             a = a
         else
@@ -96,4 +102,4 @@ for eltya in (Float32, Float64, Complex64, Complex128, Int)
     @test NS[:S] ≈ sS
     @test NS[:Z] ≈ sZ
 end
-end
+# end
```

does not fix the performance, so this seems different from #16122.
Here's a minimal reproducer:

```julia
using Base.Test

a = rand(1:7, 10, 10)
for atype in ("Array", "SubArray")
    d,v = eig(a)
end
```

Master:

```julia
julia> @time include("/tmp/tim/schur.jl")
  1.740690 seconds (1.93 M allocations: 84.396 MB, 2.91% gc time)
```

This PR:

```julia
julia> @time include("/tmp/tim/schur.jl")
 13.267818 seconds (46.09 M allocations: 1.985 GB, 5.30% gc time)
```

But now comment out the …
Wow, interesting. What happens if you use an array instead of a tuple: …
In the Chinese-curse sense, yes. 😄
Still slow. I've debugged this a little further; one relevant point is that I can call … before … . This instrumentation of inference:

```diff
+type InfRef
+    val::Bool
+end
+const debug = InfRef(false)
 function typeinf_ext(linfo::LambdaInfo)
     if isdefined(linfo, :def)
+        local tstart
+        if debug.val
+            print(linfo.def.name, " start: ", ccall(:jl_clock_now, Float64, ()))
+        end
         # method lambda - infer this specialization via the method cache
         (code, _t, inferred) = typeinf_edge(linfo.def, linfo.specTypes, linfo.sparam_vals, true, true, true, linfo)
         if inferred && code.inferred && linfo !== code
@@ -1567,13 +1576,22 @@ function typeinf_ext(linfo::LambdaInfo)
             linfo.inferred = true
             linfo.inInference = false
         end
+        if debug.val
+            print(linfo.def.name, " stop: ", ccall(:jl_clock_now, Float64, ()))
+        end
         return code
     else
         # toplevel lambda - infer directly
+        if debug.val
+            println("toplevel inference")
+        end
         linfo.inInference = true
         frame = InferenceState(linfo, true, true)
         typeinf_loop(frame)
         @assert frame.inferred # TODO: deal with this better
+        if debug.val
+            println("toplevel inference done")
+        end
         return linfo
     end
 end
```

gives this output (having already called …): … and all the waiting is before the "toplevel inference done".
宁為太平犬，莫做亂离人 ("Better to be a dog in times of peace than a human in times of chaos.") Though the actual curse is English.
Inference seems to be encountering a large number of different signatures for …
Surprising, because there are exactly 6 definitions in this PR, fewer than the 8 there are in current Base. But of course they're called differently. The example you showed suggests something is calling it with 9 indices... given that we're talking about matrix algebra (2 dimensions), what the heck is that from?
It's from recursion in the inference process itself.
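For intuition, a hedged toy example (the function `f` below is hypothetical, not from this PR) of how recursion can multiply inferred signatures even when only a couple of methods exist in the source:

```julia
# Two method definitions, but one inferred signature per call arity.
f(x::Int) = x
f(x::Int, rest...) = x + f(rest...)   # each recursive call drops one argument

# Inferring this call visits f with 9 arguments, then 8, then 7, ... then 1,
# so a small method count still yields many distinct signatures.
f(1, 2, 3, 4, 5, 6, 7, 8, 9)
```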
Force-pushed from e1ef74a to 9a8941e.

@nanosoldier
Force-pushed from 9a8941e to dfb8cfb.

@jrevels, OK to start nanosoldier now? (It's this PR I'm most interested in right now.)

The daily build ran successfully, so I think it's good now: @nanosoldier
Force-pushed from dfb8cfb to 88359dc.

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels
Force-pushed from 88359dc to 2dcef54.

@nanosoldier

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels
- Better to have this logic in the main functions rather than in specific packages.
- The reason to move earlier is to test whether the methods corrupt other operations.
- Splatting forces dynamic method lookup in places where that's a major cost (see the sketch after this list).
- This leads to noticeable performance improvements for several benchmarks.
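As a hedged illustration of that splatting cost (toy code with hypothetical names, not from this PR): when the element types or length of the splatted container are not known to the compiler, every call must go through dynamic method lookup, whereas a fixed-arity call dispatches statically.

```julia
# Toy comparison of splatted vs. fixed-arity indexing.
getidx_splat(A, I) = A[I...]        # arity and types depend on I
getidx_fixed(A, i, j) = A[i, j]     # arity and types known at compile time

A = rand(100, 100)
getidx_splat(A, (3, 7))     # Tuple{Int,Int} is concrete: the compiler can specialize
getidx_splat(A, Any[3, 7])  # untyped container: dynamic lookup on every call
getidx_fixed(A, 3, 7)       # always static dispatch
```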
I have a good feeling (based on local benchmarking) about this one: @nanosoldier
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels
Fewer exports, cleaner API, fewer lines of code, and better performance: what's not to like? Advance warning: I'll be merging as soon as CI finishes; I don't have any reason to suspect that the one test failure (so far) has anything to do with this PR.
It looks like this broke several packages with a …
Looks like the issue in DataArrays stems from it using …
If they need a quick fix, they should be able to just call …
I'm stuck at JuliaStats/DataArrays.jl#205. Help appreciated.
This is effectively "cleanup" from #16260. It gets rid of `shape` and renames `allocate_for` to be `similar`. It achieves these by introducing a new `Range` type, `OneTo(n)`, which is equivalent (in a value sense) to `1:n`. These are changes recommended by @JeffBezanson. I'm hoping to avoid having `OneTo` "leak" out into user space, so it's not exported, but I suspect we may need to export it at some point (let's first see how far we can get without doing that).

Despite the name of this branch, it does not make unconventional indices safer (ref `@arraysafe` in #16973); it seems better to do that in a separate PR.

A key thing here is going to be to find out whether I've done something nasty for performance, so @nanosoldier `runbenchmarks(ALL, vs=:master)`.
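For readers following along, here is a hedged, value-level sketch of the `OneTo` idea, written in the 0.5-era syntax used elsewhere in this thread; the real `Base.OneTo` differs in details (e.g. it clamps negative lengths in an inner constructor):

```julia
# Minimal sketch of the OneTo idea; not the actual Base implementation.
immutable OneTo{T<:Integer} <: AbstractUnitRange{T}
    stop::T
end
Base.first(r::OneTo) = one(r.stop)    # 1 by construction of the type, no runtime check
Base.last(r::OneTo) = r.stop
Base.length(r::OneTo) = Int(r.stop)

# Because first(r) == 1 is a type-level guarantee, code like similar(a, T, inds)
# can treat OneTo(n) exactly like the plain dimension length n, while other
# AbstractUnitRanges keep their (possibly non-1-based) offsets.
```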