Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NTuples made me sad (so I nixed them) #11242

Merged
merged 16 commits into from
May 5, 2016
Merged

NTuples made me sad (so I nixed them) #11242

merged 16 commits into from
May 5, 2016

Conversation

timholy
Copy link
Member

@timholy timholy commented May 12, 2015

This rabbit hole turned out to be a giant dungeon, instead, and I haven't yet found the exit. I think I'm very close, but I've spent far too much time on this and I'm beginning to run out of ideas and energy., but yay, I found the exit.

This is the latest attempt at NTuple{N,T} -> Tuple{Vararg{T,N}}, aka #10911, #10691. Julia builds and passes many all tests, but borks on some (e.g., subarray). The current problem seems to be that a VarState(Tuple{_<:Vararg{T,N}},false) gets injected somewhere (leading ultimately to an attempted Tuple{Vararg, ASCIIString} and Vararg type in non-final position error). It's probably a 3-line fix, but I have not yet succeeded in finding the correct 3 lines (inference.jl is difficult for not-Jeff to hack on).

This still containsThe intermediate commits contain a lot of debugging code that will eventually be was deleted in the later commits. In the end, the way I got this farto work was to write new versions of jl_intersect_type, jl_subtype_le, type_morespecific, and type_match_, which (1) called the old algorithm and stored the result, then (2) translated NTuples on-the-fly and then called the new algorithm with the translated arguments, and (3) printed some output if the two disagreed. This branch is partway through the deletion of all that debugging code; before completing the migration I'd love to figure out where the problem is.

It turns out that we have enough differences in how NTuples and Tuple{Vararg}s behave that some disagreements are unavoidable; one of the best things about this, presumably, is that such differences will go away. I even went to the trouble to catalog the differences; in the "best" column I use "N" to indicate that I think the new algorithm yields the most sensible result, and "O" to indicate the old algorithm. Only two of those four algorithms had differences, of which representative examples are:

jl_intersect_type:
none (as measured by a <: b && b <: a)

jl_subtype_le:

a b ta invariant old result new result best
_<:Tuple Tuple 0 0 0 1 N
_<:Tuple{_<:Vararg{T<:Any, N<:Any}} Tuple{Vararg{#T<:Any, N<:Any}} 0 0 0 1 N?
Tuple{} NTuple{#_<:Any, #T<:Any} 0 1 1 01 Osame
Tuple{Tuple{_<:Tuple{Any, Any}}} Tuple{Tuple{Tuple{Any, Any}}} 0 0 0 1 N?
_<:Tuple{Vararg{Int64, #N<:Any}} Tuple{Vararg{#T<:Any, #_<:Any}} 0 0 0 1 N?
Tuple{Symbol, Integer} Tuple{Vararg{#T<:Any, N<:Any}} 0 1 0 1 ?
Tuple{Char, Char} Tuple{Vararg{#T<:Any, N<:Any}} 0 1 0 1 N
Tuple{Type{Tuple{Int64, Int64}}} Tuple{Type{Tuple{Int64, Vararg{Int64, N<:Any}}}} 0 0 0 1 N?
Tuple{Type{Tuple{Char}}, Tuple{Vararg{Any, N<:Any}}} Tuple{Type{Tuple{Vararg{#T<:Any, N<:Any}}}, Tuple} 0 0 0 1 N
Tuple{Tuple{_<:Tuple{Vararg{Any, N<:Any}}}, Int64} Tuple{Tuple{Tuple{Vararg{Any, N<:Any}}}, Int64} 0 0 0 1 N?
Tuple{Tuple{Symbol, Vararg{Symbol, N<:Any}}, DataType, Type{T<:Any}} Tuple{Vararg{#T<:Any, N<:Any}} 0 1 0 1 N
_<:Tuple{Symbol, Vararg{Symbol, N<:Any}} Tuple 0 0 0 1 N
Tuple{DataType, Type} Tuple{Vararg{#T<:Any, N<:Any}} 0 1 0 1 N?
_<:Tuple{Tuple{Symbol, Vararg{Symbol, N<:Any}}, DataType, Type{T<:Any}} Tuple{Tuple{Vararg{Any, N<:Any}}, DataType, Vararg{Any, N<:Any}} 0 0 0 1 N?
Tuple{Type{Tuple{Type{Base.IteratorsMD.CartesianIndex}, Tuple{Int64, Int64, Int64}}}, UInt64} Tuple{Type{Tuple{Type{Base.IteratorsMD.CartesianIndex}, Tuple{Vararg{Int64, N<:Any}}}}, UInt64} 0 0 0 1 N?

jl_type_morespecific:
none

type_match_:

child parent morespecific invariant old result new result best
NTuple{#N<:Any, Int} NTuple{#N<:Any, Integer} 0 0 0 1 N
Tuple{Ptr{#T<:Any}, NTuple{#N<:Any, Int64}} Tuple{Ptr{#T<:Any}, NTuple{#N<:Any, Integer}} 1 0 0 1 N

Note that in only one case doesthere are no cases where the result in current master seems to make more sense to me than the version here. I think there's some chance that, once working, this will fix some ambiguity and similar problems.

@@ -1583,11 +1582,197 @@ DLLEXPORT void jl_(void *jl_value)
in_jl_--;
}

// Useful because the jl_typeof macro isn't available from debugger
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i found out today that if you type macro define jl_typeof jl_typeof then it works in gdb (it already works in lldb).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice tip, thanks!

@timholy
Copy link
Member Author

timholy commented May 18, 2015

I found the exit. (The key commit was 25689843174fa2.) Whew.

Presuming this passes CI (as it does locally), this is ready for serious review. Like other core changes, this has the potential to be disruptive, so assuming it's going into 0.4 it should not be a last-minute merge.

@tkelman
Copy link
Contributor

tkelman commented May 18, 2015

Oh my. Nice work, impressive dedication. Travis appears to be failing an assertion, did you run with julia-debug locally?

julia-debug: julia.h:1019: jl_vararg_kind: Assertion `(((jl_value_t*)(((jl_taggedvalue_t *) ((char *)((lenv)) - __builtin_offsetof (jl_taggedvalue_t, value)))->type_bits&~(size_t)3))==(jl_value_t*)(jl_tvar_type))' failed.

@mschauer
Copy link
Contributor

Do you think this is feasible for 0.4 in contrast to the "type system design iteration" which is tagged 0.5?

@timholy
Copy link
Member Author

timholy commented May 18, 2015

@tkelman, lots of time with the debug version, but I didn't run the whole test suite that way. Thanks for flagging this, and of course @garrison and @vtjnash for #11155. Should be fixed now, but we'll see what CI says.

@mschauer, there's an important distinction between the two: this is done, and that one isn't 😄. This should be very useful for #10525 (the whole reason I started this, oh-so-long-ago), if the codegen bugs there can be worked out.

@timholy
Copy link
Member Author

timholy commented May 18, 2015

@mschauer, I should also say that this probably makes the "type system design iteration" easier. This gets rid of a whole class of objects that needed special treatment in our type system, and it turns out we weren't always being consistent about how that class was treated.

@mschauer
Copy link
Contributor

@timholy I was asking because I was hoping so. ;-)

@timholy
Copy link
Member Author

timholy commented May 27, 2015

To avoid overwhelming Jeff with requests, I had planned on waiting a while before bumping this. But this probably deserves at least a brief glance before making decisions about what makes it in to 0.4, and it sounds as if final decisions about 0.4 could happen soon.

The main motivation for this is #10525, and if that doesn't get merged then this is not pressing and could wait for Arraymagedon. But it has advantages of its own. Of course, it could also break stuff (and almost inevitably will, despite strenuous efforts to the contrary).

@mbauman
Copy link
Member

mbauman commented Jun 3, 2015

There's a little bit of dispatch wonkiness that I uncovered with #11547: Tuple{} isn't matched by the parametric f{N}(::Type{Tuple{N}}) but it is matched by the explicit ::Type{Tuple{0}}.

julia> f{N}(::Type{NTuple{N}}) = N
f (generic function with 1 method)

julia> f(Tuple{})
ERROR: MethodError: `f` has no method matching f(::Type{Tuple{}})

julia> f(Tuple{Int,})
1

julia> g(::Type{NTuple{0}}) = 0
g (generic function with 1 method)

julia> g(Tuple{})
0

julia> NTuple{0}
Tuple{}

julia> Tuple{} <: NTuple{0}
true

julia> isa(Tuple{}, Type{NTuple{0}})
true

(Also, I resolved a few simple conflicts with master and figured I'd push it on the off-chance that it'd save someone else the work. Or you can totally ignore it and force-push over it. Edit: whoops, looks like my master was a bit behind… there are still newer conflicts. Oh well.)

@JeffBezanson
Copy link
Member

Awesome work. I'll start looking at this seriously, since it would be best to merge it before tackling #11320.

With this, it seems more pressing to consider renaming Vararg. Vararg{T,2} isn't "var" anything; it's exactly 2! The name is already somewhat unloved, and this seems to be the final strike against it. Maybe Rep{T}, for "repeat"? Any other ideas?

@timholy
Copy link
Member Author

timholy commented Jun 3, 2015

Yes, you're right that the name is no longer appropriate. I'd vote for Repeat{T,N} over Rep{T,N}.

@mbauman, thanks for kicking the tires here, and for the start on conflict resolution. I'll see about doing a rebase, just to keep the size down (which might nix, sorry, your conflict resolution changes, since I think you did those as a merge commit).

@@ -1433,8 +1440,12 @@ Too numerous to mention.
[#10994]: https://github.com/JuliaLang/julia/issues/10994
[#11105]: https://github.com/JuliaLang/julia/issues/11105
[#11145]: https://github.com/JuliaLang/julia/issues/11145
<<<<<<< HEAD
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops indeed. Fixed.

@timholy
Copy link
Member Author

timholy commented Jun 4, 2015

The PR was hard to review as a merge with master (because of all the extra changes), so I rebased it. BUT I had to disable two tests, see the final commit (one of these was passing when I first submitted this PR, the other of which is a "new-style" invoke test modified in #10939). I decided to post at this intermediate stage because 0.4 closure is coming soon and if someone wants to review this as-is, addressing the reviewer remarks may lead to the fix. In parallel, I will look into these two tests.

This does not yet change the name from Vararg, as that probably needs more discussion.

@mbauman
Copy link
Member

mbauman commented Jun 4, 2015

Yeah, I wish git/github did better with Merge commits. In some senses they're nice since they don't mess with history, but they're really hard to deal with. I take no exception here :).

@ScottPJones
Copy link
Contributor

👍 to getting this in, 👍 for Repeat instead of Vararg

@timholy
Copy link
Member Author

timholy commented Jun 4, 2015

@mbauman, I fixed your "wonkiness" above. It turns out this was due to the one case I listed above where I thought master made more sense than this PR. Now in all cases of disagreement, this seems more consistent (to me).

@StefanKarpinski
Copy link
Member

Usually, spelling things out – i.e. Repeat – is good, but I'd argue that the more cryptic Rep is better in this case since this is a very specific thing. I'm still not thrilled about that name, but I haven't come up with anything better.

@tkelman
Copy link
Contributor

tkelman commented Jun 4, 2015

I prefer Repeat over Rep, though it could be a bit confusing since repeat is an unrelated function. How about RepType? Too ugly?

@tkelman tkelman merged commit 280310d into master May 5, 2016
@tkelman tkelman deleted the teh/ntuple branch May 5, 2016 16:39
@mbauman
Copy link
Member

mbauman commented May 6, 2016

Method sorting question: is the precedence here intentional? It feels like it might be ambiguous… or maybe even more natural to be reversed:

julia> f{T,N}(::Array{T,N}, ::Vararg{Int, N}) = 1
f (generic function with 1 method)

julia> f(::Array, ::Int) = 2
f (generic function with 2 methods)

julia> f([1,2,3], 1)
2

julia> methods(f)
# 2 methods for generic function "f":
f(::Array{T<:Any,N<:Any}, ::Int64) at REPL[10]:1
f{T,N}(::Array{T,N}, ::Vararg{Int64,N}) at REPL[9]:1

It'd be slightly easier to define the indexing methods were the order here reversed — I'd like to reshape the array to a vector in method 2, but of course that'd hit a stack overflow if it's already a vector with the current arrangement.

@yuyichao
Copy link
Contributor

yuyichao commented May 6, 2016

I think this is a missing ambiguity since both of them can match arguments that the other one cannot.

@timholy
Copy link
Member Author

timholy commented May 7, 2016

While our ambiguity detection isn't perfect now, it should mean "two non-identical signatures, neither of which is consistently more specific than the other over the domain of their intersection." Using that definition, this is not an ambiguous case: the intersection of these signatures is equivalent to Tuple{Vector, Int}, and in that case the Vararg version is uniformly more specific because Vector is more specific than Array.

When passed types that do not fall in this intersection, only one of the two methods applies, and therefore such cases cannot be counted as ambiguity.

Fixed in #16249.

@yuyichao
Copy link
Contributor

yuyichao commented May 7, 2016

What does it mean by more specific than the other over the domain of their intersection if both of them can match (edit: if both of them can match all of the types in this intersection)? In particular why is ::Int not more specific than ::Vararg{Int,N}?

@timholy
Copy link
Member Author

timholy commented May 7, 2016

Tuple{Int} is more specific than Tuple{Vararg{Int, N}}. But Tuple{Vararg{Int,1}} gets expanded to Tuple{Int}, so they must be equivalent in all matters regarding type intersection and specificity.

With that background, consider a set of arguments to which both methods apply (the only condition in which you have to worry about ambiguity and specificity): it requires that N=1 because otherwise the non-varargs method wouldn't match. Since the same N is bound to the array dimension, that also makes the first argument into Vector. So in the end you're comparing Tuple{Array, Int} vs Tuple{Vector, Vararg{Int,1}}, and since the latter is equivalent in every way to Tuple{Vector, Int}, it's more specific.

@yuyichao
Copy link
Contributor

yuyichao commented May 7, 2016

Does it mean that

f{T,N}(::Array{T,N}, ::Int)
f{T,N}(::Array{T,N}, ::Vararg{Int,N})

Should be processed differently?

and therefore

f{T,N}(::Array{T,N}, ::Int)
f(::Array, ::Int)

Are non-identical signatures even though they matches exactly the same set of types?

@yuyichao
Copy link
Contributor

yuyichao commented May 7, 2016

And following that argument should this not be ambiguous?

julia> f{T}(::T, ::Int) = 1
f (generic function with 1 method)

julia> f(::Int, ::Any) = 2
f (generic function with 2 methods)

julia> f(1, 1)
ERROR: MethodError: f(::Int64, ::Int64) is ambiguous. Candidates:
  svec(Tuple{#f,Int64,Int64},svec(),f(::Int64, ::Any) at REPL[2]:1)
  svec(Tuple{#f,Int64,Int64},svec(Int64),f{T}(::T, ::Int64) at REPL[1]:1)
 in eval(::Module, ::Any) at ./boot.jl:230

@yuyichao
Copy link
Contributor

yuyichao commented May 7, 2016

And the following are also along the same line (this is on #16249). Following the argument above, 2 and 3 should return 1 and 4 and 5 should be ambiguous.

julia> type A{T}
       end

julia> f2{T}(::A{T}, ::A{T}) = 1
f2 (generic function with 1 method)

julia> f2(::A, ::A{Int}) = 2
f2 (generic function with 2 methods)

julia> f2(A{Int}(), A{Int}())
2

julia> f3{T}(::T, ::T) = 1
f3 (generic function with 1 method)

julia> f3(::Any, ::Int) = 2
f3 (generic function with 2 methods)

julia> f3(1, 1)
2

julia> f4{T}(::A{T}, ::A{T}) = 1
f4 (generic function with 1 method)

julia> f4{T}(::A{T}, ::A{Int}) = 2
f4 (generic function with 2 methods)

julia> f4(A{Int}(), A{Int}())
2

julia> f5{T}(::T, ::T) = 1
f5 (generic function with 1 method)

julia> f5{T}(::T, ::Int) = 2
f5 (generic function with 2 methods)

julia> f5(1, 1)
2

Although IMHO all of them should be ambiguous (which I personally think is more consistent since the specificity of a signature is then purely determined by the set of types it matches) but I'm not 100% sure which approach is more intuitive/easier to implement.

@timholy
Copy link
Member Author

timholy commented May 7, 2016

#11242 (comment): I'm not quite sure I understand the point you're trying to make in the Vararg case. The Vararg-free example is clear; I think one can argue about whether it should be ambiguous, but is not treated as such currently (there's a specificity bonus for binding the Varargs, even in the absence of "triangular" constraints).

#11242 (comment), given that we give a bonus for binding TypeVars in other places, I agree that should not be ambiguous. That seems like a bug.

#11242 (comment): more great examples.

Personally, I think there's only one conclusion one can draw: jl_type_morespecific needs to use an env parameter. Let's see what @JeffBezanson thinks about that.

@KristofferC
Copy link
Member

Trying this out a bit today and just wanted to say that this stuff is amazing! Thanks for all the work making this get merged @timholy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.