RFC: controlling dispatch with varargs of defined length #10691

timholy · 2015-03-31T13:21:14Z

This implements the syntax x...N to specify a varargs argument of length N. The purpose is to be able to control dispatch to functions like

function getindex{T,N}(A::AbstractArray{T,N}, indexes...N)
...
end

so that this method only gets called when the number of indexes matches the dimensionality of the array. This is motivated by #10525.
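For readers on current Julia, the dispatch behaviour proposed here can be sketched with the `Vararg{T,N}` spelling and `where` clauses, both of which postdate the `x...N` syntax in this PR. The function name `nindexes` and its return strings are hypothetical and only show which method is selected.

```julia
# Hypothetical function: selected only when the index count equals ndims(A).
nindexes(A::AbstractArray{T,N}, I::Vararg{Int,N}) where {T,N} = "exactly N indexes"

# Catch-all for any other number of integer indexes.
nindexes(A::AbstractArray, I::Int...) = "some other number of indexes"

A = zeros(2, 3)
nindexes(A, 1, 2)     # index count matches ndims(A) == 2: first method
nindexes(A, 1)        # only the catch-all applies
nindexes(A, 1, 2, 1)  # likewise
```

The length-constrained method is a strict special case of the catch-all, so it is preferred whenever both apply.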

Incidentally, this adds a number of comments to "core" code, particularly inference.jl.

JeffBezanson · 2015-03-31T14:25:23Z

src/jltypes.c

@@ -471,6 +472,37 @@ static jl_value_t *intersect_tuple(jl_tuple_t *a, jl_tuple_t *b,
        }
        jl_tupleset(tc, ci, ce);
    }
+    // Check for a length-constrained vararg
+    if (bseq) {
+        if (!jl_is_long(bn))


Need { } here.

JeffBezanson · 2015-03-31T14:50:39Z

Excellent PR! This would be cool to have, as it basically generalizes Tuple and NTuple into a single thing. The implementation seems pretty good.

Unfortunately this conflicts to an extreme degree with #10380. There I removed the Vararg type. So far that simplifies a lot of code, since there is less wrapping and unwrapping. However this PR seems to argue for going back on that. Tuple{Vararg{T,N}} is a pretty natural way to represent NTuple{N,T}; the latter could become an alias for the former. It also generalizes to putting Vararg in places other than the last element, in case subtyping was not intractable enough already (we don't need to go there yet).
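The relationship between `NTuple` and `Vararg` described above can be checked directly on current Julia releases. This is a small sketch of today's behaviour, not of the 0.4-era code under review; the variable names are illustrative.

```julia
# NTuple{N,T} and Tuple{Vararg{T,N}} name the same type.
same_type = NTuple{3,Int} === Tuple{Vararg{Int,3}}

# A counted Vararg expands to the plain tuple type.
expanded = Tuple{Vararg{Int,3}} === Tuple{Int,Int,Int}

# An uncounted Vararg matches tuples of any length, including the empty tuple.
any_length = (1, 2, 3, 4) isa Tuple{Vararg{Int}} && () isa Tuple{Vararg{Int}}

# A leading element followed by a counted Vararg is also a plain tuple type.
mixed = Tuple{String,Vararg{Int,2}} === Tuple{String,Int,Int}
```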

mbauman · 2015-03-31T17:43:32Z

Very cool. I've not looked in depth yet, but does this mean that we'd be able to define things like:

getindex{M, N}(A::AbstractArray, I::Integer...M, index::CartesianIndex{N})

which would allow us to generalize cartesian indexing over "inner" regions?

JeffBezanson · 2015-03-31T17:52:30Z

No; I meant to imply that the generalization to allowing ... anywhere is future work, and quite difficult too.

StefanKarpinski · 2015-03-31T17:54:50Z

Least important part, but I'm not entirely sold on the syntax.

JeffBezanson · 2015-03-31T18:20:20Z

I recall a suggestion that we use Tuple{T, Vararg{S}} as the syntax for varargs tuples, and it's not a terrible idea. Definitely removes confusion about what ... means. For methods we'd have f(x, xs::Vararg{T, N}) which is not a bad default choice.

But we have to be careful about the following case. If I have

f{T}(::Type{T}, t::Tuple{T})

I should not be able to write f(Vararg{Int}, (1,2,3,4)) (indeed currently this correctly gives a no method error).
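A sketch of this case in current syntax follows. It assumes a recent Julia release (1.7 or later, to my understanding), where a bare `Vararg{Int}` is no longer a type at all, so it cannot match `::Type{T}`. The helper variable names are hypothetical.

```julia
# Modern spelling of the signature discussed above.
f(::Type{T}, t::Tuple{T}) where {T} = T

f(Int, (1,))  # matches: T == Int and the tuple holds exactly one Int

# A four-element tuple never matches Tuple{T}.
threw_for_int = try
    f(Int, (1, 2, 3, 4))
    false
catch err
    err isa MethodError
end

# Passing Vararg{Int} as the "type" must not make the tuple match either.
threw_for_vararg = try
    f(Vararg{Int}, (1, 2, 3, 4))
    false
catch err
    err isa MethodError
end
```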

StefanKarpinski · 2015-03-31T19:19:44Z

It seems like this should be spelled VarArg to be consistent with Julia capitalization policy – i.e. uppercase the first letter of each word, even when it's abbreviated. There was some other example recently that I couldn't muster the energy to kvetch about. @JeffBezanson, you seem to have a tendency to not capitalize the first letter of things that are multiple words abbreviated, which has led to some inconsistency in naming.

JeffBezanson · 2015-03-31T19:37:32Z

Where? Let's just look at the names in question, rather than maligning my naming habits in general. Everything is open to kvetching, but if one does not even kvetch, not much can be done.

I think this case is debatable. I would argue that "vararg" has become a single programming term. Googling it gives many hits, even where it is spelled "Vararg".

timholy · 2015-03-31T21:23:54Z

@JeffBezanson, sorry about the conflict with your awesome tuple work. I'm open to changes here if it simplifies getting both of these together.

I pushed a small commit addressing your review comments, and also had to prevent parsing matches if there is a space between ... and the length parameter. I was getting a stack overflow on parsing one of the string tests, specifically @sprintf "%f %d %d %f" 1.0 [3 4]... 5.

For me locally this passes all tests, and if that's also true for our CI then I suppose the question turns to "should we merge this now, or later?" So,

- How serious is the proposal to change from x::T...N to x::Vararg{T,N}; in particular, is this something we want to do as part of this PR before merger? To be honest, I am skeptical that I am the best person to introduce a syntax deprecation mechanism into the scheme code, since my scheme skills are rather embryonic.
- (Minor) Do we have a mechanism to automatically trigger a make clean upon merger of specific PRs? I noticed that building failed with just make because of the new parameter in Vararg. By now, I know most people are used to trying make clean && make if just plain make fails (indeed, we tell them to try it), but since I noticed the problem I thought I'd ask.

JeffBezanson · 2015-03-31T21:51:25Z

If we do this, it's probably my branch that needs to change. Ideally we can keep Tuple and Vararg, and move NTuple out of the core. In my redesign, there is only a boolean flag and no room for an extra parameter (was hoping to save a few objects that way, but oh well).

Another possible syntax is T^^N; "R↑N" is sometimes used for regular expressions with repetition counts. I'd like to get the syntax settled so I can work from that on the tupleoverhaul branch; otherwise there will be too many balls in the air.

We should consider how this maps to machine data types. NTuple would clearly use the array ABI. Tuple{Int, NTuple{n,Int}} would be a struct with an Int member and an array member. But I'm not sure what Tuple{Int, Vararg{Int,n}} would be. Any thoughts @Keno @vtjnash ?

Maybe there are some source files that should trigger a make clean; perhaps julia-parser/julia-syntax, or maybe julia.h.

timholy · 2015-03-31T22:05:01Z

> Another possible syntax is T^^N; "R↑N" is sometimes used for regular expressions with repetition counts. I'd like to get the syntax settled so I can work from that on the tupleoverhaul branch; otherwise there will be too many balls in the air.

I'm fine with that; I just went with x...N because it is so close to what we currently do. I don't generally think syntax is my strong point, so odds are good that I'll be happy with whatever you and Stefan hammer out. Once y'all make a decision, we can decide whether that gets implemented here or whether we merge this and slate the syntax for changing in your branch.

> Tuple{Int, NTuple{n,Int}} would be a struct with an Int member and an array member. But I'm not sure what Tuple{Int, Vararg{Int,n}} would be.

I don't (yet) see why it wouldn't be the same. (Is n bound?)

> Maybe there are some source files that should trigger a make clean; perhaps julia-parser/julia-syntax, or maybe julia.h.

Those seem like reasonable choices. I suppose we could even have a file (called makeclean?) that we just touch. But that would rely on the developer noticing, so it's probably better to go with your suggestion.

StefanKarpinski · 2015-03-31T23:17:45Z

@JeffBezanson, sorry, I didn't mean to malign but to raise awareness so that next time you're naming something you may pause and consider if you're doing that (after raising your fist in the air and saying "dammit, Stefan"). It's entirely possible I'm imagining such a tendency as well. I do agree that "varargs" is a borderline case and one could argue that "vararg" is a word at this point. I'm pretty sure that Google ignores capitalization so I doubt that test really indicates much about which form is more popular.

StefanKarpinski · 2015-03-31T23:19:40Z

I'd like to advocate for introducing this functionality without any convenient syntax. If we find ourselves using it a lot, then we can pick a surface syntax that is just a nicer way of writing VarArg{T,N}. I wonder if it's also worth considering lower bounds on the number of varargs, or does that just complicate things unnecessarily? Obviously you can write out arguments and then use them.
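The "write out arguments" workaround for a lower bound can be sketched as follows. The function `atleast2` is hypothetical: the mandatory arguments are listed explicitly and only the remainder is collected as varargs.

```julia
# Hypothetical function that requires at least two Int arguments.
atleast2(a::Int, b::Int, rest::Int...) = 2 + length(rest)

atleast2(1, 2)        # two arguments: rest is empty
atleast2(1, 2, 3, 4)  # four arguments: rest == (3, 4)
# atleast2(1) has no matching method and throws a MethodError.
```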

timholy · 2015-03-31T23:29:00Z

I'll look into that; clearly

julia> f(x::Vararg{Int}) = @show x
f (generic function with 1 method)

julia> f(1,2,3)

signal (11): Segmentation fault
...

merits investigation first. (I'm on my branch, but I'm almost certain this happens on master, too.)

vtjnash · 2015-04-01T00:26:40Z

i don't understand the motivating example for this. can you sketch a more complete example of where this is needed for method dispatch?

timholy · 2015-04-01T01:30:36Z

@vtjnash, the bigger picture on #10525 (one of my favorite PRs in recent memory) is to centralize all the "hard" stuff in indexing, and allow users to write new AbstractArray types and handle just the simplest indexing operations. As one example, julia allows one (and users find it convenient) to index a 3-dimensional array with 1, 2, 3, or 4 indexes (the 4th has to be 1). One can also index them with Ranges, Vectors, boolean arrays, etc. There are many outstanding bugs/complaints about AbstractArray types that don't implement the full range of indexing operations. As a case in point, until I reworked SubArrays for 0.4, they didn't handle indexing with the "wrong" number of indexes; likewise, ArrayViews (a justifiably well-regarded AbstractArray class) doesn't handle many core indexing methods like a[1:3, 2:4]. Stefan recently noticed that even some pretty basic operations are not handled by Arrays (#10618).

The central idea in #10525 is to massage whatever indexes the user provides into a form that makes it easy for the package author. Let's say you have an AbstractArray class for which linear indexing is slow but cartesian indexing is fast (e.g., SubArrays, InterpolationArrays, etc). So indexing a 3d array with 1,2,3, or 4 scalar indexes has to turn into fast variants of

A[i] = A.data[ind2sub(size(A), i)...]
A[i,j] = A.data[i, ind2sub(size(A)[2:3], j)...]
A[i,j,k] = A.data[i,j,k]
A[i,j,k,l] = l == 1 ? A.data[i,j,k] : BoundsError()

More complex constructs are needed for AbstractVector indexes, logical indexes, etc. The idea in #10525 is to handle all the ind2sub stuff in the generic indexing, and make sure that the user only needs to write that 3d case.
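`ind2sub` was later removed from Julia; on current releases the same linear-to-Cartesian conversion can be sketched with `CartesianIndices`. The array contents, size, and index values below are arbitrary illustration values, and this is not the code from #10525.

```julia
data = reshape(collect(1:24), 2, 3, 4)
sz = size(data)

# One index: convert the linear index into three subscripts.
sub = Tuple(CartesianIndices(sz)[5])
one_index = data[sub...]

# Two indexes: keep the first, expand the second over the trailing dimensions.
i, j = 2, 7
rest = Tuple(CartesianIndices(sz[2:3])[j])
two_index = data[i, rest...]
```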

If the user writes a method

getindex{T,N}(A::MyArray{T,N}, indexes::Int...)

then she is only intending to handle the scalar case, and rely on the centralized code to handle the loops needed for AbstractVector indexing. However, she has just signed herself up to manually handle all the "wrong" dimensionality cases as well, because this method will take precedence over any method written for an AbstractArray that takes scalar indexes. The goal of this PR is to make it possible to restrict the method to just N indexes.
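On current Julia the goal described here can be sketched as below: the package author writes only the N-index scalar method and Base's generic fallbacks cover the other cases. `WrappedArray` is a hypothetical type invented for this illustration, and the syntax is the modern `Vararg{Int,N}` form rather than the `x...N` form of this PR.

```julia
# Hypothetical wrapper for which only N-dimensional scalar indexing is hand-written.
struct WrappedArray{T,N,P<:AbstractArray{T,N}} <: AbstractArray{T,N}
    data::P
end

Base.size(A::WrappedArray) = size(A.data)

# Applies only when the number of Int indexes equals the dimensionality N.
Base.getindex(A::WrappedArray{T,N}, I::Vararg{Int,N}) where {T,N} = A.data[I...]

A = WrappedArray(reshape(collect(1:24), 2, 3, 4))
A[2, 3, 4]     # handled directly by the method above
A[5]           # linear index: converted by Base's generic fallbacks
A[1, 2, 3, 1]  # trailing 1: also handled generically
A[1:2, 2, 3]   # range index: Base loops over the scalar method
```

Because the hand-written method no longer matches the "wrong" index counts, it does not shadow the generic code for them.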

JeffBezanson · 2015-04-01T02:06:37Z

It may be time to consider having indexing take a single tuple argument instead of varargs. After the upcoming change tuples will probably be efficient enough.

vtjnash · 2015-04-01T02:10:02Z

@timholy thanks. that's a good summary. i was confused initially and thought you were trying to ensure the user could cause no-method exceptions (based on the tests)

> We should consider how this maps to machine data types. NTuple would clearly use the array ABI. Tuple{Int, NTuple{n,Int}} would be a struct with an Int member and an array member. But I'm not sure what Tuple{Int, Vararg{Int,n}} would be. Any thoughts @Keno @vtjnash ?

the representation of NTuple is currently undefined, but I usually assume it should be the same as a Tuple of the specified number of arguments

> Tuple{Vararg{T,N}} is a pretty natural way to represent NTuple{N,T}; the latter could become an alias for the former

Given the above statement, it seems that one of Vararg{T,N} and NTuple{N,T} should be deprecated. since Vararg is a term from C, I recommend leaving it free to use for C-compatibility (#6661)

> Tuple{Int, NTuple{n,Int}}

This seems like it would be stored as a flat tuple with n+1 Int members in-line in memory

calling convention

There's also the question of what happens when this gets passed to a function. Most efficient would probably be to pass this as a single tuple/vector (same as if the user had passed a single literal Tuple).
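The flat-layout expectation for `Tuple{Int, NTuple{n,Int}}` mentioned above can be observed on current Julia releases. This sketch describes today's implementation only and says nothing about the representation in the 2015 code; `Nested` and `Flat` are illustrative names.

```julia
Nested = Tuple{Int,NTuple{3,Int}}  # one Int followed by a nested 3-tuple
Flat = NTuple{4,Int}               # four Ints side by side

# Both are plain-bits types, stored inline without pointers.
both_inline = isbitstype(Nested) && isbitstype(Flat)

# The nested tuple takes the same space as four Ints in a row.
same_size = sizeof(Nested) == sizeof(Flat) == 4 * sizeof(Int)
```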

timholy · 2015-04-01T14:37:03Z

OK, I figured out why declarations like f(x::Vararg{Any}) were causing segfaults, and just appended a fix to the end of this PR. (Let me know if my varargexpr? function should be done some other way...) No matter what we decide about other syntax, this seems worth having.

I think all that's left is to make a final decision about the syntax (or drop it altogether, as suggested by Stefan).

StefanKarpinski · 2015-04-01T14:40:47Z

To immediately go against my own plea for ignoring syntax at this point: using the kind of notation that's been bandied about in relation to #10380, we could have the syntax {Int^N} for the type of an N-tuple of Ints and {Int^(M:N)} for a tuple of at least M and at most N Ints, in particular writing {Int^(0:N)} for what this PR writes as Int...N. This would entail special interpretation of ^ and : inside of {...} but I don't think that's going to be a big conflict with the normal meaning.

timholy · 2015-04-01T15:20:15Z

To clarify, this PR equates Int...N with {Int^N}, not {Int^(0:N)}. (It's "exactly" N integers, not "at most" N integers.)

Otherwise, though, there is a certain appeal to your proposed syntax. It will be interesting to hear how it competes with other possible uses of {} (although perhaps because it follows :: it's not really in competition).

JeffBezanson · 2015-04-01T17:12:31Z

It won't always follow ::, so there definitely is competition for the syntax.

Seems to me even the A[i::Int, I::CartesianIndex{N-1}] case could be handled by tuples instead of trailing varargs. We also might very well need to implement elementwise +, -, etc. for tuples, to use them as SIMD types.

timholy · 2015-04-01T17:33:34Z

Yes, as long as one can pack and unpack tuples "for free" then I agree they make a nice interface---(i,I) could be packed into J and passed to the user. I started down this road because I was assuming that "fully unpacked" was the least ambiguous interface, but I agree that tuples are worthy of careful consideration (see #10525 (comment)).

If one goes with tuples, the only downsides I see:

- Backward compatibility probably needs some care? The "core indexing policy" code would just be handing user code a tuple, without checking that it's OK. I would guess that currently most AbstractArray types have indexing methods defined that will take precedence over #10525, so I suspect this won't be a problem in all but (at most) a few cases.
- Dicts, for example, can be indexed by a tuple object. I assume/hope no one currently has an AbstractArray type that's already using tuples for indexing operations, especially with a tuple in a single "slot."

timholy · 2015-04-01T17:34:14Z

Should I just strip the syntax and merge a version of this that uses f(x::Vararg{T,N})? Syntax can be added later.

JeffBezanson · 2015-04-01T17:55:42Z

Dicts are an interesting comparison because by and large everybody thinks indexing a Dict with multiple indexes can only be equivalent to indexing it with a single tuple. Switching arrays to tuple indexing would make the interfaces match; only difference would be arrays are optimized for dense storage.

This bears careful thinking. I've always felt there is something weird and annoying about varargs. They have this non-first-class feel. For example consider the slightly odd argument order in setindex!, which was caused by varargs. It seems better to say there are 3 things involved: the array, the index, and a value. We should look hard at using more "nested" argument lists.
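The argument order being referred to is easy to see in a small sketch: because the indexes are collected as trailing varargs, the value has to come before them.

```julia
A = zeros(Int, 2, 2)

# Value first, then the indexes; equivalent to A[1, 2] = 7.
setindex!(A, 7, 1, 2)

# Reading back uses the same trailing-index form; equivalent to A[1, 2].
stored = getindex(A, 1, 2)
```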

StefanKarpinski · 2015-04-01T18:06:35Z

I've always felt that it would be nicer to do getindex(a, k) and setindex!(a, k, v). IIRC, the reason we didn't use a tuple for the key for multidimensional indexing was just that at the time tuples couldn't be eliminated effectively enough.

JeffBezanson · 2015-04-01T18:18:59Z

Yes I have the same recollection.

StefanKarpinski · 2015-04-01T18:28:36Z

This is going to be tricky to deprecate :-\

timholy · 2015-04-01T18:33:46Z

All this resonates with me, too. But it clearly bears very careful consideration.

Aside from the challenging deprecation (i.e., focusing on whether we want to do this), one minor annoyance might be for people who create AbstractArray types of fixed dimensionality, e.g.

immutable ToeplitzMatrix{T} <: AbstractMatrix{T}
    offdiags::Vector{T}
    midpoint::Int
end

getindex(A::ToeplitzMatrix, i, j) = A.offdiags[i-j+A.midpoint]

The latter would presumably have to become

getindex(A::ToeplitzMatrix, I) = A.offdiags[I[1]-I[2]+A.midpoint]

which is not quite as pretty on the right hand side, but is something I can imagine everyone getting used to.

StefanKarpinski · 2015-04-01T19:08:08Z

Argument destructuring would help a lot since you could then write this:

getindex(A::ToeplitzMatrix, (i, j)) = A.offdiags[i-j+A.midpoint]
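Argument destructuring of this form is accepted by current Julia releases. A minimal sketch follows, using a hypothetical plain struct `ToeplitzSketch` and a hypothetical function `entry` rather than overloading `getindex`, so that it does not depend on how indexing itself is dispatched.

```julia
# Hypothetical stand-in for the ToeplitzMatrix example above.
struct ToeplitzSketch{T}
    offdiags::Vector{T}
    midpoint::Int
end

# The single tuple argument is unpacked into i and j in the signature.
entry(A::ToeplitzSketch, (i, j)) = A.offdiags[i - j + A.midpoint]

A = ToeplitzSketch([10, 20, 30], 2)
entry(A, (1, 1))  # main diagonal
entry(A, (2, 1))  # first subdiagonal
entry(A, (1, 2))  # first superdiagonal
```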

JeffBezanson · 2015-04-01T19:12:10Z

Excellent point, and there's a feature request open for that already.

Jutho · 2015-04-01T19:53:45Z

+1 to merging this functionality into 0.4 and I also support the proposals / suggestions regarding array indexing etc.

JeffBezanson · 2015-04-01T20:00:10Z

We don't necessarily need both.

timholy · 2015-04-01T20:08:34Z

If the tuple work will be completed and the window for 0.4 left open a bit longer, then I'd be in favor of exploring the tuple proposal for handling array indexing. If that doesn't end up looking promising, presumably we can merge something like this.

timholy · 2015-04-02T12:53:59Z

I should also ask: if we do this with tuples, will the declaration need to use triangular dispatch, i.e.,

getindex{T,N,TT<:NTuple{N,Number}}(A::MyArray{T,N}, indexes::TT) = ...

As an "interesting" case, think about the situation in which a user wants to compute the derivative along dimension 2 for an Interpolation array A,

v = A[3.2, dual(1.8,1)]  # dual from DualNumbers.jl

so the elements of the indexes tuple may not all be of the same type. But we definitely want that N constraint.

If we need triangular dispatch but it's not going to happen in 0.4, then we might need to reconsider.

Jutho · 2015-04-02T15:03:58Z

Isn't this just equivalent to getindex{T,N}(A::MyArray{T,N},indexes::NTuple{N,Number})?

timholy · 2015-04-02T15:37:26Z

Hmm, I'd temporarily forgotten tuples are covariant, unlike other types in julia. Good.
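The covariance point can be checked with a short sketch in current syntax. The function `ntupleindex` is hypothetical and uses Jutho's signature; the variable names are illustrative.

```julia
# Tuple types are covariant: a mixed Float64/Int pair is an NTuple{2,Number}.
tuples_covariant = Tuple{Float64,Int} <: NTuple{2,Number}

# Other parametric types are invariant.
vectors_invariant = !(Vector{Int} <: Vector{Number})

# Hypothetical function: N ties the tuple length to the array dimensionality,
# while the element types may differ.
ntupleindex(A::AbstractArray{T,N}, I::NTuple{N,Number}) where {T,N} = length(I)

ntupleindex(zeros(2, 2), (3.2, 1))  # accepted: two indexes for a 2-d array
# ntupleindex(zeros(2, 2), (3.2,)) has no matching method.
```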

timholy · 2015-04-20T15:50:55Z

Closed in preference to #10911.

timholy added 9 commits March 31, 2015 05:21

Parse x::T...N as Vararg{T,N}

fc53d74

Add another parameter to Vararg

a34bd77

Vararg{T,N} support in typeof_tfunc; add comments to inference.jl

b426ec0

Check vararg length in intersect_tuple

1a9cbc6

For fixed-length vararg functions, don't cache as vararg

8180be2

Set vararg length parameter if not already set

e81212e

Add Vararg{T,N} tests

9404895

Support x...3 varargs declarations

334b6dd

Add docs and NEWS on Vararg{T,N}

1fb3c91

JeffBezanson reviewed Mar 31, 2015

Use extend rather than ad-hoc solution in setting function parameters

bb5c75a

Vararg: disallow a space in parsing x...N between ... and N.

bc7b000

Fixes a stack overflow in parsing atsign-`sprintf "%f %d %d %f" 1.0 [3 4]... 5`

This was referenced Apr 7, 2015

RFC: Give AbstractArrays smart and performant indexing behaviors for free #10525

Merged

vararg tuple types not working when vararg not at last place #10770

Closed

JeffBezanson mentioned this pull request Apr 9, 2015

WIP: redesign of tuples and tuple types #10380

Merged

timholy mentioned this pull request Apr 20, 2015

WIP: controlling dispatch with varargs of defined length (rebased) #10911

Closed

timholy closed this Apr 20, 2015

timholy mentioned this pull request Apr 24, 2015

Type stable block #10980

Open

timholy mentioned this pull request May 12, 2015

NTuples made me sad (so I nixed them) #11242

Merged

timholy mentioned this pull request Jun 2, 2015

Fix Unicode bugs with UTF-16/UTF-32 conversions (#10959) #11004

Closed

timholy mentioned this pull request Jun 10, 2015

Foo.Bar.baz<tab> – deep tab completion #10091

Closed

DilumAluthge deleted the teh/valen branch March 25, 2021 22:12
