a, b... = [1,2,3] #2626

Closed
StefanKarpinski opened this issue Mar 20, 2013 · 61 comments · Fixed by #37410
Labels
compiler:lowering Syntax lowering (compiler front end, 2nd stage) design Design of APIs or of the language itself

Comments

@StefanKarpinski
Member

This would be a very nice syntax for taking head and rest. Likewise a..., b = [1,2,3] might be good for slurping the initial elements into a and the tail element into b.

@JeffBezanson
Member

Should we add tail(itr, state) to the iteration protocol, giving a collection with elements starting at the given state?

@StefanKarpinski
Member Author

In some cases that would be easy but it won't always. Having a Rest{T,S}(itr::T,state::S) type that wraps an iterator with a state and allows you to iterate the rest of it might do the trick.
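
A minimal sketch of such a wrapper, written against the `iterate` protocol of Julia 1.x rather than the `start`/`next`/`done` protocol in use when this comment was written. The name `RestSketch` is hypothetical and chosen only for illustration; current Julia already provides `Iterators.rest` for this purpose.

```julia
# Hypothetical sketch: wrap an iterator together with a saved iteration state.
struct RestSketch{I,S}
    itr::I
    st::S
end

# Resume the wrapped iterator from the saved state, then keep passing its states along.
Base.iterate(r::RestSketch, state=r.st) = iterate(r.itr, state)
# The number of remaining elements is not known in general.
Base.IteratorSize(::Type{<:RestSketch}) = Base.SizeUnknown()
Base.eltype(::Type{RestSketch{I,S}}) where {I,S} = eltype(I)

xs = [10, 20, 30]
head, st = iterate(xs)      # first element plus the state that follows it
tail = RestSketch(xs, st)
println(head, " ", collect(tail))
```

Because the wrapper only stores a state, it needs no knowledge of how many elements were consumed to reach it.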

@JeffBezanson
Member

Ah, of course, the Drop iterator already does something very similar to this.

@StefanKarpinski
Member Author

Rest might be a better name for that iterator.

@JeffBezanson
Member

Well, they are actually a bit different. Drop takes a count of items to skip. Rest would start from a given state, making it basically trivial to implement.

@StefanKarpinski
Member Author

Ah, that's true, but the Rest type can serve both purposes; it's just a matter of how you get there – by taking an explicit state or a number of values to skip over.
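
The two routes can be compared with the functions that current Julia ships, `Iterators.drop` (by count) and `Iterators.rest` (by state). This is only an illustration using today's API, not the 2013 design under discussion.

```julia
xs = ['a', 'b', 'c', 'd']

# By count: skip the first two elements.
by_count = Iterators.drop(xs, 2)

# By state: advance manually twice, then continue from the resulting state.
_, st = iterate(xs)
_, st = iterate(xs, st)
by_state = Iterators.rest(xs, st)

println(collect(by_count) == collect(by_state))
```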

@ViralBShah
Member

Wouldn't it be possible to simulate MATLAB-style varargout with this? I guess even though possible, it should be avoided since it would play havoc with the type system.

@JeffBezanson
Member

No, this does not tell the function how many outputs are requested. And actually, in a,b = f(), as long as f returns 2 or more values, this will work and just drop the rest.
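
A small illustration of that point; `f` here is just a stand-in function for the example.

```julia
f() = (1, 2, 3)

# Only two names on the left: the third returned value is silently ignored.
a, b = f()
println((a, b))

# The same holds for any iterable right-hand side, e.g. an array.
x, y = [10, 20, 30]
println((x, y))
```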

@vtjnash
Member

vtjnash commented Mar 21, 2013

But it does result in telling the iterator how many arguments are necessary, so it seems you could write the function as a continuation iterator to weakly simulate MATLAB's varargout (the sane version where you just do lazy computation).

@StefanKarpinski
Member Author

We could generally support MATLAB's varargout if we did it lazily and changed the protocol for destructuring a little bit. The idea came up in the discussion of factorization objects: a,b = x could cause a single destructuring call to x, passing some kind of boolean mask of which values should be produced. Then again, I'm not sure we really want fully general varargout since it's a bit weird that the outputs can change completely depending on how many of them are asked for.

@mschauer
Contributor

Before implementing this maybe it would be good to check whether a more general approach to destructuring assignments is welcomed. Basically, every function f with a fixed bijective inverse function inv(f) can be used to write

f(a, b) = c
-> a, b = inv(f)(c)

Examples for f are tuple composition and list composition from head and tail,
f(a,b) = (a,b)
but also, say,
(sign(x), abs(x))
and
((a,c), (b,c))
have such inverses, which can be found by applying the rule
inv(f*g) = inv(g)*inv(f)
or be provided explicitly.
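
One way to read the sign/abs example is as an explicitly provided inverse pair. The names `compose` and `decompose` are hypothetical, and this is ordinary Julia today, not the proposed `f(a, b) = c` syntax.

```julia
# Hypothetical names for this sketch.
compose(s, m) = s * m               # f: (sign, magnitude) -> number
decompose(x) = (sign(x), abs(x))    # inv(f): number -> (sign, magnitude)

# What the proposal would write as `compose(s, m) = -7`:
s, m = decompose(-7)
println((s, m), " ", compose(s, m))
```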

@JeffBezanson
Member

Yes, you can do that, but it's not a new capability added by a,b... = x syntax. You can do it already. Just have the function return an iterator that computes values as next is called. Then a,b = f() will compute 2 values, a,b,c = f() will compute 3, etc., with no ... needed.
The real problem is the case of a single result, a = f(x), which just assigns the whole thing and no destructuring happens. If that breaks, composing functions starts to get difficult.
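
A sketch of the lazy approach, written against the `iterate` protocol of Julia 1.x instead of `next`. The type `LazySquares` is hypothetical; it records which values were actually computed so the effect of the number of left-hand names is visible.

```julia
# Hypothetical lazy "multi-output" result: element i is i^2, computed on demand.
struct LazySquares
    computed::Vector{Int}   # log of the indices that were actually computed
end

function Base.iterate(s::LazySquares, i=1)
    push!(s.computed, i)    # record the work being done
    return (i^2, i + 1)
end

two = LazySquares(Int[])
a, b = two                  # destructuring asks for two values

three = LazySquares(Int[])
x, y, z = three             # destructuring asks for three values

whole = LazySquares(Int[])
w = whole                   # single name: no destructuring, nothing is computed

println(two.computed, " ", three.computed, " ", whole.computed)
```

The last case is the one described as the real problem: with a single name on the left, the function cannot tell that one output was wanted.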

@StefanKarpinski
Member Author

I found myself wanting this syntax yet again today. I think we should consider this.

@kmsquire
Member

(Match.jl has this)

@StefanKarpinski StefanKarpinski added the parser Language parsing and surface syntax label Aug 26, 2016
@StefanKarpinski StefanKarpinski added help wanted Indicates that a maintainer wants help on an issue or pull request design Design of APIs or of the language itself labels Nov 10, 2016
@StefanKarpinski
Member Author

There are some design issues here, primarily: what should the type of b (or a) be – array, tuple or iterator? Since we haven't addressed this, it seems best to bump this to 1.0.

@HarrisonGrodin
Contributor

Here's an interesting example that begins to address the issue.

julia> a, b, c = countfrom()
Base.Count{Int64}(1,1)

julia> a, b, c
(1,2,3)

This works as expected, but what if "splatting" was used on one of the variables?

In the case that c was "splatted", it would make sense to return an iterator with start returning the "current" iterator value and next/done matching that of the existing iterator.
However, what if b was "splatted"? Would the code run indefinitely? An iterator could be useful if the last variable is "splatted", but it could become more confusing otherwise.

In the case that an iterator is not the right choice, though, the question of mutability (tuple vs. array) definitely seems to be worth debating.
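
For reference, a sketch of the trailing case as it behaves in released Julia, assuming version 1.6 or later (where, to my knowledge, slurping an arbitrary iterator yields a lazy rest iterator). Note that `countfrom` now lives in `Iterators`.

```julia
# Trailing slurp on an infinite iterator: b stays lazy, so this terminates.
a, b... = Iterators.countfrom()

println(a)
println(collect(Iterators.take(b, 3)))
```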

@StefanKarpinski
Member Author

Yes, it does seem that returning a Rest iterator might be necessary in the general case.

@simeonschaub
Member

Should of course be taken with a large grain of salt and I'm not saying we should just follow the majority opinion here, but I was interested in what people naturally expected this to do and did a quick survey on Slack:

[Image: screenshot of the Slack survey results, 2020-09-11]

I was especially surprised that so many people considered returning a tuple for arrays the most useful of all the options, since I would have imagined that returning a vector would be generally preferred. As discussed on the triage call, throwing an error if the rhs isn't a tuple until we have made up our minds about all the other cases might also be a very viable option.

Regarding other languages, I found that Rust has something a bit like this, but as part of its more general match syntax. It only supports slurping for arrays (no tuples, at least for now) with [a, b @ ..] => .... b is then a "slice", which is Rust's type for views, but slices are immutable by default, so you need to explicitly specify mut if b should be mutated afterwards. But since pattern matching is quite different from destructuring in Julia, I don't know whether that's really comparable.

Since we currently disallow vector expressions on the lhs of assignments, a more speculative proposal would be to support that syntax for destructuring as well, with the difference that [a, b...] = itr always collects the rest of itr into a vector, whereas for (a, b...) = itr, b is always a tuple. That still doesn't work for infinite iterators, but to me it seems that they are quite rare in real code and I think it's reasonable to have to explicitly ask for the rest with Iterators.rest or Iterators.drop in those situations. @JeffBezanson, would be interested to hear your thoughts on that.

@JeffBezanson
Member

Interesting. I can see the case for collecting everything to tuples because that makes it as similar as possible to varargs. But I think that option is horribly NON-useful. It's giving special syntax to the operation "take the tail of this data structure and convert it to a tuple". Why would you have syntax for that? It's very slow and type-unstable for basically every case except tuples. The comparison to varargs is not as reasonable as it seems at first, because we always need to splat out function arguments into a virtual tuple first to inspect all of their types for dispatch. And indeed, splatting large collections is slow. It's a somewhat common performance trap. So trying to be like varargs here would be intentionally copying this negative aspect of the language.
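
The type-instability concern can be seen directly: a tuple's type encodes its length, so converting the tail of a vector to a tuple gives a result type that depends on a runtime value. The helper `tail_as_tuple` is hypothetical and exists only for this illustration.

```julia
# Hypothetical helper: what "slurp the rest into a tuple" would mean for a vector.
tail_as_tuple(v) = Tuple(v[2:end])

t2 = tail_as_tuple([1, 2, 3])
t3 = tail_as_tuple([1, 2, 3, 4])

# Same input type (Vector{Int}), different output types.
println(typeof(t2))
println(typeof(t3))
```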

> But since pattern matching is quite different from destructuring in Julia, I don't know whether that's really comparable.

I think it's nearly the same thing. Of course Rust has different concerns about mutability that make it hard to copy directly though.

@simeonschaub
Member

Yes, given that, I think probably the best way forward here would be to go with E, i.e. throw an error for anything that's not a tuple, for 1.6, since returning a tuple here should be pretty uncontroversial and probably also the most common case people want to use this syntax for. That would enable us to revisit the other cases later on, once people have already used this syntax a bit, so perhaps we can make a better informed decision then.
The only remaining question would then be what to lower this syntax to. We could add a method to Base.tail that also accepts an index to consume from, but perhaps a separate function that potentially also accepts an iteration state would be more future-proof and extensible and allow for clearer error messages. Base.rest may be too confusing, since we already have Iterators.rest and this would probably have a different API. Right now I have called it Base.slurp_rest, but I am open to suggestions for better names/APIs.

@cwindolf

In case they're of any interest, here are some emails discussing these questions on a Python developer mailing list in 2007. Not sure how useful the Python perspective is, but thought they were interesting.

@tpapp
Contributor

tpapp commented Sep 29, 2020

In order to pin down the semantics, it would be interesting to see what concrete semantics people want this to replace.

Eg I could imagine

a, b... = c

replacing

a, b = first(c), c[(begin+1:end)]

but also variations with view, dropping/keeping generalized indexing for b (eg if c::OffsetVector), etc.

It is not clear that any of these is preferable to the other. Because of this, I think that just using an explicit construct on the RHS is a reasonable alternative.
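
To make the copy-versus-view variation concrete, here is a small sketch (assuming Julia 1.4 or later for `begin` inside indexing). The two candidates for b look the same at first and diverge once c is mutated.

```julia
c = [1, 2, 3]

a = first(c)
b_copy = c[begin+1:end]          # independent copy of the tail
b_view = @view c[begin+1:end]    # view that shares memory with c

c[2] = 20                        # mutate the original afterwards

println(b_copy)
println(b_view)
```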

@simeonschaub
Member

Yes, I agree that finding a semantic that works well for arbitrary array types and iterators is hard, but I think it would be a real shame to give up on this nice syntax altogether. JuliaDiff/ChainRulesCore.jl#128 (comment) is just one example where this would be really useful if it worked at least for tuples. If we only allowed this syntax for tuples for now, I don't see how this would be problematic semantically.

@tpapp
Contributor

tpapp commented Sep 29, 2020

Restricting to tuples would be somewhat confusing, as the a, b, c = rhs syntax works for all iterables.

If the user really wants tuples, why not just

f(t) = first(t), Base.tail(t) # please someone invent a snappy name for f
a, b = f(t)

@simeonschaub
Member

> Restricting to tuples would be somewhat confusing, as the a, b, c = rhs syntax works for all iterables.

It wouldn't be a syntax error, it will just error because the analog of Base.tail is not defined for arbitrary iterables, which seems reasonable to me, since the latter also only works for tuples.

> If the user really wants tuples, why not just
>
> f(t) = first(t), Base.tail(t) # please someone invent a snappy name for f
> a, b = f(t)

Sure, but you could make exactly the same argument against pretty much any syntax feature. I think what a, b... = t does should be immediately obvious to anyone familiar with how splatting and slurping works for function calls. Especially in function signatures, like in @oxinabox's example, I just find it easier to figure out what the function is doing using the slurping syntax, than using Base.tail. In that example, this change would really make writing frules using ChainRulesCore.jl more intuitive and more consistent with rrule for people wanting to write new rules.

@tpapp
Contributor

tpapp commented Sep 29, 2020

> it will just error because the analog of Base.tail is not defined for arbitrary iterables, which seems reasonable to me, since the latter also only works for tuples

I understand that you have a specific use case in mind, but from the discussion it seems that others have a different one (ie it should work for AbstractVector) and clarifying what the user expectations are would be useful.

One great feature of the current destructuring is that it just works for anything iterable, loosely coupling syntax and types via the iteration interface.

Introducing a, b... = c requires taking a stand on how c maps to b. Eg

  1. Saying that only c::Tuple is allowed and b = Base.tail(c) is one option, it plays well with types but happens to be restrictive, especially with the original proposal in mind.

  2. Making b equivalent to collect(c)[2:end] is another option, but it isn't nice for users who want to destructure tuples.

  3. Asking that the invariant (a, b...) == (c...,) (or similar) is maintained and allowing b to be any iterable for which this holds is also an option, which could accommodate tuples and anything iterable. Perhaps this could be done with lowering this syntax to a function that users can define methods for (basically f above).
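
Option 3 could be sketched roughly as follows. The function name `head_rest` is hypothetical (it plays the role of f above), and the choice of return type per method is just one possibility, not a settled design.

```julia
# Hypothetical overloadable hook that `a, b... = c` could lower to.
head_rest(t::Tuple) = (first(t), Base.tail(t))              # tuples stay tuples
head_rest(v::AbstractVector) = (first(v), v[begin+1:end])   # vectors stay vectors

# Generic fallback: any iterable, with a lazy rest.
function head_rest(itr)
    y = iterate(itr)
    y === nothing && throw(ArgumentError("collection must be non-empty"))
    x, st = y
    return x, Iterators.rest(itr, st)
end

# Check the invariant (a, b...) == (c...,) for three kinds of input.
for c in ((1, 2, 3), [1, 2, 3], (i^2 for i in 1:3))
    a, b = head_rest(c)
    println(typeof(b), " ", (a, b...) == (c...,))
end
```

Users could add methods for their own collection types, as long as the invariant is preserved.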

@rfourquet
Member

> Perhaps this could be done with lowering this syntax to a function that users can define methods for (basically f above).

A good candidate might be peel, an overloadable/non-lazy equivalent of Iterators.peel.

@simeonschaub
Member

> A good candidate might be peel, an overloadable/non-lazy equivalent of Iterators.peel.

peel is probably not the best API here, since we don't always want to take just one element from the front. I think to be most friendly to constant propagation, this function should probably accept an iteration state as well as the number of elements in front already consumed, similar to how iterate_and_index works. I basically implemented this in #37410 as slurp_rest, just with the exception that it only ever produces tuples.

@jlumpe
Contributor

jlumpe commented Oct 1, 2020

Would it make sense to introduce this functionality as a @slurp macro in the next version and wait and see how it is received before adding the new syntax? This could generate a lot of useful feedback from users regarding the most sensible semantics before making it officially part of the syntax, which would be much more difficult to change/deprecate later.

@tpapp
Contributor

tpapp commented Oct 1, 2020

Yes, and in a package.

@StefanKarpinski StefanKarpinski added the triage This should be discussed on a triage call label Oct 1, 2020
JeffBezanson pushed a commit that referenced this issue Oct 26, 2020
@simeonschaub simeonschaub added compiler:lowering Syntax lowering (compiler front end, 2nd stage) and removed help wanted Indicates that a maintainer wants help on an issue or pull request parser Language parsing and surface syntax speculative Whether the change will be implemented is speculative triage This should be discussed on a triage call labels Nov 12, 2020
@dsantiago

Slurp in the middle would be great too:

a, b..., c = 1:5

@loprea91

> Slurp in the middle would be great too:
>
> a, b..., c = 1:5

Yes please, equivalent to Python's

first, *_, last = fun()

@simeonschaub
Member

This should now work on Julia nightly
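
A short sketch of the resulting behavior, assuming a recent Julia release (to my knowledge, trailing slurping shipped in 1.6 and non-final slurping in 1.9). The variable names are arbitrary.

```julia
# Trailing slurp: the original request in this issue.
a, b... = [1, 2, 3]

# For a tuple on the right-hand side, the rest is a tuple.
x, y... = (1, 2, 3)

# Slurp in the middle.
p, q..., r = 1:5

# Leading slurp, as in the first comment.
h..., l = [1, 2, 3]

println((a, b))
println((x, y))
println((p, q, r))
println((h, l))
```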
