-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Array/Tuple decomposition #132
Comments
Right now tuple/array unpacking is very simple:
I mean it: the compiler just transforms the source code before applying any semantic analysis to it. We could add support for
Here we already have two small problems: first, this will work out of the box with arrays but not with tuples. For tuples we would need to understand negative indexes. Then, we also need to support ranges known at compile time (like But I wonder how useful is this. I personally never used this in code I wrote. The thing is, dynamic languages abuse arrays and tuples. For example this (one of your examples): ary = [1, [2, 3], 4]
a, (b, c), d = ary In Ruby it's just fine. What happens in Crystal? First, ary = [1, [2, 3], 4]
tmp = ary
a = ary[0]
b = ary[1][0]
c = ary[1][1]
d = ary[2] That, out of the box, doesn't compile, because Another thing: you talk about the head and tail of a list. In functional languages lists are represented as immutable linked lists, so getting the head and the tail is something very cheap and fast. In Ruby arrays can be shared, so when you slice them you get a shared view until you modify the copy (if I'm not wrong). That can also be fast and cheap. In Crystal getting the tail of a list is allocating memory for a new array and copying the data (I don't think we'll want to complicate things with shared arrays). Then, in functional languages that's the only thing you can do with a list: access the head and the tail. So that's why it's useful: because you have no other way. I believe arrays of arrays can be common, but are mostly used to hack something quickly, or used in some benchmarks or short code to demonstrate how concise or cute a language is. This is all fine with dynamic languages, but in static languages it's harder. About Erlang... sorry, I don't like Erlang at all, so here comes a rant. In Erlang the only real types you have are lists and tuples. Yes, you have records, but they are just tuples in the end, so much that every time you have to deal with one you have to tell Erlang so (X#yes_im_this_record.foo). Then you have lists. And that's all. So, of course you will define all your data to be tuples of lists of tuples of lists of tuples. So, of course, to deal with these data structures they have pattern matching, where you can select nested elements with a "short" syntax. Want to do an http call? Go ahead, we return you a tuple where the second element is a tuple whose first element is a tuple. Then you can pattern match that. Where do you get the error if you wrote something wrong? At runtime. How do you just get a specific element of that tuple if you don't care about the rest? You put ugly underscores. How someone reading your code understands what each element of that tuple is? You hope that the programmer put meaningful names to the variables, because otherwise you'll have to check the docs. In languages where you have named types, you would have an HTTPResponse class with a method status, body, etc. No tuple unpacking is needed. The code is clearer. If you make a mistake you get a compile-time error (if it's compiled!). With named types I see very little use for tuples/array unpacking. Tuples are good for returning multiple values at once. For example: struct Number
def divmod(number)
{self / number, self % number}
end
end Then unpacking is good to unpack the return value without needing an intermediate variable: div, mod = 10.divmod(3) Unpacking is also good with, for example, a, b = "foo,bar".split "," But that's it. For me, more complicated uses deserve a named type because otherwise the code becomes really cryptic. Here finishes the rant about Erlang. Note that I do like Erlang because of it's efficiency and philosophy. I just don't like the syntax and the very limited types you have. Elixir is an improvement, but I really don't like to just prefix every method with the type I'm working with ( All of the above is just my personal opinion. I know @waj likes Erlang, lists, tuples and pattern matching more than me. @bcardiff also likes functional languages and this kind of stuff a lot, so they might like more your point of view. We can leave this ticket open for further discussion... who knows what will happen in the end? :-) |
That would be new to me, I thought the main reason for introducing Enumerator::Lazy1 would be reducing the amount of (large) arrays that are allocated. Talking about Enumerator, more complex chains here are my main usage for advanced decomposition, that is, in block arguments: hash.group_by {|k, v| v.foo }.each.with_index do |(k, v), i|
...
end I very much like the functional powers Ruby provides here, they are very useful for small, data transforming scripts. If your data structures evolve in something too complex you then can just start to introduce entity classes and the like. This together allows evolving from fast prototypes to more complex applications that don't need to be designed upfront. So yes, I like those nested data structures and being able to decompose them is essential to work with them. Not letting them get too deep or wide is a question of the programmers discipline, not of the language, in my opinion. Looking at the Enumerator example, supporting that advanced decomposition just for tuples (at first), would already provide a useful addition. PS: I never really got the hang of Erlang either, it just happened to be another example I knew ;P |
@MrZyx hey, I always wondered how to iterate a hash while having the index! That looks nice. Yes, nested tuple unpacking is something that could work, because in each position you might have a different type and the compiler is aware of that (the same doesn't apply for arrays). Right now Hash is not Enumerable. I guess the enumerated type would be Tuple(K, V), but then we would need to unpack tuples in block arguments, which is what @kostya also wants (and me too). |
ruby: crystal: I hope b has a nil value. |
If |
Yes, |
+1, but I'll qualify: I think tuple unpacking is very different from array unpacking. The former is way more intuitive, imo. It seems perfectly reasonable to error on this: [[1, 2], [3, 4]].each { |a, b| ... } But this should work: [{1, 2}, {3, 4}].each { |a, b| ... } Or maybe even better, make it explicit: [{1, 2}, {3, 4}].each { |(a, b)| ... } I have a hunch that tuple unpacking would be a bit easier to implement, also. |
I like being able to do this: { a: [1,2], b: [3,4] }.each{|k,(v1,v2)| puts "#{k} : <#{v1}:#{v2}>" } In other words, unpacking an array or tuple in a block it's great. Using temp vars or full representations seems like worse options. |
Let's revive this a bit 😃 First of all, I think this will never have any reasonable & safe way to work for things other than a I'd personally would prefer to remove the The use-case I have in mind is that sometimes we only need the first return value of a function returning 2+ return values. We can't write Note that a current workaround is to do About deep decomposition, the example from above already works for tuples: tmp = {1, {2, 3}, 4}
a = tmp[0]
b = tmp[1][0]
c = tmp[1][1]
d = tmp[2] So it would only need implementation (and no impossible-thinking about types being unknown about Array with union types, etc..) As As @asterite said, it's pretty ugly to use this syntax for mass variable initializing, but I think it can get really handy for specialized framework and DSLs! I think this tuple-only decomposition syntax using curly brackets It might be argued that Edit: a |
One layer of def destruct(car, cdar, *cddr)
{car, cdar, cddr}
end
tup = {3, 4, 5, 'a', 7}
a, b, c = destruct(*tup) # note the splat here
a # => 3
b # => 4
c # => {5, 'a', 7}
typeof(c) # => Tuple(Int32, Char, Int32) However, this only works for splats at the trailing position, because all parameters following the splat parameter must be named parameters. With a bit of hackery the same can be done using yield splats: def yield_each(values)
yield *values
end
tup = {3, 4, 5, 'a', 7}
a, b, c = yield_each(tup) { |x, *y, z| {x, y, z} }
a # => 3
b # => {4, 5, 'a'}
c # => 7
typeof(b) # => Tuple(Int32, Int32, Char) At this point it makes more sense to have splats on the LHS of multi-assign expressions, especially if that same syntactic form is supported on Ruby. But this should only be done if the splat variable is assigned the same tuple and not an Extra layers of decomposition can probably be done in the parser instead. This is how nested |
Also a naive implementation of the range indexers will not work when a splat variable is present. Consider a, b, *c, d, e = [42, 'a', ""]
# the above is equivalent to:
__temp = [42, 'a', ""]
a = __temp[0] # => 42
b = __temp[1] # => 'a'
c = __temp[2..-3] # => []
d = __temp[-2] # => 'a'
e = __temp[-1] # => "" For comparison, the same code in Ruby will try to assign all the non-splat variables before filling the splat: (0..7).each do |i|
a, b, *c, d, e = (1..i).to_a
p [a, b, c, d, e]
end
# [nil, nil, [], nil, nil]
# [1, nil, [], nil, nil]
# [1, 2, [], nil, nil]
# [1, 2, [], 3, nil]
# [1, 2, [], 3, 4]
# [1, 2, [3], 4, 5]
# [1, 2, [3, 4], 5, 6]
# [1, 2, [3, 4, 5], 6, 7] I'm not sure about the exact expansion here, but honestly I find the behaviour for < 4 elements very misleading. So I'm against making these extra variables nilable in Crystal. Note that the case for def yield_each(values)
yield *values
end
tup = {42, 'a', ""}
yield_each(tup) { |a, b, *c, d, e| } # Error: too many block arguments (given 4+, expected maximum 3) For non- |
I think checking on size is a great idea. Doing it for tuples would work too: the if will always be false so the raise will never triggers. |
I replicated Ruby's multi-assign behaviour in Crystal for runtime arrays: https://play.crystal-lang.org/#/r/af2b If we translate this to single assignments it would look like: x1, x2, ..., xm, *glob, y1, y2, ..., yn = exp
# the above is equivalent to:
__temp = exp
x1 = __temp[0]
x2 = __temp[1]
# ...
xm = __temp[m - 1]
__temp2 = __temp[m..]
__size = __temp2.size
if __size >= n
glob = __temp2[...-n]
y1 = __temp2[-n]
y2 = __temp2[-(n - 1)]
# ...
yn = __temp2[-1]
else
glob = __temp2[...-__size] # always empty
y1 = __temp2[-__size]
y2 = __temp2[-(__size - 1)]
# ...
y__size = __temp2[-1]
# the remaining `y` variables become nilable
end Another example, with manually expanded assignments for each size: https://play.crystal-lang.org/#/r/af2o In Crystal that O(n^2) expansion seems to be the only way to replicate this behaviour with assignments; more to the reason that we should just raise if The dynamic post-splat assignment behaviour makes sense in Ruby, if you consider that multi-assignment occurs inside Ruby's stack-based VM so non-nil elements are pushed one by one. Their implementation is around here and here. By the way, in Ruby pattern matching against what they call a "find pattern" also fails if the array is too small: case [1, 2, 3]
in [a, b, *c, d, e]
# match failure
in [a, *, b] # Crystal would use `*_` for this instead
a # => 1
b # => 3
end |
Nice! I don't think we should ever produce |
I understand that implementing this is complex and has many implications, this issue is mainly intended to start discussion and, if wanted to track it as a long term goal.
Crystal already supports multiple assignment. It would be great to expand this to array/tuple decomposition.
This is the most useful part, effectively obtaining the list head and tail without modifying it. At least Erlang too provides a construct for this, but I think most functional languages do. The next level would be decomposing a nested structure:
Ruby allows this for arbitrary objects that respond to
#to_ary
, allowing similar behavior, maybe#to_t
, could be another goal.A last goal would be allowing this in method definitions and block, proc and lambda arguments.
Starting with a simple implementation, maybe only allowing it for tuples for example, would already add a benefit in my opinion.
The text was updated successfully, but these errors were encountered: