Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compiler: add short block syntax &(..., &1) #9218

Closed

Conversation

asterite
Copy link
Member

@asterite asterite commented May 2, 2020

Note: Like before, I wanted to see how hard this was to implement. Nothing is decided yet.

Alternative to #9216

This PR lets you specify a block argument with references to block arguments. &1 is the first block argument, &2 is the second, and so on.

For example:

[1, 2, 3].each &puts(&1)

The above is the same as:

[1, 2, 3].each { |x| puts(x) }

You can use any expression after &. For example:

[1, 2, 3].map &(&1 * 2) # => [2, 4, 6]
[1, 2, 3].map &{&1, &1 * 10} # => [{1, 10}, {2, 20}, {3, 30}]
[1, 2, 3].map &[&1] # => [[1], [2], [3]]

["foo.cr", "bar.cr"].map &File.exists?(&1)

["World", "Computer"].each &puts("OK #{&1}")
# Output:
# OK World
# OK Computer

record Point, x : Int32, y : Int32
[{1, 2}, {3, 4}].map &Point.new(&1, &2) # => [Point(@x=1, @y=2), Point(@x=3, @y=4)]

Since you can use any expression, this is valid too:

# Not advised... if we really wanted we could disallow this
[1, 2, 3].each &begin
  puts &1
end

Do I like it?

I think I like it better than #9216 because:

  • &1, &2, etc. are clearly surrounded by &...
  • unless you use begin, or if we disallow it, the things you can put inside are pretty limited, so the block will end up being short and understandable
  • no need to write do/end nor { ... } so it's actually shorter than the alternative:
    ["foo.cr", "bar.cr"].map &File.exists?(&1)    # shorter, simpler
    ["foo.cr", "bar.cr"].map { File.exists?(&1) }

This is exactly the same in Elixir... except that in Elixir you can use &(...) in general to create functions, but in this PR you can only use it in a block argument.

What happens with regular block arguments?

They still work! For example:

proc = ->(x : Int32) { puts x }
[1, 2, 3].each &proc

The rule is that if there's &1, &2, etc. inside the block argument, then it gets expanded to a block. Otherwise it's just forwarded as a block argument. Because the type of the block argument needs to be a Proc, you will, in most if not all cases, get a compiler error if you try to misuse it. For example:

p [1, 2, 3].map &puts("hello")
                 ^---
Error: expected a function type, not Nil

And we can improve the error above saying "If using the short block syntax, be sure to use one of &1, &2, etc." (this PR doesn't do that yet, but it's super easy to implement).

Another alternative is to restrict the block argument to:

  • a variable
  • a proc literal (like &->File.exists?(String))
  • anything else, but it must include &1, &2, etc, otherwise it's an error

That alternative is probably better because we can give better error messages.

Implementation details

For #9216 the first idea that came to my mind for implementing it is:

  • before analyzing a call's block, check if it has &1, &2, etc
  • if so, add missing block arguments, etc.

This means that the compiler has to do this for every block out there. I'm sure there are more optimal ways to implement this: for example we could detect at parse time which blocks need expansions. But this requires a bit more tracking.

In this PR we need to do the same, but only for block arguments (&...), not for every block. And usually block arguments are just a forwarded variable, or a more complex expression which, with this PR, it's very likely to include these &1 thingies. So the implementation is much more performant.

Final thoughts

I think this solves the missing use case of short block syntaxes where one would like to put the first argument in a position other than the caller. And it also generalizes it for any number of arguments. All of this being kept surrounded by & so the code becomes easier to understand.

/cc @waj

@asterite
Copy link
Member Author

asterite commented May 2, 2020

@oprypin I'm interested to hear your opinions on this alternative 🙏

@oprypin
Copy link
Member

oprypin commented May 2, 2020

Seems fine, thanks.
.each &puts(&1) is a bit noisy, though. But I have nothing better to suggest, as _1 is... not noisy enough? .map &(_1 * 2)

@jhass
Copy link
Member

jhass commented May 2, 2020

Another proposal was @1, but I would say that's just equally noisy as &. That's why my gut feeling goes to a slight preference to _1 (which should be fine if it's syntax highlighted differently to 1) but I don't really care.

@asterite
Copy link
Member Author

asterite commented May 2, 2020

Yeah, we can use &1,_1, @1, whatever we want. I chose &1 here because it's easier to see that &1 relates to the outer &, and because it's the same way in Elixir. Given that it has already worked in Elixir and people like it, maybe using the same syntax is good.

@RX14
Copy link
Contributor

RX14 commented May 2, 2020

I find these first three examples cryptic:

[1, 2, 3].map &(&1 * 2) # => [2, 4, 6]
[1, 2, 3].map &{&1, &1 * 10} # => [{1, 10}, {2, 20}, {3, 30}]
[1, 2, 3].map &[&1] # => [[1], [2], [3]]

I'd prefer it if this were limited to calling just methods which were in scope. Anything more complex and you have the block syntax, which is short enough as-is.

I'd also prefer to bikeshed this one after 1.0 :)

@straight-shoota
Copy link
Member

I don't like that this effectively introduces a completely new block syntax by prefixing any expression with &.

["foo.cr", "bar.cr"].map &File.exists?(&1)    # shorter, simpler
["foo.cr", "bar.cr"].map { File.exists?(&1) }

It might be a bit shorter than the alternative, but only by one character disregarding whitespace. And I disagree that it's simpler because it's a new syntax you need to learn. Curly braces are familiar.

Adding short syntax for block arguments may be fine, but I would make it work with the existing block syntax.

@oprypin
Copy link
Member

oprypin commented May 2, 2020

Adding short syntax for block arguments may be fine, but I would make it work with the existing block syntax.

So perhaps like...

["foo.cr", "bar.cr"].map { |&1| File.exists?(&1) }

Seriously though, as I have said, this cannot work because this is already existing syntax meaning a block with 0 arguments.

@rafaelfess
Copy link

Another proposal was @1, but I would say that's just equally noisy as &.

I also think it adds a little bit of noise, but found it to be easier to read.

I chose &1 here because it's easier to see that &1 related to the outer &, and because it's the same way in Elixir. Given that it has already worked in Elixir and people like it, maybe using the same syntax is good.

I guess that's why I found it easier to read and understand.

@asterite
Copy link
Member Author

asterite commented May 2, 2020

Some examples found in this repo's source code:

in_files = in_filenames.map { |name| File.open(name, "r") }
# vs.
in_files = in_filenames.map &File.open(&1, "r")

results = results.map { |result| File.join(__DIR__, result) }
# vs.
results = results.map &File.join(__DIR__, &1)

posix = posix.map { |path| Path.posix(path) }
# vs.
posix = posix.map &Path.posix(&1)

versions = sversions.map { |s| SemanticVersion.parse(s) }
# vs.
versions = sversions.map &SemanticVersion.parse(&1)

# This one is maybe a bit too much, but I still include it for completeness
entries
  .select { |dir| Dir.exists?(dir) }
  .sort_by! { |dir| File.info?(dir).try(&.modification_time) || Time.un
  .reverse!
  .skip(10)
  .each { |name| `rm -rf "#{name}"` rescue nil }
# vs.
entries
  .select(&Dir.exists?(&1))
  .sort_by!(&File.info?(&1).try(&.modification_time) || Time.unix(0))
  .reverse!
  .skip(10)
  .each &(`rm -rf "#{&1}"` rescue nil)

block_arg.try { |arg| interpreter.define_var(arg.name, x) }
# vs.
block_arg.try &interpreter.define_var(&1.name, x)

original_filename.try { |filename| File.dirname(filename) }
# vs.
original_filename.try &File.dirname(&1)

@class_vars.try { |v| all_class_vars.merge!(v) }
# vs.
@class_vars.try all_class_vars.merge!(&1)

match["samesite"]?.try { |v| SameSite.parse? v }
# vs.
match["samesite"]?.try &SameSite.parse?(&1)

Note how in all these examples I only used &1. Once you get used to it, you just don't need to name the block argument. For example:

# I read it as: just map all the files to File.open(the file, "r")
in_files = in_filenames.map &File.open(&1, "r")

# Map all sversions to SemanticVersion.parse
sversions.map &SemanticVersion.parse(&1)

# Prepend __DIR__ to every result
results.map &File.join(__DIR__, &1)

Of course this is just my opinion and how I started reading the code above.

@oprypin
Copy link
Member

oprypin commented May 2, 2020

Note how in all these examples I only used &1.

Well yeah, maybe there should be no "&2" and the "&1" should be something that doesn't include the number "1".

@j8r
Copy link
Contributor

j8r commented May 2, 2020

If I understand correctly, this means:

[STDOUT, STDERR].each &.puts

[STDOUT, STDERR].each &puts

Will do two different things. The first example will call IO#puts, and the other will call the top-level puts method.

@asterite
Copy link
Member Author

asterite commented May 2, 2020

@j8r wrong, the second one doesn't compile, it's lacking an &1

@asterite
Copy link
Member Author

asterite commented May 2, 2020

We could use "it" instead of &1 😁

@j8r
Copy link
Contributor

j8r commented May 2, 2020

I like the idea of making proc syntax less verbose, but the syntax may be confusing.
Proc and block syntax may already be confused by beginners, I'm afraid using the & syntax even more won't help.

Being used to shell, and others used to pipes, I'd say [1, 2, 3].each &|puts($1).
I don't know if it is possible, $1 is already a thing in Crystal.

Sumup:

  • &-> for procs
  • &. for block short band
  • &| for pipe-like block short band

@RX14
Copy link
Contributor

RX14 commented May 2, 2020

I've seen both "it" and "_" used as "implicit single-argument names". I'd support that, but it seems that then the proposal is two parts:

  • A shortened block syntax for calling methods .each(&method) == .each { method }
  • An implicit name for the argument of single-argument blocks .map { method(it) } vs .map { |v| method(v) }

I'm mixed on these, though I saw the second proposal working well in Groovy

@jhass
Copy link
Member

jhass commented May 2, 2020

Well there's certainly the idea to only allow the implicit argument in the shorthand syntax to avoid abuse of it in longer blocks.

@j8r
Copy link
Contributor

j8r commented May 2, 2020

Nobody seems to have noticed it, this is the unexpected birth of the pipe operator - at least a functional equivalent.
Combined with #tap(&), this leads to cool things:

"/tmp/file 750".partition(' ').tap &File.chown(&1, &2.to_i(8))

Let's dream a bit, if |> was a thing similar to tap:

"/tmp/file 750".partition(' ') |> &File.chown(&1, &2.to_i(8))

Note: I don't say the syntax is ok, the idea is noice.

@asterite
Copy link
Member Author

asterite commented May 2, 2020

@j8r It might look like it, but I don't think that's the pipe operator. The way the pipe operator works, it pipes the left hand-side operand to either the first (or the last, depending on the language) argument of the next function:

foo |> bar(1)

# same as
bar(1, foo)
# or:
bar(foo, 1)

In the case of Crystal nothing is implicitly piped to the next expression, you have to explicitly use &1, &2, etc.

Elixir has a pipe operator and it also has &(...) for short function syntax, two different things.

@asterite
Copy link
Member Author

asterite commented May 3, 2020

Again, just for fun, I pushed a totally-done-in-the-wrong-way commit that also allows you to use it instead of &1. it still works as a call with arguments (for spec).

Some examples above now can also be written like this:

[1, 2, 3].each &puts(it)

[1, 2, 3].map &(it * 2) # => [2, 4, 6]
[1, 2, 3].map &{it, it * 10} # => [{1, 10}, {2, 20}, {3, 30}]
[1, 2, 3].map &[it] # => [[1], [2], [3]]

["foo.cr", "bar.cr"].map &File.exists?(it)

["World", "Computer"].each &puts("OK #{it}")
# Output:
# OK World
# OK Computer

And more:

in_files = in_filenames.map { |name| File.open(name, "r") }
# vs.
in_files = in_filenames.map &File.open(it, "r")

results = results.map { |result| File.join(__DIR__, result) }
# vs.
results = results.map &File.join(__DIR__, it)

posix = posix.map { |path| Path.posix(path) }
# vs.
posix = posix.map &Path.posix(it)

versions = sversions.map { |s| SemanticVersion.parse(s) }
# vs.
versions = sversions.map &SemanticVersion.parse(it)

# This one is maybe a bit too much, but I still include it for completeness
entries
  .select { |dir| Dir.exists?(dir) }
  .sort_by! { |dir| File.info?(dir).try(&.modification_time) || Time.un
  .reverse!
  .skip(10)
  .each { |name| `rm -rf "#{name}"` rescue nil }
# vs.
entries
  .select(&Dir.exists?(it))
  .sort_by!(&File.info?(it).try(&.modification_time) || Time.unix(0))
  .reverse!
  .skip(10)
  .each &(`rm -rf "#{it}"` rescue nil)

block_arg.try { |arg| interpreter.define_var(arg.name, x) }
# vs.
block_arg.try &interpreter.define_var(it.name, x)

original_filename.try { |filename| File.dirname(filename) }
# vs.
original_filename.try &File.dirname(it)

@class_vars.try { |v| all_class_vars.merge!(v) }
# vs.
@class_vars.try all_class_vars.merge!(it)

match["samesite"]?.try { |v| SameSite.parse? v }
# vs.
match["samesite"]?.try &SameSite.parse?(it)

I still think that &1 in some cases is a bit more readable, because File.exists?(it) reads a bit weird, and if it's just File.exists?(&1) then you don't read the argument (how the heck do you pronounce it?) and you just read "file exists?".

So we could allow both syntaxes to live along. Or... just allow "it" and no more arguments, but the Point.new(&1, &2) case in the original post is nice.

@asterite asterite force-pushed the numbered-block-arguments-2 branch from ba0b041 to 629c8dc Compare May 3, 2020 12:38
@oprypin
Copy link
Member

oprypin commented May 3, 2020

I think having both syntaxes will be an easy "no" for most people, in addition to it itself not being an easy change.

Oh well, if you imagine &1 as a """ligature""" of its own, not as containing the digit "1", using this is not so bad, maybe no big need for it.

@vlazar
Copy link
Contributor

vlazar commented May 3, 2020

Again, just for fun, I pushed a totally-done-in-the-wrong-way commit that also allows you to use it instead of &1. it still works as a call with arguments (for spec).

Maybe it's a totally wrong thing to do, but I like that version with it looks a lot less noisy without all these &1 in addition to initial & starting the block.

Just compare all examples with and without &1 and imagine this will be used a lot. What version would you prefer to read?

#9218 (comment)

#9218 (comment)

@asterite
Copy link
Member Author

asterite commented May 3, 2020

We could always just introduce the feature for just one block argument and using the keyword "it". It's the less noisy version and it'll cover most use cases. With just one argument it's obvious which argument we are referring to (the only one), with more arguments and &1, &2, etc., it becomes much more cryptic, and using names is probably better. I specially like "it" when the block argument is the "unwrapped" version of the original value. With this I mean, the non nilable value using try, each element in an Enumerable, etc.

Using "it" is a breaking change as can be seen in this PR, but it's very easy to fix. Reserving the word "it" exclusively for that is probably good because "it" isn't very meaningful. That said, we can still allow an "it" call for specs or anything else.

Alternatively, we can find a name other than "it".

@asterite
Copy link
Member Author

asterite commented May 3, 2020

Also if we go with "it" we might also could use it in regular blocks. Or just in regular blocks and leave the block arg syntax like it is now.

[1, 2, 3].map { it * 2 } 

However, this is a bit more confusing because the block seems to have no arguments, but it has arguments. With the block arg syntax you can only use it (or &1).

So many options...

@j8r
Copy link
Contributor

j8r commented May 3, 2020

Using $1, $2 will help people coming from the UNIX world (it is possible?).

Wouldn't it conflicts with the spec's it? Even if not, it will be confusing.

I still find &some_function confusing, in my mind it is a function pointer :/.
Having an other sign after the & will disambiguate from &., C's & and Ruby' s &:: maybe &*, or &| – something memorable.

@vlazar
Copy link
Contributor

vlazar commented May 3, 2020

I like this as it looks clean and uses existing block syntax, just changes it fo single argument when you don't care about argument name.

[1, 2, 3].map { it * 2 } 

This also makes is possible to do things like:

[1, 2, 3].map { "#{it.to_s(8)} : #{it.to_s(16)}" }

Which as I understand won't work with & shorthand?

@oprypin
Copy link
Member

oprypin commented May 3, 2020

The current state of the code is that the following works the same:

[1, 2, 3].map &"#{&1.to_s(8)} : #{&1.to_s(16)}"
[1, 2, 3].map &"#{it.to_s(8)} : #{it.to_s(16)}"
#=> ["1 : 1", "2 : 2", "3 : 3"]

@rafaelfess
Copy link

I still find &some_function confusing, in my mind it is a function pointer :/.
Having an other sign after the & will disambiguate from &., C's & and Ruby' s &:: maybe &*, or &| – something memorable.

@j8r I have the same confusion in my mind when it comes to &some_function remembering a pointer.

@vlazar
Copy link
Contributor

vlazar commented May 3, 2020

With introduction of &1, &2 for block arguments in short block syntax discussed here would it make sense to expand Short one-argument syntax too for some symmetry?

["foo.cr", "bar.cr"].map_with_index &.do_stuff   # first, or the only block argument
["foo.cr", "bar.cr"].map_with_index &1.do_stuff  # same as above maybe
["foo.cr", "bar.cr"].map_with_index &2.do_stuff  # same but for the second argument

Similar to

["foo.cr", "bar.cr"].map_with_index &do_stuff(&)  # or `it`
["foo.cr", "bar.cr"].map_with_index &do_stuff(&1) # or `it`
["foo.cr", "bar.cr"].map_with_index &do_stuff(&2)

["foo.cr", "bar.cr"].map_with_index &do_stuff(_)  # or `it`
["foo.cr", "bar.cr"].map_with_index &do_stuff(_1) # or `it`
["foo.cr", "bar.cr"].map_with_index &do_stuff(_2)

@asterite
Copy link
Member Author

asterite commented Jun 5, 2020

@vlazar I think &1.foo and &2.foo are consistent with this proposal, I like it 👍

@asterite asterite closed this Jul 26, 2020
@asterite asterite deleted the numbered-block-arguments-2 branch July 26, 2020 14:47
@Sija
Copy link
Contributor

Sija commented Jul 26, 2020

@asterite That's IMO a damn fine enhancement to the language, why did you close this PR?

@asterite
Copy link
Member Author

Not gonna happen before 1.0, and after 1.0 I'd rather have someone else implement this, or any other thing.

@I3oris
Copy link
Contributor

I3oris commented Dec 6, 2020

It could be to just allow the block argument to be used as the first argument of a given function by writing .each &function.

So

%w(foo bar baz).each { |x| puts x }

becomes

%w(foo bar baz).each &puts

And that all, there don't need to add &1 or something else.

I think other default argument are not needed because these kinds of syntaxes could be generally unclear or difficult to read.

pros:

  • The syntax seem easy to implement because it's the same as &.method but with the block argument given as first argument instead of to be the receiver.
  • It's simple to write and to remember.
  • It's generally not ambiguous nor unclear.
  • It's cover the most of uses cases.
  • This cause no problem with nested block or proc because ist is the same syntax as &.method.

cons:

  • It don't cover the more complex case such as .map &{&1, &1 * 10} or results.map &File.join(__DIR__, &1) but in a other hand this kind of syntaxt could be unclear and difficult to read.
  • Case like .each &puts "var: #{&1}" would be really good but my proposal don't cover.
  • let you find other cons.

Note: the syntax with a receiver (.map &File.exists? ) works because a method call will be accepted after &.

Note2: if other arguments are given (.map &File.open("r")), these could be added as twice, third, ... argument.

Note3: sorry for my clumsy English, i am a young Frenchy x)

Note4: i propose this even it a closed issue in case of this will be hepful after the 1.0.0 version.

Finally, maybe this could be a good compromise between the simplicity, readability, and the benefit of write less.

@Sija
Copy link
Contributor

Sija commented Jan 13, 2021

Any chances on getting this merged after all? That would be a candy... @asterite?

@asterite
Copy link
Member Author

I would merge it, but I'm not the only one taking decisions in this project. Plus this could always go after 1.0, like in 1.1 and so on.

@Sija
Copy link
Contributor

Sija commented Jan 13, 2021

@asterite So could you reopen it so there will a chance for merging?

@straight-shoota
Copy link
Member

straight-shoota commented Jan 13, 2021

The thing is: This (and #9216) are essentially a proof of concept. They are good quality and could potentially be merged as is (maybe; prob need a few updates now).
But: There's just not yet a clear decision on syntax and semantics. And right now (i.e. before 1.0) there are more important things to do, like fixing bugs and improving essential features. This is from a change management perspective, a relatively simple feature addition and will probably follow after 1.0.

@asterite
Copy link
Member Author

@Sija Reopening this is as easy as clicking a button. But if I do that then it will look like this will be merged soon, or that this is being discussed, but that's not the truth. So I'd rather keep it closed because right now there's no chance this will be included in 1.0.

@jkthorne
Copy link
Contributor

Since there is a 0.36 release could this make it into the 1.0 release?

@straight-shoota
Copy link
Member

The intermediary 0.36.0 release doesn't change anything. The goal is still to focus on stability for 1.0.

@asterite asterite mentioned this pull request Nov 24, 2021
@Sija
Copy link
Contributor

Sija commented Nov 24, 2021

Could this be revived?

@straight-shoota
Copy link
Member

Sure. I think the best way forward is to start with a fresh RFC which summarizes the previous discussions (here and related) and describes the proposed alternatives with their pros/cons.

That would help to understand what's on the table and bring us closer to taking a decision.

@beta-ziliani beta-ziliani added tough-cookie Multi-faceted and challenging topic, making it difficult to arrive at a straightforward decision. kind:feature topic:lang labels Oct 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:feature topic:lang tough-cookie Multi-faceted and challenging topic, making it difficult to arrive at a straightforward decision.
Projects
None yet
Development

Successfully merging this pull request may close these issues.