Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify rules about commas in match arms and semicolons in expressions #40989

Merged
merged 3 commits into from
Jul 19, 2017

Conversation

matklad
Copy link
Member

@matklad matklad commented Apr 1, 2017

Original discussion: https://internals.rust-lang.org/t/syntax-of-block-like-expressions-in-match-arms/5025/7.

Currently, rust uses different rules to determine if , is needed after an expression in a match arm and if ; is needed in an expression statement:

fn stmt() {
    # no need for semicolons
    { () }
    if true { () } else { () }
    loop {}
    while true {}
}

fn match_arm(n: i32) {
    match n {
        1 => { () } # can omit comma here
        2 => if true { () } else { () }, # but all other cases do need commas. 
        3 => loop { },
        4 => while true {},
        _ => ()
    }
}

This seems weird: why would you want to require , after and if?

This PR unifies the rules. It is backwards compatible because it allows strictly more programs.

@rust-highfive
Copy link
Collaborator

r? @nrc

(rust_highfive has picked a reviewer for you, use r? to override)

@@ -3093,8 +3093,7 @@ impl<'a> Parser<'a> {
self.expect(&token::FatArrow)?;
let expr = self.parse_expr_res(Restrictions::RESTRICTION_STMT_EXPR, None)?;

let require_comma =
!classify::expr_is_simple_block(&expr)
let require_comma = classify::expr_requires_semi_to_be_stmt(&expr)
Copy link
Contributor

@petrochenkov petrochenkov Apr 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also add ExprKind::Catch to expr_requires_semi_to_be_stmt, it's missing.
(A copy of fn expr_requires_semi_to_be_stmt in librustc/hir/print.rs is also badly out-of-date.)

Copy link
Member Author

@matklad matklad Apr 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A copy of fn expr_requires_semi_to_be_stmt in librustc/hir/print.rs is also badly out-of-date.

Is it? It seems legit, and even already handles catch (catch is just a block in hir, isn't it?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, you are right, nevermind then.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added Catch to the one in libsyntax though.

@petrochenkov
Copy link
Contributor

petrochenkov commented Apr 1, 2017

LGTM, but needs a @rust-lang/lang decision.

@@ -0,0 +1,35 @@
fn main() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo in the filename :)

Copy link
Member Author

@matklad matklad Apr 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha! That's why I am always not sure if it's a coma or a comma: they are spelled differently : ) ! Thanks!

@withoutboats
Copy link
Contributor

Can someone with the access tag this T-lang & I-nominated?

@petrochenkov petrochenkov added I-nominated T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Apr 2, 2017
@nrc
Copy link
Member

nrc commented Apr 6, 2017

@rfcbot fcp merge

We discussed this at the lang team meeting. We all love it, but the meeting was not well attended, so just making sure everyone signs off on this.

@rfcbot
Copy link

rfcbot commented Apr 6, 2017

Team member @nrc has proposed to merge this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@nrc nrc removed the I-nominated label Apr 6, 2017
@withoutboats
Copy link
Contributor

@rfcbot reviewed

I am so excited by this. I accidentally omit the comma after a control flow expression all the time! I had always assumed there was just some good reason things were like this.

@matklad
Copy link
Member Author

matklad commented Apr 6, 2017

I accidentally omit the comma after a control flow expression all the time!

For posterity, I've discovered this issue when I had implemented automatic comma insertion/deletion in IntelliJ Rust, and found out that it breaks my code :P

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Apr 11, 2017

Huh. I have to admit, I am a bit surprised by the current behavior of the parser, but not in the way that others are. In particular, I would have expected this to parse today, but it doesn't seem to:

fn main() {
    let x = 3;
    match x {
        1 => if true { 1 } else { 2 } - 2,
        //                            ^ parse error here
        -2 => 3,
        _ => 4,
    }
}

(For a bit of history, the old rule was that all match arms required {} and no comma was permitted; we changed it to permit something other than {}, but to require a comma in that case, precisely to handle small expression cases like this one. Not that this matters. Just laying out the history here.)

So I'm mildly nervous about making this change without specifying it more clearly. The current rules for handling of semicolon are somewhat subtle, after all. The way I would summarize them is:

  • There are a set of "statement-like" expressions, roughly corresponding to statements in C.
  • If we are at the start of a statement, those statement-like expressions are terminated eagerly, even if there is no semicolon, but are later required to have type unit.

Therefore this example:

fn main() {
    if true { 2 } else { 3 } - 2;
}

Is roughly equivalent to this:

fn main() {
    let () = if true { 2 } else { 3 }; - 2;
}

The intention of the () rule was precisely to avoid confusion in cases like the previous one, where one might have expected a binary operator to be used, but in fact you get a unary operator. I've since thought that perhaps this should have been a lint, but right now it's a hard error.

Are we going to apply a similar rule to match statements? I presume not, I guess. Are we sure though that there is no ambiguity? It seems that the current parser does "end parsing early" when you use a statement-like form, so perhaps that works out fine.

@matklad
Copy link
Member Author

matklad commented Apr 11, 2017

I would have expected this to parse today, but it doesn't seem to:

I actually was sure that this change would be impossible because of the backwards incompatibility, and was pleasantly surprised to find out that we already install the "statement-mode" restriction, and only then specifically check for plain blocks. There's a test for this.

Are we going to apply a similar rule to match statements? I presume not, I guess. Are we sure though that there is no ambiguity? It seems that the current parser does "end parsing early" when you use a statement-like form, so perhaps that works out fine.

The situation here is a bit different, so it looks like we can avoid the rule. In statements, the expression following the } can be interpreted as an expression-statement itself and lead to a valid (but potentially surprising) parse with different semantics. Here, the part after } will be interpreted as a pattern, and this is unlikely to lead to a valid parse because patterns in match arms are followed by =>. There is a potential problem if an expression and a pattern can be glued together to form a new pattern, but this I believe is not possible with the current grammar. Though this could become a problem if we allow half-infinite ranges in patterns:

  let r = match -2 {
    0 => { 1 } - 1
    ... 2 => { 2 }
    _ => 3
  };

But, because we already do not require a comma after a simple block, I think all potential future ambiguities are already there, so lifting the restriction does not harm.

That said, digging through git history I've found out that once upon a time rust has exactly the logic from this PR, which was explicitelly changed to "allow only simple blocks": matklad@2772b2e

Not sure what was the motivation for that change at that time...

@nikomatsakis
Copy link
Contributor

Not sure what was the motivation for that change at that time...

Huh. I have no memory of that. =) Good question.

@shepmaster shepmaster added the S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). label Apr 14, 2017
@matklad
Copy link
Member Author

matklad commented Apr 14, 2017

cc @brson any chance that you remember the motivation for that change (2772b2e) ? :)

@Mark-Simulacrum
Copy link
Member

@nikomatsakis @pnkfelix Looks like this is waiting for your approval.

@nikomatsakis
Copy link
Contributor

I have to admit, I still find this change very perplexing. It's not something I would ever expect to work: but I think I'm just an old fuddy duddy or something, too wedded to the historical evolution here. Certainly I can see the appeal of reusing the category of "block-like things". (The other reason for my discomfort is a general reluctance to go mucking about without an official reference grammar that we actually trust, but that's another story.)

So, I'd be happy to r+, however, I think that would be a violation of policy. In particular, I think we should not make this change without a tracking issue and a feature-gate (we're already skipping past an RFC here, it seems). If those are added, I'd be happy to r+.

@alexcrichton alexcrichton added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). labels May 25, 2017
@nikomatsakis
Copy link
Contributor

ping @matklad -- any objection to adding a tracking issue and feature gate support for this?

@matklad
Copy link
Member Author

matklad commented May 26, 2017

Sure, I'll try to figure out the feature-gating machinery on this weekend and update the PR!

If someone can give an example of a feature-gated syntax change, that would be super helpful :)

@nikomatsakis
Copy link
Contributor

Hmm, so it's mildly tricky to do I guess. In particular, we don't know what features are enabled until we've parsed (at least) the crate attributes! I think then the best way to do it would be that we have to reflect (in the AST) whether a , was present and then check after the fact (e.g., in the visitor in libsyntax/feature_gate.rs -- search for impl<'a> Visitor<'a> for PostExpansionVisitor<'a>) for that particular sort of AST.

I am now debating if this much energy is worth it. What will we learn from the unstable period anyway? (But still it does feel like the proper procedure.)

@Mark-Simulacrum
Copy link
Member

I'm going to tag as S-waiting-on-team since I feel like the lang team needs to make a decision here (specifically, seems like @pnkfelix and @nikomatsakis need to sign off).

Personally, I wouldn't be too opposed to landing without a feature gate, but also kind of am worried about doing that in general.

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 4, 2017
@matklad
Copy link
Member Author

matklad commented Jun 4, 2017

Sorry, looks like overestimated the amount of my free time :) I am still wanting to bring this to completion though :)

I agree that this, being a change to the language, needs some sort of process to make sure it does not have any unintended consequences.

I don't think that feature-gating really helps here: probably nobody is going type #feature[match_comma] just to save a couple of commas :(

We can punt on this until we have a formal Rust grammar, though it does not seem likely to happen soon. Another alternative is to ask more people to look at this change (by posting a mini-RFC to urlo/irlo) and to hope that all problems would be uncovered that way. But there already was a thread on irlo about this PR, so probably everyone interested already has took a glance.

@matklad
Copy link
Member Author

matklad commented Jul 17, 2017

Will rebase shortly!

@matklad
Copy link
Member Author

matklad commented Jul 18, 2017

Ok, rebased. Local rebase happened without merge conflicts. Does this mean that the PR conflicts with something in bors's queue, which has not yet hit master?

@Mark-Simulacrum
Copy link
Member

Nah, just bors being rebase-happy with submodules.

@bors r=petrochenkov

@bors
Copy link
Contributor

bors commented Jul 18, 2017

📌 Commit 9a7cb93 has been approved by petrochenkov

@bors
Copy link
Contributor

bors commented Jul 18, 2017

⌛ Testing commit 9a7cb93 with merge 648810278deefbc71f9c0de27347de532aed979e...

@bors
Copy link
Contributor

bors commented Jul 18, 2017

💔 Test failed - status-travis

@est31
Copy link
Member

est31 commented Jul 18, 2017

@bors retry

@bors
Copy link
Contributor

bors commented Jul 18, 2017

⌛ Testing commit 9a7cb93 with merge 760ed3b...

bors added a commit that referenced this pull request Jul 18, 2017
Unify rules about commas in match arms and semicolons in expressions

Original discussion: https://internals.rust-lang.org/t/syntax-of-block-like-expressions-in-match-arms/5025/7.

Currently, rust uses different rules to determine if `,` is needed after an expression in a match arm and if `;` is needed in an expression statement:

```Rust
fn stmt() {
    # no need for semicolons
    { () }
    if true { () } else { () }
    loop {}
    while true {}
}

fn match_arm(n: i32) {
    match n {
        1 => { () } # can omit comma here
        2 => if true { () } else { () }, # but all other cases do need commas.
        3 => loop { },
        4 => while true {},
        _ => ()
    }
}
```

This seems weird: why would you want to require `,` after and `if`?

This PR unifies the rules. It is backwards compatible because it allows strictly more programs.
@bors
Copy link
Contributor

bors commented Jul 18, 2017

💔 Test failed - status-travis

@petrochenkov
Copy link
Contributor

@matklad
Failure in a pretty printing test, but it's a pretty-printer bug (#37199), so you can just add // ignore-pretty issue #37199 to suppress it.

@matklad
Copy link
Member Author

matklad commented Jul 18, 2017

Thanks for the hint, @petrochenkov !

@petrochenkov
Copy link
Contributor

There are unintended submodule updates in 75c1dfd

@matklad
Copy link
Member Author

matklad commented Jul 18, 2017

@petrochenkov fixed!

@petrochenkov
Copy link
Contributor

@bors r+

@bors
Copy link
Contributor

bors commented Jul 18, 2017

📌 Commit e3d052f has been approved by petrochenkov

@bors
Copy link
Contributor

bors commented Jul 19, 2017

⌛ Testing commit e3d052f with merge 83c3621...

bors added a commit that referenced this pull request Jul 19, 2017
Unify rules about commas in match arms and semicolons in expressions

Original discussion: https://internals.rust-lang.org/t/syntax-of-block-like-expressions-in-match-arms/5025/7.

Currently, rust uses different rules to determine if `,` is needed after an expression in a match arm and if `;` is needed in an expression statement:

```Rust
fn stmt() {
    # no need for semicolons
    { () }
    if true { () } else { () }
    loop {}
    while true {}
}

fn match_arm(n: i32) {
    match n {
        1 => { () } # can omit comma here
        2 => if true { () } else { () }, # but all other cases do need commas.
        3 => loop { },
        4 => while true {},
        _ => ()
    }
}
```

This seems weird: why would you want to require `,` after and `if`?

This PR unifies the rules. It is backwards compatible because it allows strictly more programs.
@bors
Copy link
Contributor

bors commented Jul 19, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: petrochenkov
Pushing 83c3621 to master...

@bors bors merged commit e3d052f into rust-lang:master Jul 19, 2017
alexcrichton added a commit to alexcrichton/syn that referenced this pull request Aug 28, 2017
This was updated recently (rust-lang/rust#40989) so update syn to match
@matklad matklad deleted the comma-arms branch July 9, 2019 12:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.