Parsing inconsistencies (lambda, proc, return) #28784

rprichard · 2015-10-01T01:33:14Z

I found more inconsistencies between rustc and parser-lalr.

I also noticed that Rust allows return expressions and lambda expressions to end with a struct literal, even when they're in a nostruct context. This seems inconsistent to me.
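For readers unfamiliar with the term, a "nostruct context" is a position such as an `if` condition or a `match` scrutinee, where a trailing `{` has to be read as the start of the block rather than as a struct literal. The following is only an illustrative sketch (the function `is_one` is made up for this note, not taken from the issue); it shows the usual workaround of parenthesizing the literal:

```rust
#[derive(Debug, PartialEq)]
struct A {
    a: i32,
}

// Hypothetical helper, used only to illustrate the restriction.
fn is_one(x: &A) -> bool {
    // Writing `if *x == A { a: 1 } { ... }` is rejected, because the parser
    // would have to guess whether `{ a: 1 }` is a struct body or the `if` block.
    // Parentheses leave the nostruct context, so the literal is allowed again.
    if *x == (A { a: 1 }) {
        true
    } else {
        false
    }
}

fn main() {
    assert!(is_one(&A { a: 1 }));
    assert!(!is_one(&A { a: 2 }));
}
```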

Lambdas (the two parsers disagree):

struct A { a: i32 }
fn lambda_expr_nostruct() -> A {
    // rustc accepts this, but parser-lalr does not.
    match || A { a: 123 } {
        f => f()
    }
}

Return expressions (the two parsers agree):

struct A { a: i32 }
fn return_ambiguity_1() -> A {
    match A { a: 1 } { x => x } // rejected by rustc and parser-lalr
}
fn return_ambiguity_2() -> A {
    match return A { a: 1 } { _ => A { a: 1 } } // accepted by rustc and parser-lalr
}

The rustc and parser-lalr parsers disagree about whether a bare return expression can be cast:

fn cast_of_return() {
    // rustc rejects, parser-lalr accepts
    // error: expected identifier, found keyword `as`
    return as ();
    (return as ());

    return == (); // rustc accepts, parser-lalr accepts
    loop {
        continue as (); // rustc accepts, parser-lalr accepts
        continue == (); // rustc accepts, parser-lalr accepts
        break as ();    // rustc accepts, parser-lalr accepts
        break == ();    // rustc accepts, parser-lalr accepts
    };
}

Finally, I also noticed these two differences, which seem much less interesting to me. The grammar is probably just out-of-date or buggy:

lambda sometimes requires braces:

fn lambda_braces() {
    // parser-lalr accepts this, but rustc does not.  I think this is an
    // obvious bug in the parser-lalr.y grammar.  If there is a return type,
    // then curly braces are required.
    let _x = || -> i32 10;
}

proc is obsolete:

fn proc_syntax() {
    // parser-lalr also accepts this.  I think the proc syntax is obsolete, and
    // the {proc_expr, proc_expr_nostruct} non-terminals could be removed from
    // parser-lalr.y.
    let _x = proc() {};
}


steveklabnik · 2015-10-04T23:59:50Z

/cc @rust-lang/lang , can we disambiguate what's intended here?

nikomatsakis · 2015-10-06T19:47:44Z

I agree with @rprichard's take for the most part. I guess that return as i32 ought to parse...weird as it is.

triage: P-medium

brson · 2016-07-14T17:15:24Z

Probably a good beginner bug for someone familiar with parsing.

brson · 2016-07-15T23:11:35Z

Make the rustc parser behave like parser-lalr, per the OP.

neunenak · 2016-07-20T06:50:17Z

I'm reasonably familiar with parsers and I'd like to take a crack at this bug.

brson · 2016-07-20T18:23:57Z

@neunenak You got it!

brson · 2016-07-20T18:31:36Z

@nikomatsakis @nrc @pnkfelix Can you confirm the right thing to do here is make libsyntax behave like the reference parser as described in the OP?

nrc · 2016-07-20T22:45:01Z

It's probably worth addressing any inconsistencies between the parser and the reference on an individual basis. I don't have enough faith in the reference to say it is always right.

My opinions on the issues in the OP:

I think return starts a new context, so it is OK to accept a struct literal after it. I don't see a parsing ambiguity there, so I think it is OK. Since the parser and reference agree, I don't think there is anything to do here.

return expression - I thought return was a statement, so I am not qualified to offer an opinion here as my intuition is messed up. Could someone explain why it is an expression not a statement please?

agree on the last two points - bugs in the reference.

brson · 2016-07-22T22:10:47Z

return is mostly an expression because most everything's an expression, but it does serve some purposes, like allowing it to appear in match arms.
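A small sketch of what "return is an expression" means in practice. The function name `parse_positive` and its fallback values are invented for illustration. Because `return` has the never type `!`, it can sit in a match arm or an `else` branch where a value of another type is expected:

```rust
// Hypothetical example: returns the parsed number if it is positive,
// -1 if the input is not a number, and 0 otherwise.
fn parse_positive(input: &str) -> i32 {
    let n: i32 = match input.trim().parse() {
        Ok(n) => n,
        // `return` is an expression of type `!`, which coerces to `i32`,
        // so this arm type-checks alongside the `Ok` arm.
        Err(_) => return -1,
    };
    // The same holds in an `if`/`else` expression used as the tail value.
    if n > 0 { n } else { return 0 }
}

fn main() {
    assert_eq!(parse_positive("42"), 42);
    assert_eq!(parse_positive("abc"), -1);
    assert_eq!(parse_positive("-5"), 0);
}
```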

cramertj · 2017-01-24T07:53:52Z

@neunenak Are you still interested in this? If not, I'd like to take a stab at it.

neunenak · 2017-01-24T09:11:26Z

I'm afraid I had more trouble with it than I thought, please go ahead and take a stab.


cramertj · 2017-01-24T09:58:47Z

Thanks! I took a look at the parser, and I think the issue with return is this match. Many (keyword) identifiers cannot appear at the start of an expression, the keyword as being just one of these. Should I just cover as for now, or should I match out all identifiers which cannot be at the start of an expression?

Edit: for context, the can_begin_expr method is used by the parser to determine whether or not return is returning a value or whether it is just a bare return (see here).
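To make the role of that check concrete, here is a toy model of the decision. It is not the libsyntax code: the `Token` enum, `ReturnKind`, and `classify_return` are hypothetical names invented for this sketch, and only the method name `can_begin_expr` comes from the thread. The idea is that after `return`, the parser peeks at the next token; if it cannot start an expression, the `return` is treated as bare and whatever follows (such as `as ()` or `== ()`) applies to the `return` expression itself.

```rust
// Toy model only; every type and function here is an assumption for illustration.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Return,
    As,
    EqEq,
    Semi,
    Ident(String),
    Literal(i64),
}

impl Token {
    /// Whether this token may appear at the start of an expression.
    fn can_begin_expr(&self) -> bool {
        match self {
            Token::Ident(_) | Token::Literal(_) | Token::Return => true,
            // `as`, binary operators, and terminators cannot start an expression.
            Token::As | Token::EqEq | Token::Semi => false,
        }
    }
}

/// What the token after `return` tells the parser.
#[derive(Debug, PartialEq)]
enum ReturnKind {
    Bare,
    WithOperand,
}

fn classify_return(next: Option<&Token>) -> ReturnKind {
    match next {
        Some(tok) if tok.can_begin_expr() => ReturnKind::WithOperand,
        _ => ReturnKind::Bare,
    }
}

fn main() {
    // `return as ()` and `return == ()`: the return is bare.
    assert_eq!(classify_return(Some(&Token::As)), ReturnKind::Bare);
    assert_eq!(classify_return(Some(&Token::EqEq)), ReturnKind::Bare);
    assert_eq!(classify_return(Some(&Token::Semi)), ReturnKind::Bare);
    // `return x` and `return 1`: the return carries a value.
    let x = Token::Ident("x".to_string());
    assert_eq!(classify_return(Some(&x)), ReturnKind::WithOperand);
    assert_eq!(classify_return(Some(&Token::Literal(1))), ReturnKind::WithOperand);
}
```

In this model, treating `as` as a token that can begin an expression would make `return as ()` try to parse `as` as the operand, which matches the "expected identifier, found keyword `as`" error quoted in the original report.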

cramertj · 2017-01-25T21:51:06Z

I went ahead and opened #39303 fixing can_begin_expr for the as case. Let me know if I should fix it for any other keywords.

Moving forward, what's left to do for this issue? Are all the rest just parser-lalr fixes? I don't think the lambda issue can be fixed backward-compatibly, can it? (Since we'd be stopping something from parsing that used to parse.)

Edit: wound up making a new PR handling the remaining can_begin_expr cases on @petrochenkov's advice.

nikomatsakis · 2017-01-27T15:49:32Z

@cramertj a good question. One of my long-standing "to do" items has been to pick up work on rustypop -- in particular adding a test harness -- as a replacement for parser-lalr. When I was porting the LALR grammar, I found a number of irregularities that struck me as wrong -- often holdovers from the early days of Rust -- as well as various things whose meaning was not obvious (e.g., precedence tricks). The LALRPOP port is free of those problems.

Separately, I think we have put off reaching a firm decision on a lot of these questions. This needs some organization and I think no one has had the time.

I would love to find someone who would be interested in collaborating with me on one or both aspects of this project. If you are interested, please ping me on irc (nmatsakis) or drop me an e-mail (nmatsakis@mozilla.com).

cramertj · 2017-01-27T20:38:56Z

@nikomatsakis Email'd you.


colinmarsh19 · 2017-10-08T06:59:43Z

This seems like an interesting fix. I'll take a look at it and see what I can figure out. -- Colin

jonas-schievink · 2020-02-20T20:13:10Z

parser-lalr has been removed in #64896. Closing in favor of the work done by the grammar working group in https://github.com/rust-lang/wg-grammar/.

steveklabnik added the A-lang label Oct 4, 2015

nikomatsakis added the A-parser Area: The parsing of Rust source code to an AST. label Oct 6, 2015

rust-highfive added the P-medium Medium priority label Oct 6, 2015

nikomatsakis added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Oct 6, 2015

brson added P-low Low priority E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. and removed P-medium Medium priority labels Jul 14, 2016

nikomatsakis mentioned this issue Nov 4, 2016

Implement the loop_break_value feature. #37487

Merged

cramertj mentioned this issue Jan 25, 2017

Fix parsing for casting an empty return expression #39303

Closed

cramertj mentioned this issue Jan 27, 2017

Fix can_begin_expr keyword behavior #39335

Merged

alexcrichton added a commit to alexcrichton/rust that referenced this issue Jan 28, 2017

Rollup merge of rust-lang#39335 - cramertj:cramertj/can_begin_expr_fi…

915242a

…x, r=petrochenkov Fix can_begin_expr keyword behavior Partial fix for rust-lang#28784.

steveklabnik added T-lang Relevant to the language team, which will review and decide on the PR/issue. and removed A-lang labels Mar 24, 2017

Mark-Simulacrum added the C-bug Category: This is a bug. label Jul 24, 2017

Manishearth removed the hacktoberfest label Sep 28, 2017

jonas-schievink closed this as completed Feb 20, 2020
