Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Amend RFC 550 with (expanded) abstract specification rather than algorithm #1384

Merged
merged 6 commits into from
Dec 17, 2015

Conversation

pnkfelix
Copy link
Member

(tracking issue: rust-lang/rust#30450)

Massive redo of Detailed Design of RFC #550; expanded parts of spec in passing.

As noted in the Edit History addition:

  • replaced detailed design with a specification-oriented presentation rather than an implementation-oriented algorithm.
  • fixed some oversights in the specification that led to matchers like $e:expr { stuff } being accepted (which match fragments like break { stuff }, significantly limiting future language extensions),
  • expanded the follows sets for ty to include OpenDelim(Brace), Ident(where), Or (since Rust's grammar already requires all of |foo:TY| {}, fn foo() -> TY {} and fn foo() -> TY where {} to work).
  • expanded the follow set for pat to include Or (since Rust's grammar already requires match (true,false) { PAT | PAT => {} } and |PAT| {} to work). See also RFC issue Allow pat macro matcher be followed by | #1336.

Not noted in Edit History addition:

  • expanded/revised terminology section to fit new detailed design
  • added "examples of valid and invalid matchers" subsection, that uses the
    specification from detailed design to explain why each is valid/invalid.
  • rewrote the algorithm to actually implement the (new) specification, and
    moved the discussion of the algorithm to a non-binding appendix.

diff of amendment (split view)

As noted in the Edit History addition:

  * replaced detailed design with a specification-oriented presentation rather than an implementation-oriented algorithm.

  * fixed some oversights in the specification (that led to matchers like `break { stuff }` being accepted),

  * expanded the follows sets for `ty` to include `OpenDelim(Brace), Ident(where), Or` (since Rust's grammar already requires all of `|foo:TY| {}`, `fn foo() -> TY {}` and `fn foo() -> TY where {}` to work).

  * expanded the follow set for `pat` to include `Or` (since Rust's grammar already requires `match (true,false) { PAT | PAT => {} }` and `|PAT| {}` to work). See also [RFC issue 1336][].

Not noted in Edit History addition:

  * expanded/revised terminology section to fit new detailed design

  * added "examples of valid and invalid matchers" subsection, that uses the
    specification from detailed design to explain why each is valid/invalid.

  * rewrote the algorithm to actually implement the (new) specification, and
    moved the discussion of the algorithm to a non-binding appendix.
@pnkfelix
Copy link
Member Author

I forgot to mention it in amendment as currently written, but: This change is strongly motivated by rust-lang/rust#25658

(that is, this change, along with an implementation I am prototyping, is meant to fix that issue).

One thing I have not addressed at all here is whether the change described here would go through warning cycle. I am not sure whether I can easily do that (though in principle I guess I could just run both analyses, and set the new one to warn for the initial release cycle).

@pnkfelix
Copy link
Member Author

cc @cmr @rust-lang/lang

@pnkfelix
Copy link
Member Author

One thing that we may want, in order to land this in practice, is a fragment specifier for the subclass of expressions that corresponds to expressions that can be used as a test in an if; i.e., expressions that cannot be followed by a { token.

The reason that we may want such a fragment specifier is because a crater run with my prototype signaled that some crates in the wild are relying on the ability to put { after an $t:expr NT, which was not supposed to be allowed under the original intent of this RFC.

Adding a new fragment specifier that those crates could use instead of expr would solve this; maybe call it $t:pred, for "predicate" ? Or if you prefer, $t:moelarry (which is of course naturally followed by "curly") 😉.

@emberian
Copy link
Member

This is excellent work @pnkfelix. :shipit:

@emberian
Copy link
Member

I found a typo, but I can't find it again...

```

Examples of FOLLOW (expressed as equality relations between sets, to avoid
incoporating details of FOLLOW(NT) in these examples):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here it is! s/inco/incor/

@pnkfelix
Copy link
Member Author

pnkfelix commented Dec 1, 2015

So hopefully I'll have a chance soon to incorporate this properly into the RFC, but two new things:

  • I prototyped a way to warning cycle this. (I think I wrote this idea somewhere else.) Just run both analyses: if the new one accepts, then accept; if they both reject, then reject; and if the old accepts but the new rejects, then issue a warning (and note that it will be promoted to a hard error).
  • I did a crater run on my most recent implementation, with the warning above promoted to an error so that we actually see regressions reported). I think this latest prototype follows the specification given here.
    • Update: removed links to erroneous crater run, which is superceded by more recent run linked in comment below.

@pnkfelix
Copy link
Member Author

pnkfelix commented Dec 2, 2015

Did a new crater run (the previous one was bad because I was comparing against an old baseline, and so the set of crates had changed in the meantime). Very promising results.

analysis here

Summary:

  • 10 total root regressions reported; 1 is probably a false positive, leaving 9 legitimate root regressions.
  • 3 regressions are due to expr is followed by { (which I addressed in an earlier comment).
  • 2 regressions show expr followed by ..., which is just a bad macro (see In macros, $($x:expr),* fragments can be used to bypass future-proofing restrictions rust#25658 ); i.e. we expect to break this.
  • 2 regressions show expr or ty followed by semi-randomly selected identifiers (functions, varargs, copy, swap, perm, drop). This is again just a bad macro
  • 2 regressions show ty followed by [. (In theory we could add [ to FOLLOW(ty), but its not clear if its well motivated.)
  • 1 regression due to expr followed by _. (In theory we could add _ to FOLLOW(expr), but its not clear if its well motivated.)
  • (The bullets add up to more than 9 because one of the regressions, dlib-0.2.0, falls into two of the buckets above.)

As a fun aside, the prototype does fix one crate (though I suspect its not a terribly vital one): https://github.com/arthurtw/echo-rs/blob/master/src/lib.rs#L37


The current legal fragment specifiers are: `item`, `block`, `stmt`, `pat`,
`expr`, `ty`, `ident`, `path`, `meta`, and `tt`.

- `FOLLOW(pat)` = `{FatArrow, Comma, Eq}`
- `FOLLOW(pat)` = `{FatArrow, Comma, Eq, Or, Ident(if), Ident(in)}`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not obvious to me if we want to add | here, #1336 (comment) .

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so how would that work in the presence of closures, i.e. let f = |VariantA(x, _)|VariantB(_, x)| { ... }; ? Seems hard to parse to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(though I guess we would just not allow the generalized pattern form in that context?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

((which is guess is basically what you said in the comment you linked...))

@nikomatsakis
Copy link
Contributor

Hear ye, hear ye! This RFC amendment is now entering final comment period.

@nikomatsakis nikomatsakis added the final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. label Dec 4, 2015
@SimonSapin
Copy link
Contributor

For those like me coming from the subteams report and lacking context: RFC 550 is Macro future-proofing.

@pnkfelix
Copy link
Member Author

@nikomatsakis @huonw so, what do you two think: Should I take | back out of FOLLOW(pat), and let us resolve that issue separately (it is already tracked on #1336) ? How high are the chances of us wanting to extent pat to support a | b as a pat, while being unwilling to force the user to write it as (a | b) instead?

My personal inclination is to add | to FOLLOW(pat)`, and use the "require surrounding parens" as the solution for future extension.

@nikomatsakis
Copy link
Contributor

@pnkfelix I say add it

@pnkfelix
Copy link
Member Author

lang team discussed, and decided RFC amendment is good as-is (i.e., it is okay that it is adding | to the FOLLOW set for pat).

pnkfelix added a commit that referenced this pull request Dec 17, 2015
Amend RFC 550 with (expanded) abstract specification rather than algorithm
@pnkfelix pnkfelix merged commit 3a16a14 into rust-lang:master Dec 17, 2015
@pnkfelix
Copy link
Member Author

and also durka42 on irc asked about the $e:moelarry idea; the short answer there is that we need not couple that extension with this RFC. We can put in the stricter rules that we had intended to have all along, and orthogonally add extensions like that to the classes of macro_rules fragments.

pnkfelix added a commit to pnkfelix/rust that referenced this pull request Jan 7, 2016
…s validity.

See RFC amendment 1384 and tracking issue 30450:
  rust-lang/rfcs#1384
  rust-lang#30450

Moved old check_matcher code into check_matcher_old

combined the two checks to enable a warning cycle (where we will
continue to error if the two checks agree to reject, accept if the new
check says accept, and warn if the old check accepts but the new check
rejects).
pnkfelix added a commit to pnkfelix/rust that referenced this pull request Jan 7, 2016
@SimonSapin
Copy link
Contributor

This is making invalid a macro that I’ve been using for about a year, and I don’t know how to fix it. I’d appreciate some help: https://users.rust-lang.org/t/macro-rules-used-like-a-match-expression/4328

@durka
Copy link
Contributor

durka commented Jan 18, 2016

I replied in the Discourse thread. It's not pretty :)

On Mon, Jan 18, 2016 at 12:50 PM, Simon Sapin notifications@github.com
wrote:

This is making invalid a macro that I’ve been using for about a year, and
I don’t know how to fix it. I’d appreciate some help:
https://users.rust-lang.org/t/macro-rules-used-like-a-match-expression/4328


Reply to this email directly or view it on GitHub
#1384 (comment).

SimonSapin added a commit to servo/rust-cssparser that referenced this pull request Jan 19, 2016
Change the usage syntax to require a comma on the last non-fallback arm,
which is a [breaking-change].

The old definition has started to emit a warning in recent nightlies
and is likely to be an error in future versions:
rust-lang/rfcs#1384

The new definition is recursive to resolve ambiguities.
Unfortunately this makes error reporting terrible:
rust-lang/rust#31022
bors-servo pushed a commit to servo/rust-cssparser that referenced this pull request Jan 19, 2016
Make match_ignore_ascii_case! future-proof.

Change the usage syntax to require a comma on the last non-fallback arm, which is a [breaking-change].

The old definition has started to emit a warning in recent nightlies and is likely to be an error in future versions: rust-lang/rfcs#1384

The new definition is recursive to resolve ambiguities. Unfortunately this makes error reporting terrible: rust-lang/rust#31022

<!-- Reviewable:start -->
[<img src="https://reviewable.io/review_button.png" height=40 alt="Review on Reviewable"/>](https://reviewable.io/reviews/servo/rust-cssparser/91)
<!-- Reviewable:end -->
@Centril Centril added A-macros Macro related proposals and issues A-syntax Syntax related proposals & ideas labels Nov 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-macros Macro related proposals and issues A-syntax Syntax related proposals & ideas final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants