New RFC: proc-macro-attribute-recursion #2628

Open
wants to merge 1 commit into
base: master

llogiq
Contributor

@llogiq llogiq commented Jan 23, 2019

This breaks out a small part of #2320 that is simple to implement and reason about and will give us a simple workable solution to macro expansion in proc macros while we wait for the more complete solution to emerge.

Rendered

@llogiq llogiq force-pushed the proc-macro-attribute-recursion branch from 8b017e6 to 91ac6d3 Compare January 23, 2019 11:39
@llogiq llogiq force-pushed the proc-macro-attribute-recursion branch from ae7e822 to 665feb4 Compare January 23, 2019 16:11
@ExpHP

ExpHP commented Jan 23, 2019

So.... what currently happens if a proc_macro_attribute adds another proc_macro_attribute attribute to the output during expansion?

@llogiq
Contributor Author

llogiq commented Jan 24, 2019

It just gets ignored.

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

The expander is extended to search the expansion of `proc_macro` and `proc_macro_attributes` for other macro invocations. Those are then expanded until there are no more attributes or macro invocations left or the macro expansion limit is reached, whichever comes first.
Contributor

This whole section (and the RFC in general) seems rather underspecified; I'd like to see examples of proc_macro added to the previous section at least. This also doesn't seem like a full description of behavior since #[flame] gets applied to macro expansions. This also doesn't say what happens if you write #[flame(alpha, beta)]. This should be specified including with examples.

Contributor Author

I didn't detail what #[flame(alpha, beta)] does (currently it should give you an error), because this RFC does not change this part of functionality. As for bang-macros, it doesn't matter if they are defined via macro, macro_rules! or as a proc macro. Again the functionality of expansion is unchanged.

The only two things this RFC defines: That the output of proc_macro_attributes get expanded and that the order of expansion follows the order of appearance (so that things from the original code get expanded before things added by the expansion).

Contributor

Well, from what I can tell, the reference says nothing about how #[flame] gets attached to this_is_fun.

That the output of proc_macro_attributes get expanded and that the order of expansion follows the order of appearance (so that things from the original code get expanded before things added by the expansion).

That's not what the text says; it says "The expander is extended to search the expansion of proc_macro and proc_macro_attributes for other macro invocations." -- you've added examples for the latter but not the former.

Implementors will have to make sure to order the expansions within expanded output by their origin: macros which are in the `proc_macro_attribute`s' input need to be expanded before expanding macros that have been added by the `proc_macro_attribute`s themselves. This can easily be done by examining the `Span`s of the expansion and ordering them by `SyntaxContext`.

# Drawbacks
[drawbacks]: #drawbacks
Contributor

The behavior of attaching #[flame] to expansion of macros is, as far as I can see, theoretically a breaking change if attaching the attribute has an effect on static or dynamic semantics of the expansion. I'm surprised that this behavior is not opt-in. It seems the proc macro author should request this behavior.

Contributor Author

I agree that in theory this is a breaking change. I would however be surprised if anyone relied on the current behavior, as it doesn't do anything useful. As I commented on #2320, this was the thing I tried because it felt natural and just might have worked – so in effect by adding the attribute, I am opting in to the expansion. What else would a proc macro author expect when adding an attribute that is expanded by a proc macro?

Contributor

I would however be surprised if anyone relied on the current behavior, as it doesn't do anything useful.

We should at least crater run it and the theoretical breakage should be added to the text with a reasoning about why you would be surprised and why it doesn't do anything useful.

As I commented on #2320, this was the thing I tried because it felt natural and just might have worked – so in effect by adding the attribute, I am opting in to the expansion. What else would a proc macro author expect when adding an attribute that is expanded by a proc macro?

It could either expand, as per your RFC, to:

mod fun {
    #[flame]
    fn this_is_fun(x: u64) -> u64 { x + 1 }
}

or alternatively:

mod fun {
    fn this_is_fun(x: u64) -> u64 { x + 1 } // This is what I'd expect.
}

This might make a world of difference, depending on whether #[flame] transforms this_is_fun or not.

Contributor Author

OK, I added some text to this effect. Please be aware that this is a very language-lawyerly perspective (not that there's anything wrong with that) – as a proc macro author I certainly wouldn't write code to add an attribute that does nothing. In fact, I wrote this RFC because this was what I tried to get macros expanded in flamer and I don't want to wait for a more general solution that may take far more effort.

Anyway, a crater run certainly won't hurt.

@Centril Centril added T-lang Relevant to the language team, which will review and decide on the RFC. A-macros Macro related proposals and issues A-proc-macros Proc macro related proposals & ideas labels Jan 24, 2019
@llogiq llogiq force-pushed the proc-macro-attribute-recursion branch from 5555c94 to a08b6cd Compare January 24, 2019 16:12
@petrochenkov
Contributor

For things to work as described three somewhat independent changes are needed:

  • Change expansion order for attributes.
    Currently attributes are expanded in straightforward left-to-right order.
    With the new expansion order each attribute would get an associated "parent expansion" and attributes would be expanded in the order of (parent_expansion, left_to_right) tuple. One possible issue is that expansions are not totally ordered in general, but perhaps it's not an issue for attributes on a single item? Not sure.
  • Change the rules for inert attributes on macro items/expressions/etc.
    Currently inert attributes (and macro attributes for which expansion is delayed in the new expansion order are effectively inert for some time) attached to a macro invocation are lost (kind of cfg-ed out) when the macro is expanded.
    Instead they need to be spread across the expansion results (cloned if necessary), like #[attr] gen_items!() -> #[attr] item1, #[attr] item2, ....
    Note how this is a fundamentally AST-based operation - the macro can't "just" produce tokens (like in the TokenStream model), the tokens must be immediately interpreted as AST fragments to complete expansion of #[attr] gen_items!() and "distribute" the attributes.
  • Include macro invocations into the attribute expansion order.
    Currently #[attr1] ... #[attrN] ITEM means "expand attributes in some order, then proceed to the resulting item(s)". Perhaps the item is a macro; in this case it will be expanded as well, after all the attributes.
    Instead the ITEM needs to be included as N + 1th element into the expansion order, get its "parent expansion", and be ordered and expanded using the same rule as its attributes, i.e. (parent_expansion, left_to_right) mentioned above.
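
As a rough illustration of the (parent_expansion, left_to_right) ordering described in the list above, here is a toy sketch. The `PendingExpansion` type, its fields and the helper functions are hypothetical and are not rustc data structures. The sketch also assumes that parent expansions can be numbered and totally ordered, which the comment itself flags as uncertain in general.

```rust
// Toy model only: hypothetical types, not rustc's internal representation.
#[derive(Debug, Clone)]
struct PendingExpansion {
    name: &'static str,
    /// 0 = written in the original source; n > 0 = produced by the n-th expansion.
    /// (Assumption: parent expansions can be numbered and totally ordered.)
    parent_expansion: u32,
    /// Left-to-right position on the item; the item itself takes the last slot.
    position: u32,
}

/// Orders pending expansions by the (parent_expansion, left_to_right) tuple.
fn expansion_order(mut pending: Vec<PendingExpansion>) -> Vec<&'static str> {
    pending.sort_by_key(|p| (p.parent_expansion, p.position));
    pending.into_iter().map(|p| p.name).collect()
}

/// Example: `#[attr1] gen_items!()`, where expanding `attr1` added a new attribute.
fn demo_pending() -> Vec<PendingExpansion> {
    vec![
        PendingExpansion { name: "added_by_attr1", parent_expansion: 1, position: 0 },
        PendingExpansion { name: "attr1", parent_expansion: 0, position: 0 },
        PendingExpansion { name: "gen_items!", parent_expansion: 0, position: 1 },
    ]
}

fn main() {
    // Everything from the original source sorts before the attribute
    // that was introduced by an expansion.
    println!("{:?}", expansion_order(demo_pending()));
}
```

In this model the macro invocation `gen_items!` is simply the N + 1th element of the same ordering, which is the third change listed above.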

TLDR: This is probably not a small and simple change, and it may have implications for the TokenStream model.

"Small and simple change" would be to provide fn fully_expand_item(invocation: TokenStream) -> Result<TokenStream, SomeErrorType> delegating to the current ad hoc eager expansion mechanism in the compiler used for expanding arguments of built-in macros like env!(...).
It would certainly not be stabilize-able as is, and I'm not sure how convenient and usable at all it would be for proc macro authors in practice, but it would be at least something to get experience with, because right now proc macro authors cannot expand their input at all.

@llogiq
Contributor Author

llogiq commented Jan 26, 2019

Good point; while this RFC is conceptually simple, the implementation has some subtle details.

  • I don't specify the expansion order beyond requiring that parent expansions be expanded before recursively expanding their result because it is not necessary. Implementing this in a straightforward front-to-back order and keeping a queue of things that need expansion in the future should provide a sufficiently simple implementation of the expansion ordering.
  • It is true that currently inert attributes are removed, however I disagree that changing this is hard. Attributes are just AST nodes, and we can already clone those when necessary. We may choose to implement a MacroExpansion AST node that holds the attributes of the macro expansion to avoid having to clone the attributes, but this is an implementation detail I don't think should be specified as part of the RFC.
  • Your third point means that we unify macro expansion – where we currently have multiple expansion phases for bang- and attribute macros, those are now interleaved. While I agree with this, every possible solution (such as your fully_expand_item function) shares this trait. I still may want to add some text detailing this change, though.
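
The front-to-back queue mentioned in the first bullet can be sketched roughly as follows. This is a toy model under stated assumptions: macros are plain table entries mapping a name to the names their output contains, and `expansion_trace`, `demo_macros` and the limit handling are invented for illustration. None of this is compiler API.

```rust
use std::collections::{HashMap, VecDeque};

/// Walks `input` front to back and returns the order in which macros get expanded.
/// Invocations produced by an expansion are queued behind everything that is
/// already waiting, so macros from the original code are expanded first.
fn expansion_trace(
    input: &[&str],
    macros: &HashMap<&str, Vec<&str>>,
    limit: usize,
) -> Result<Vec<String>, String> {
    let mut queue: VecDeque<String> = input.iter().map(|t| t.to_string()).collect();
    let mut trace = Vec::new();
    while let Some(name) = queue.pop_front() {
        // Names that are not in the table are ordinary tokens and need no expansion.
        if let Some(body) = macros.get(name.as_str()) {
            if trace.len() == limit {
                return Err(format!("expansion limit of {} reached at `{}`", limit, name));
            }
            queue.extend(body.iter().map(|t| t.to_string()));
            trace.push(name);
        }
    }
    Ok(trace)
}

/// Hypothetical macro table: `a!` emits a plain token and an invocation of `c!`,
/// `r!` re-emits itself forever.
fn demo_macros() -> HashMap<&'static str, Vec<&'static str>> {
    HashMap::from([
        ("a!", vec!["x", "c!"]),
        ("b!", vec![]),
        ("c!", vec![]),
        ("r!", vec!["r!"]),
    ])
}

fn main() {
    let macros = demo_macros();
    // `b!` comes from the original input, so it is expanded before `c!`,
    // which only exists because `a!` produced it.
    println!("{:?}", expansion_trace(&["a!", "b!"], &macros, 16));
    // Unbounded recursion stops at the expansion limit.
    println!("{:?}", expansion_trace(&["r!"], &macros, 3));
}
```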

Also your proposal is incomplete – we'd need multiple functions, for macros in item, expr, pat and ty positions (and I'm not sure if I forgot one). Given this, it is far from simple, too. While I agree that it would indeed be somewhat usable, I'd rather not create such a strong dependency on resolve from expansion. Better to keep the API boundary small.

@petrochenkov
Contributor

petrochenkov commented Jan 26, 2019

@llogiq
Regarding

we'd need multiple functions, for macros in item, expr, pat and ty positions

expand_item is a necessary minimum because you can wrap everything else into an item and then unwrap it; this is what #2320 suggests too.

(The proper solution should probably be a position agnostic TokenStream -> TokenStream function, but the current compiler machinery to which my minimal fully_expand_item is supposed to delegate to hasn't fully migrated from AST to token streams yet and still needs the "item" part.)

@llogiq
Contributor Author

llogiq commented Jan 27, 2019

True. Fair enough, if I get to pull this trick, you may too. 😄 I still maintain that this would create an API burden that we will never want to stabilize, whereas my proposal, even if more complex, would be completely forwards-compatible.

Apart from that, I outlined how, given a support library, this can be quite usable from proc_macros, too. I imagine that we might extend proc_macro_rules to do argument expansion automatically.

@Centril Centril self-assigned this Jan 31, 2019
@aturon
Member

aturon commented Jan 31, 2019

cc @nrc

@llogiq
Contributor Author

llogiq commented Feb 13, 2019

cc @matklad and @jonathandturner who might be reimplementing macro expansion for their respective IDE support projects. What do you think? Would you prefer a magic function that lets proc macros call back into resolve+macro expansion or this recursive method?

@matklad
Member

matklad commented Feb 13, 2019

Heh, as an IDE writer, I would prefer neither of those options :-)

The "magic function" approach frightens me, because macro expansion is no longer a pure TokenStream -> TokenStream function, but depends on the global compiler state (for example, on which names are defined in the scope). That I think breaks caching of macro expansions.

This suggestion seems more amenable to caching, probably, but it seems to change how expansion works pretty significantly. The "copy-paste attributes on macros onto expansions" and "the order of expansion of #[attr] m!() depends on how we got here" parts seem like a big change.

Could we instead tag macro definitions with #[needs_eager_expansion], which guarantees that the input to this macro will be valid Rust code, free from macro invocations? This is probably covered in some comment somewhere, but a quick glance to the alternatives section hasn't answered the question.

@llogiq
Contributor Author

llogiq commented Feb 13, 2019

@matklad so this tag would also apply to proc_macro_attributes? That would make my suggestion from above feasible: we could use a proc_macro_attribute to have 'inner' macro invocations expanded (or perhaps we can do this in general, but there may be some problems with expansion order).

Unless there is some snag I'm overlooking, I'm totally for it!

@llogiq
Copy link
Contributor Author

llogiq commented Feb 13, 2019

@matklad thinking a bit more about it, wouldn't that be functionally equivalent to my proposal? What would happen if I create a proc_macro that returns a macro by example invocation to be eagerly expanded that will force-expand the innards and call-back a proc-macro?

@matklad
Member

matklad commented Feb 14, 2019

I have a fuzzy understanding of proc macros, so I don't really know what exactly either proposal means, but I think "eager expand all inputs" is different on the implementation side in that it

  • doesn't require changing the ordering of macro expansion
  • does not need "macro attribute on macro invocation" thing

From the macro author POV, I think this is less flexible: you can't selectively process some macro invocations as token trees and others as expanded code, it's all or nothing.

@llogiq
Contributor Author

llogiq commented Feb 14, 2019

True, that is a difference. However, for my use cases your suggestion works, too. @petrochenkov what do you think? Should we close this RFC and set up another? @matklad do you have time to write up a new RFC or should I do it?

@matklad
Member

matklad commented Feb 14, 2019

I have neither the time nor the required knowledge here :-)

@llogiq
Contributor Author

llogiq commented Feb 16, 2019

@petrochenkov you know more about the current macro expansion implementation – how complex would you rate @matklad's suggestion?

@joshtriplett
Member

We talked about this in today's @rust-lang/lang meeting. We're generally in favor of the idea, if it turns out to be reasonably feasible and performant to implement in the compiler; that's where our primary concerns are with this, because that expertise isn't widely available. For that reason, we'd be willing to approve a project group for this, if that project group were willing to help see this through to completion.

@Diggsey
Contributor

Diggsey commented Sep 4, 2020

So, I actually meant to suggest this here, but I accidentally commented on the original RFC instead...

Here's my idea:

  1. Macros are categorised into those that receive their input post-expansion, and those that (like all current macros) don't.
  2. The procedural macro attribute gains the ability to specify the category, eg. #[proc_macro(expand_input)] vs just #[proc_macro]
  3. In the rare case that a macro wants to expand parts of its input (eg. maybe it accepts syntax that's not valid Rust) a recursive solution similar to what's proposed in this RFC can be used:
    • The user defines two macros, one of which is marked as "expands input"
    • The other macro expands to the generated code, plus some calls to the first macro when the user wants to do some additional work post-expansion.

IMO, this is simpler to understand, and the recursion part does not require any special-casing - it's simply an expected by-product of having two categories of macros. Rustdoc could even automatically document whether a particular macro is pre-expanded or not.

@jgarvin

jgarvin commented Sep 4, 2020

Reading through the long comment chain here it's hard for me to tell -- does requiring the input to be expanded first mean that any macro invocations that could not be expanded in the input because a definition was not available cause an error? The idea being that if it weren't an error, the macro taking the expanded-as-far-as-possible input could then come up with its own substitutions for the remaining invocations. This would make it possible to write macros that expand differently depending on the context in which they are invoked.

For example, I have been fiddling around with a port of the C++ catch unit test library to Rust. In the library there is a macro called #[catch_section] that should really only have a meaning when used inside a function that has been annotated with #[catch_test_case]. Currently there is no good way for me to enforce this. But if I could make it so that #[catch_test_case] eagerly expanded its input, then I could just not provide a normal definition of #[catch_section], and instead manually look for it in the expanded input and do the intended substitution. Then #[catch_section] would only work inside functions annotated with #[catch_test_case]. I could try to do something like this now without the input being eagerly expanded, but then if a user defines a macro that expands to contain a #[catch_section], my #[catch_test_case] implementation would fail to find it, because currently it sees the unexpanded input.

@matu3ba

matu3ba commented Sep 4, 2020

So, I actually meant to suggest this here, but I accidentally commented on the original RFC instead...

Here's my idea:

1. Macros are categorized into those that receive their input post-expansion, and those that (like all current macros) don't.

What do you mean by input? Variables? Code that is constant during compile time?
Or do you want to go full-blown by expressing the macro expansion order as tree logic by annotating what parts are "compile-time-constant" and which are not?

2. The procedural macro attribute gains the ability to specify the category, eg. `#[proc_macro(expand_input)]` vs just `#[proc_macro]`

Bikeshed: Same reasoning. Input not specified.

3. In the rare case that a macro wants to expand _parts_ of its input (eg. maybe it accepts syntax that's not valid Rust) a recursive solution similar to what's proposed in this RFC can be used:
   
   * The user defines two macros, one of which is marked as "expands input"
   * The other macro expands to the generated code, plus some calls to the first macro when the user wants to do some additional work post-expansion.

What type of additional work? Macro number computations (macroeval) and comparison on input for filtering?

@Diggsey
Contributor

Diggsey commented Sep 4, 2020

What do you mean by input? Variables? Code that is constant during compile time?

Macros are a compile-time construct that take a TokenStream as input and produce a TokenStream as output.

Or do you want to go full-blown by expressing the macro expansion order as tree logic by annotating what parts are "compile-time-constant" and which are not?

I'm not sure what you mean by this. I'm proposing a simple opt-in to pre-expand macros in the input TokenStream.

For example, let's say we have a procedural macro foo!, a macro unexpanded!() that expands to the function expanded() and the following code is compiled:

foo!(unexpanded!())

Today, foo! will be passed the TokenStream unexpanded!() as its input. What I'm proposing, is that if foo! is declared with #[proc_macro(expand_input)] it will instead be passed expanded() as its input.
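
The current behaviour described above can be observed without a proc-macro crate. The sketch below is only a stand-in: it uses `macro_rules!` and `stringify!` in place of a real procedural macro `foo!`, and the helper `received_by_foo` is invented for the demonstration. Declarative macros receive their input unexpanded in the same way.

```rust
fn expanded() -> u32 {
    42
}

// `unexpanded!()` expands to a call to `expanded()`.
macro_rules! unexpanded {
    () => {
        expanded()
    };
}

// Stand-in for the procedural macro `foo!`: it only reports the tokens it was given.
macro_rules! foo {
    ($($input:tt)*) => {
        stringify!($($input)*)
    };
}

/// Returns the input that `foo!` actually sees for `foo!(unexpanded!())`.
fn received_by_foo() -> String {
    // The exact whitespace of `stringify!` output is not guaranteed, so strip it.
    foo!(unexpanded!()).replace(' ', "")
}

fn main() {
    // Today the inner invocation arrives unexpanded.
    println!("foo! received: {}", received_by_foo());
    // Outside of `foo!`, the same invocation expands normally.
    println!("unexpanded!() evaluates to {}", unexpanded!());
}
```

Under the proposed opt-in, the equivalent proc macro would be handed the tokens of `expanded()` instead of `unexpanded!()`.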

Contributor

@petrochenkov petrochenkov left a comment

(Comment to get rid of the pending review request; GitHub doesn't support rejecting review requests in any other way.)
