Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Macros 2.0: Span::def_site() vs Span::call_site() #45934

Closed
alexcrichton opened this issue Nov 11, 2017 · 10 comments
Closed

Macros 2.0: Span::def_site() vs Span::call_site() #45934

alexcrichton opened this issue Nov 11, 2017 · 10 comments
Assignees
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-macros-2.0 Area: Declarative macros 2.0 (#39412) C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@alexcrichton
Copy link
Member

Up until recently I've considered these two function calls in proc_macro, Span::default() and Span::call_site() relatively different. I'm not realizing, however, that they're actually quite significantly different depending on what you're doing in a procedural macro!

In working with the gnome-class macro we've ended up getting a good deal more experience with the procedural macro system. This macro is using dependencies like quote, syn, and proc-macro2 to parse and generate code. The code itself contains a mixture of modules and macro_rules-like macro expansions.

When we tried to enable the unstable feature in proc-macro2, which switches it to use the "real" proc_macro APIs and preserve span information, it turned out everything broke! When digging into this I found that everything we were experiencing was related to the distinction between the default and call_site functions.

So as a bit of background, the gnome_class! macro uses a few methods to manufacture a TokenStream. Primarily it uses the quote! macro from the quote crate, which primarily at this time uses parse for most tokens to generate a TokenTree. Namely part of the quote! macro will call stringify! on each token to get reparsed at runtime and turned into a list of tokens. For delimiters and such the quote crate currently creates a default span.

Additionally both @federicomenaquintero and I were novices at the procedural macro/hygiene/span systems, so we didn't have a lot of info coming in! Now though we think we're a bit more up to speed :). The rest of this issue will be focused on "weird errors" that we had to work backwards to figure out how to solve. This all, to me at least, seems like a blocker for stabilization in the sense that I wouldn't wish this experience on others.

I'm not really sure what the conclusions from this would be though. The behavior below may be bugs in the compiler or bugs in the libraries we're using, I'm not sure! I'll write up some thoughts at the end though. In general though I wanted to just detail all that we went through in discovering this and showing how the current behavior that we have ends up being quite confusing.

Examples of odd errors

In any case, I've created an example project showcasing a number of the "surprises" (some bugs?) that we ran into. I'll try to recatalog them here:

Using parse breaks super

The first bug that was found was related to generating a program that looked like:

mod foo {
    use super::*;
}

It turns out that if you use parse to generate the token super it doesn't work! If you set the span of super to default, however, it does indeed work.

I was pretty surprised by this (and the odd error messages). I'm not really sure why the parse span was not allowing it to resolve, but I imagine it was related to hygiene? I submitted dtolnay/quote#51 which I think might fix this but I wasn't sure if that was the right fix...

Is that the right fix for quote? Should it be using default wherever it can? I originally though that but then ran into...

Using Span::default means you can't import from yourself

This second bug was found relating to the program that looks like:

struct A;
mod foo {
    use super::A;
}

Here we have a failing procedural macro despite the usage of Span::default on all tokens. This means that by default all modules generated via quote!, if we were to switch spans to Span::default, would not be able to import from one another it looks like? But maybe this is only related to super? I'm not quite sure..

It also turns out that this does indeed work if we use Span::call_site by default everywhere. I'm not really sure why, but it apparently works!

Using Span::default means you can't import generated structs

Next up we had a bug related to:

pub struct A;

It turns out that if these tokens are using Span::default this can't actually be used! In this test case you get an error about an unresolved import.

Like with before though if we use call_site as a span everywhere this case does indeed work.

Is this expected? This means, I think, that all tokens with a Default span can't be improted outside of the procedural macro.

Using Span::default means you can't use external crates

Next we took a look at a program like:

use std::mem;

When generating these tokens with Span::default it turns out that this becomes an unresolved import! That is, the usage of Span::default seems like it's putting it in an entirely new namespace without access to even std at the root. Coming from the perspective of not knowing a lot about spans/hygiene I found this a little confusing :)

As with the previous and remaining cases using call_site as a span does indeed get this working.

Naturally the error message was pretty confusing here, but I guess this is expected? Hygiene I think necessitates this? Unsure...

Using Span::default means you can't import from yourself

Next up we have a program like

use foo::*;
mod foo {
}

Here if we use Span::default everywhere this program will not compile with the import becoming unresolved. For us this seemed to imply that if we generated new modules in a macro we basically can't use imports!

As per usual respanning with call_site everywhere fixes this but I'd imagine has different hygiene implications. I'm not sure if this behavior was intended, although it seemed like it may be a bug in rustc?

Using Span::default precludes working with "non hygienic macros"

This is a particularly interesting use case. The gnome_class! procedural macro internally delegates to the glib_wrapper! macro_rules macro in the expanded tokens. The glib_wrapper! macro, however, in its current state does not work in an empty module but rather requires imports like std::ptr in the environment. With Span::default, however, the generated tokens in glib_wrapper! couldn't see the tokens we generated with gnome_class!.

For example if in one crate we have a macro like

#[macro_export]
macro_rules! a {
    ($a:ident) => (
        fn _bar() {
            mem::drop(3);
        }
    )
}

(note that this requires std::mem to be imported to work)

and then we're generating a token stream that looks like:

mod foo {
    extern crate std;
    use self::std::mem;
    a! {}

Note that the extern crate is necessary due to one of the above situations (we can't import from the top-level extern crate). Here though if we generated tokens with Span::default as with many other cases this doesn't work! As usual if we respan with call_site spans then this does indeed work.

Is this a bug? Or maybe more hygiene?

Conclusions

Overall for our use case we found that 100% of the time we should be using Span::call_site instead of Span::default to get things working. Whether or not that's what we wanted hygienically we're not sure! I couldn't really understand the hygiene implications here because tokens using Span::default couldn't import from other modules defined next to it with the default span as well.

Should quote and syn move to using Span::call_site by default instead of Span::default? Or maybe Span::default should be renamed to sound "less default" if it appears to not work most of the time? Or maybe Span::default has bugs that need fixing?

I'm quite curious to hear what others think! Especially those that are particularly familiar with hygiene/macros, I'd love to hear ideas about whether this is expected behavior (and if so if we could maybe improve the error messages) or if we should perhaps be structuring the macro expansion differently. Similarly what would recommendations be for spanning tokens returned by quote! in an external crate? Or syn? (for example if I maufacture an Ident, is there a "good default" for that?)

In any case, curious to hear others' thoughts!


cc @jseyfried
cc @nrc
cc @nikomatsakis
cc @federicomenaquintero
cc @dtolnay
cc @mystor

@alexcrichton alexcrichton added the A-macros-2.0 Area: Declarative macros 2.0 (#39412) label Nov 11, 2017
alexcrichton added a commit to alexcrichton/quote that referenced this issue Nov 11, 2017
@federicomenaquintero
Copy link
Contributor

Without knowing all the implications of hygiene, it sounds like we could use two macros:

quote_unhygienically! { ... }
quote_hygienically! { ... }

The fist/unhygienic one is for "I want to paste this manufactured code into the user's namespace". Gnome-class does a lot of that for the GObject boilerplate. It creates a couple of modules too keep things contained and exports things from them. It creates user-visible structs with manufactured identifiers for names: e.g. the user types "class foo" and we make up a "FooClass" struct type.

The second/hygienic one is for "this is an implementation detail of my macro, wherein I don't want names to clash with the user's". For example, internally gnome-class creates a PRIV static struct as an implementation detail. Overall the generated code looks like

pub mod FooMod {    // synthesized from the "class Foo" that the user typed
    ... public Rust API ...

    pub struct Foo {
        ... generated from the class that the user typed ...
    };

    pub mod imp { // internals, implementation
        ... gobject boilerplate ...

        static mut PRIV: FooClassPrivate = FooClassPrivate {
            ... zeros, later set with ONCE.call_once() ...
        }

        use super::super::*; // bring in the user's namespace, for their implementations

        impl super::Foo { // implement the *public* API here
            pub fn foobarize(&self, ...) {
                ... implementation method pasted from the user's code ... 
            }
        }
    }
}
pub use FooMod::*;

Imagine that Foo::foobarize() is a method that the user specified inside the call to gobject_gen!{}. Further, imagine that the user's implementation happens to have a PRIV identifier. It would be desirable for this identifier not to clash with the internal PRIV from the example code above, i.e. for that bit of gobject_gen's implementation to be hygienic. The parts of the generated code which are intended to be public API would be unhygienic, "I just want this code pasted here".

Probably quote_hygienically! and quote_unhygienically!() are not the nicest names. I'm also not sure how to explain these concepts to implementors of procedural macros. Also, whether their implementations actually use Span::default() or Span::call_site()... well, that's an implementation detail of the macros themselves :)

@jseyfried
Copy link
Contributor

jseyfried commented Nov 15, 2017

@alexcrichton Thanks for the feedback!

Or maybe Span::default should be renamed to sound "less default" if it appears to not work most of the time?

Making it Default was probably a mistake; opened #45999 to rename to Span::def_site.

Span::def_site() is the hygienic "opposite" of Span::call_site() -- the former is a "default" span that resolves at the macro definition, and the latter is a "default" span that resolves at the macro invocation.

Using parse breaks super

For backwards compatibility with macros 1.1, the tokens from parse need to resolve at the macro invocation. Consider the hygienically equivalent declarative macro:

#![feature(decl_macro)]

macro m($i: ident) { // The token that substitutes `$i` resolves at the macro invocation
    mod foo {
        use $i::*;
    }
}

m!(super); // When resolved here, "super" doesn't make sense since the crate root has no parent.

In other words, since the module foo is hygienic (i.e. it is at the def site), it can only be seen when resolving at the macro definition and is irrelevant to the resolution of the super.

If you give the mod foo a span from the macro invocation, it will instead only be relevant when resolving at the macro invocation. Thus, the super from parse will resolve as you expect, and a super with Span::def_site() would resolve to whatever super is at the root of the macro definition.

Using Span::default means you can't import from yourself

This is actually "just" another manifestation of this:

fn f() {
    mod foo {
        pub fn g();
    }
    use ??::foo::g; // How to import `g`?
}

In your example*, super gets resolved at the macro definition. self at the macro definition here would resolve to mod foo, so super resolves to the parent of foo at the macro definition, i.e. proc-macro crate root.

*reproduced for convenience:

struct A;
mod foo { use super::A; }

I have seen fn proposed for import inside a function. Perhaps we could use macro analagously?

pub fn g() {} // (1)
macro m() {
    pub fn g() {} // (2)

    mod bar {
        pub fn g() {} // (3)
        use self::g as g3; // Resolves to (3) today
        use super::g as g1; // Resolves to (1) today
        use macro::g as g2; // Would resolve to (2)
    }
}

// ...

m!(); //^ The above statements are true wherever `m!()` is invoked.

An alternative would be to allow self/super to resolve to a procedural and/or declarative macro
definition, so that use super::g; from the above example would resolve to (2).
The main downside of this is asymmetry with normal fns.

Using Span::default means you can't import generated structs

If a struct has a span at the macro definition, it will only be relevant to things that resolve at the macro definition. For example,

macro m() {
    pub struct A; // (1)
    A; // resolves to (1)
}

fn f() {
    m!();
    struct A; // (2)
    A; // resolves to (2)
}

With declarative macros, a solution is to add an argument:

macro m($i:ident) {
    pub struct $i; // (1)
    $i; // resolves to (1)
    A; //~ ERROR A is not in scope
}

fn f() {
    m!(A);
    A; // resolves to (1)
}

With procedural macros, a solution is to use a span at the macro invocation.

Using Span::default means you can't use external crates

Since the only phase 1 items in a proc-macro crate (i.e. the only items usable when compiling for the target architecture) are proc macros, there aren't any extern crates at the crate root when resolving
at a proc macro definition site.

You can use external crates, but you have to declare them in each macro expansion.

If you want to use an extern crate, you need to declare it in a submodule in each macro expansion, analogous to the following:

// Say I want to use `extern crate foo` in this function without declaring it anywhere outside.
fn f() {
    extern crate foo; // I could declare it here, but I can't `use` it (yet...)
    mod foo {
        extern crate foo; // If I want `use`, I need to declare it in a module.
    }
}

While use macro::*; as discussed earlier ameliorates this somewhat, it's still a serious ergonomics issue.

The solution is to support adding "phase 1" (target architecture) items to proc macro crates.
The first (and maybe only) step here is to allow proc-macro crates to have target dependencies.

Then,

#[phase_1] // With `extern crate` disappearing, we may never introduce this...
extern crate foo; // (1) This declares a target dependency `foo`

#[proc_macro]
fn m(input: TokenStream) -> TokenStream {
    use foo; // This doesn't resolve
    quote! { // n.b. `quote` creates tokens with `Span::def_site()`.
        use foo; // This resolves to (1)
    }
}

In the near-near (day or two) term, we could add an implicit #[phase_1] extern crate std; to every proc-macro crate so that generating use std::... would always work.

Using Span::default precludes working with "non hygienic macros"

Yeah, this is a limitation. I believe the interaction in this case can be improved, but at the cost of some (maybe significant) implementation complexity. If someone is interested in working on this, I can mentor / write notes. Fixed in #46551.

That being said, ideally people would migrate declarative and procedural macros at the same time where possible to minimize macros 1.0/2.0 interaction. While I tried to make the interaction as nice as possible, backwards compatability and various "sanity-check" coherence conditions bound how nice it can be.

Overall for our use case we found that 100% of the time we should be using Span::call_site instead of Span::default to get things working. Whether or not that's what we wanted hygienically we're not sure!

Generally speaking (and summarizing @federicomenaquintero), if a name is part of the "public API" of the macro (i.e. the macro defines the name for the invoker to use, the macro requires the name to be in scope at the invocation, etc.), the span should be at the macro invocation. Otherwise, the span should be at the macro definition.

Should quote and syn move to using Span::call_site by default instead of Span::default?

Ideally quote and syn would let you specify a span so the user could decide.

@federicomenaquintero

Today, quote! is your quote_hygienically!; to quote_unhygienically! { ... }, you need to deep_replace_spans(quote! { ... }, Span::call_site()); (and write deep_replace_spans(TokenStream, Span) -> TokenStream).

IIRC @SergioBenitez has proposed spanned_quote!(span, { ... }) s.t. quote_hygienically! { ... } is spanned_quote!(Span::def_site(), { ... }) and quote_unhygienically! { ... } is spanned_quote!(Span::call_site(), { ... }) (wrt hygiene, at least).

@federicomenaquintero
Copy link
Contributor

Thanks, @jseyfried, this is very enlightening. I like the idea of spanned_quote!(). With that in place, I assume the plain quote!{} would be removed?

Maybe the following is more of a gnome-class issue than something directly for this issue, but anyway:

In gnome-class we are given a name from the user, say Foo, and generate some other identifiers from it:

        let container_name = |suffix: &str| {
            // InstanceName is an Ident; comes from the caller
            let mut i = Ident::from(format!("{}{}", InstanceName.as_ref(), suffix));
            i.span.0 = Span::call_site();
            return i
        };

        let InstanceNameFfi  = container_name("Ffi");
        let ModuleName       = container_name("Mod");
        let ClassName        = container_name("Class");
        let PrivateClassName = container_name("ClassPrivate");
        let InstanceExt      = container_name("Ext");

Then we use the generated InstanceNameFfi, ClassName, etc. throughout the macro. A couple of those are intended to be visible by the user.

However, PrivateClassName is not intended to be visible - that one should be hygienic.

Right now we depend on @alexcrichton's hacked quote crate, which generates Span::call_site() unconditionally. The container_name lambda above explicitly sets the span of the generated identifier to call_site, too. For the names I need to be hygienic, I guess I can have my own function that doesn't set the spans on the synthesized Idents.

If we had spanned_quote!(), which I assume would do your deep_replace_spans(), wouldn't that change the spans of all my synthesized idents, though?

@jseyfried
Copy link
Contributor

jseyfried commented Nov 15, 2017

@federicomenaquintero

With that in place, I assume the plain quote! {} would be removed?

Perhaps, I think it could still be nice to have quote! {} by itself though, so that e.g.

macro m($($input:tt)*) {
    tokens ... $($input)* ...
}
// is equivalent to
#[proc_macro]
fn m(input: TokenStream) -> TokenStream {
    quote! { tokens ... $input ... }
}

If we had spanned_quote!(), which I assume would do your deep_replace_spans(), wouldn't that change the spans of all my synthesized idents, though?

Good question, I forgot to mention this -- the span in spanned_quote! would only apply to quoted tokens; any interpolated/unquoted tokens would retain their spans. This is how quote! behaves today (otherwise, the above equivalence wouldn't hold).

This means we (I) should prioritize adding spanned_quote! -- quote!+deep_replace_spans isn't sufficient.

bors added a commit that referenced this issue Nov 18, 2017
Rename `Span::default` -> `Span::def_site`

I think the explicitness here is warranted.
c.f. #45934
r? @nrc
@alexcrichton
Copy link
Member Author

Thanks so much for the explanation @jseyfried!

I definitely agree that spanned_quote! is probably needed here, but I'd also just like to confirm, do you agree that quote! should exist and use Span::def_site by default?

@jseyfried
Copy link
Contributor

@alexcrichton No prob! Yes, I agree that quote! should exist and use Span::def_site (or something with the same hygiene but better diagnostics) by default.

@alexcrichton alexcrichton added the A-diagnostics Area: Messages for errors, warnings, and lints label Nov 21, 2017
@alexcrichton
Copy link
Member Author

According to @jseyfried's epic comment it sounds like everything here is working as intended. As a result I'm going to classify this as a diagnostics bug as I think we can probably do better in terms of what diagnostics are printed by the compiler in these situations.

In the meantime I've opened dtolnay/quote#55 for the quote crate in the ecosystem so gnome-class can get off a fork :)

@jseyfried
Copy link
Contributor

jseyfried commented Dec 7, 2017

Using Span::default precludes working with "non hygienic macros"

Yeah, this is a limitation. I believe the interaction in this case can be improved, but at the cost of some (maybe significant) implementation complexity.

Fixing this was more straightforward than I thought; implemented in #46551.

@alexcrichton after #46551 lands, the example in your original comment will work.

Generally speaking, if you invoke an unhygienic macro at a hygienic macro's def site (e.g. quote! it into existence), expanded tokens from the unhygienic macro definition will resolve at the hygienic macro's def site (i.e. they can see other quote!ed things in the hygienic macro).

@alexcrichton
Copy link
Member Author

@jseyfried awesome!

@Enselic
Copy link
Member

Enselic commented Sep 24, 2023

Triage: Closing as fixed because

  • it seems fixed based on the last three comments.
  • There is a tracking issue for Span::def_site() now.

@Enselic Enselic closed this as completed Sep 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-macros-2.0 Area: Declarative macros 2.0 (#39412) C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

5 participants