Macros 2.0: Span::def_site() vs Span::call_site() #45934

alexcrichton · 2017-11-11T15:42:31Z

Up until recently I've considered these two function calls in proc_macro, Span::default() and Span::call_site() relatively different. I'm not realizing, however, that they're actually quite significantly different depending on what you're doing in a procedural macro!

In working with the gnome-class macro we've ended up getting a good deal more experience with the procedural macro system. This macro is using dependencies like quote, syn, and proc-macro2 to parse and generate code. The code itself contains a mixture of modules and macro_rules-like macro expansions.

When we tried to enable the unstable feature in proc-macro2, which switches it to use the "real" proc_macro APIs and preserve span information, it turned out everything broke! When digging into this I found that everything we were experiencing was related to the distinction between the default and call_site functions.

So as a bit of background, the gnome_class! macro uses a few methods to manufacture a TokenStream. Primarily it uses the quote! macro from the quote crate, which primarily at this time uses parse for most tokens to generate a TokenTree. Namely part of the quote! macro will call stringify! on each token to get reparsed at runtime and turned into a list of tokens. For delimiters and such the quote crate currently creates a default span.

Additionally both @federicomenaquintero and I were novices at the procedural macro/hygiene/span systems, so we didn't have a lot of info coming in! Now though we think we're a bit more up to speed :). The rest of this issue will be focused on "weird errors" that we had to work backwards to figure out how to solve. This all, to me at least, seems like a blocker for stabilization in the sense that I wouldn't wish this experience on others.

I'm not really sure what the conclusions from this would be though. The behavior below may be bugs in the compiler or bugs in the libraries we're using, I'm not sure! I'll write up some thoughts at the end though. In general though I wanted to just detail all that we went through in discovering this and showing how the current behavior that we have ends up being quite confusing.

Examples of odd errors

In any case, I've created an example project showcasing a number of the "surprises" (some bugs?) that we ran into. I'll try to recatalog them here:

Using `parse` breaks `super`

The first bug that was found was related to generating a program that looked like:

mod foo {
    use super::*;
}

It turns out that if you use parse to generate the token super it doesn't work! If you set the span of super to default, however, it does indeed work.

I was pretty surprised by this (and the odd error messages). I'm not really sure why the parse span was not allowing it to resolve, but I imagine it was related to hygiene? I submitted dtolnay/quote#51 which I think might fix this but I wasn't sure if that was the right fix...

Is that the right fix for quote? Should it be using default wherever it can? I originally though that but then ran into...

Using `Span::default` means you can't import from yourself

This second bug was found relating to the program that looks like:

struct A;
mod foo {
    use super::A;
}

Here we have a failing procedural macro despite the usage of Span::default on all tokens. This means that by default all modules generated via quote!, if we were to switch spans to Span::default, would not be able to import from one another it looks like? But maybe this is only related to super? I'm not quite sure..

It also turns out that this does indeed work if we use Span::call_site by default everywhere. I'm not really sure why, but it apparently works!

Using `Span::default` means you can't import generated structs

Next up we had a bug related to:

pub struct A;

It turns out that if these tokens are using Span::default this can't actually be used! In this test case you get an error about an unresolved import.

Like with before though if we use call_site as a span everywhere this case does indeed work.

Is this expected? This means, I think, that all tokens with a Default span can't be improted outside of the procedural macro.

Using `Span::default` means you can't use external crates

Next we took a look at a program like:

use std::mem;

When generating these tokens with Span::default it turns out that this becomes an unresolved import! That is, the usage of Span::default seems like it's putting it in an entirely new namespace without access to even std at the root. Coming from the perspective of not knowing a lot about spans/hygiene I found this a little confusing :)

As with the previous and remaining cases using call_site as a span does indeed get this working.

Naturally the error message was pretty confusing here, but I guess this is expected? Hygiene I think necessitates this? Unsure...

Using `Span::default` means you can't import from yourself

Next up we have a program like

use foo::*;
mod foo {
}

Here if we use Span::default everywhere this program will not compile with the import becoming unresolved. For us this seemed to imply that if we generated new modules in a macro we basically can't use imports!

As per usual respanning with call_site everywhere fixes this but I'd imagine has different hygiene implications. I'm not sure if this behavior was intended, although it seemed like it may be a bug in rustc?

Using `Span::default` precludes working with "non hygienic macros"

This is a particularly interesting use case. The gnome_class! procedural macro internally delegates to the glib_wrapper! macro_rules macro in the expanded tokens. The glib_wrapper! macro, however, in its current state does not work in an empty module but rather requires imports like std::ptr in the environment. With Span::default, however, the generated tokens in glib_wrapper! couldn't see the tokens we generated with gnome_class!.

For example if in one crate we have a macro like

#[macro_export]
macro_rules! a {
    ($a:ident) => (
        fn _bar() {
            mem::drop(3);
        }
    )
}

(note that this requires std::mem to be imported to work)

and then we're generating a token stream that looks like:

mod foo {
    extern crate std;
    use self::std::mem;
    a! {}

Note that the extern crate is necessary due to one of the above situations (we can't import from the top-level extern crate). Here though if we generated tokens with Span::default as with many other cases this doesn't work! As usual if we respan with call_site spans then this does indeed work.

Is this a bug? Or maybe more hygiene?

Conclusions

Overall for our use case we found that 100% of the time we should be using Span::call_site instead of Span::default to get things working. Whether or not that's what we wanted hygienically we're not sure! I couldn't really understand the hygiene implications here because tokens using Span::default couldn't import from other modules defined next to it with the default span as well.

Should quote and syn move to using Span::call_site by default instead of Span::default? Or maybe Span::default should be renamed to sound "less default" if it appears to not work most of the time? Or maybe Span::default has bugs that need fixing?

I'm quite curious to hear what others think! Especially those that are particularly familiar with hygiene/macros, I'd love to hear ideas about whether this is expected behavior (and if so if we could maybe improve the error messages) or if we should perhaps be structuring the macro expansion differently. Similarly what would recommendations be for spanning tokens returned by quote! in an external crate? Or syn? (for example if I maufacture an Ident, is there a "good default" for that?)

In any case, curious to hear others' thoughts!

cc @jseyfried
cc @nrc
cc @nikomatsakis
cc @federicomenaquintero
cc @dtolnay
cc @mystor

The text was updated successfully, but these errors were encountered:

cc rust-lang/rust#45934

federicomenaquintero · 2017-11-11T16:47:58Z

Without knowing all the implications of hygiene, it sounds like we could use two macros:

quote_unhygienically! { ... }
quote_hygienically! { ... }

The fist/unhygienic one is for "I want to paste this manufactured code into the user's namespace". Gnome-class does a lot of that for the GObject boilerplate. It creates a couple of modules too keep things contained and exports things from them. It creates user-visible structs with manufactured identifiers for names: e.g. the user types "class foo" and we make up a "FooClass" struct type.

The second/hygienic one is for "this is an implementation detail of my macro, wherein I don't want names to clash with the user's". For example, internally gnome-class creates a PRIV static struct as an implementation detail. Overall the generated code looks like

pub mod FooMod {    // synthesized from the "class Foo" that the user typed
    ... public Rust API ...

    pub struct Foo {
        ... generated from the class that the user typed ...
    };

    pub mod imp { // internals, implementation
        ... gobject boilerplate ...

        static mut PRIV: FooClassPrivate = FooClassPrivate {
            ... zeros, later set with ONCE.call_once() ...
        }

        use super::super::*; // bring in the user's namespace, for their implementations

        impl super::Foo { // implement the *public* API here
            pub fn foobarize(&self, ...) {
                ... implementation method pasted from the user's code ... 
            }
        }
    }
}
pub use FooMod::*;

Imagine that Foo::foobarize() is a method that the user specified inside the call to gobject_gen!{}. Further, imagine that the user's implementation happens to have a PRIV identifier. It would be desirable for this identifier not to clash with the internal PRIV from the example code above, i.e. for that bit of gobject_gen's implementation to be hygienic. The parts of the generated code which are intended to be public API would be unhygienic, "I just want this code pasted here".

Probably quote_hygienically! and quote_unhygienically!() are not the nicest names. I'm also not sure how to explain these concepts to implementors of procedural macros. Also, whether their implementations actually use Span::default() or Span::call_site()... well, that's an implementation detail of the macros themselves :)

jseyfried · 2017-11-15T06:23:08Z

@alexcrichton Thanks for the feedback!

Or maybe Span::default should be renamed to sound "less default" if it appears to not work most of the time?

Making it Default was probably a mistake; opened #45999 to rename to Span::def_site.

Span::def_site() is the hygienic "opposite" of Span::call_site() -- the former is a "default" span that resolves at the macro definition, and the latter is a "default" span that resolves at the macro invocation.

Using parse breaks super

For backwards compatibility with macros 1.1, the tokens from parse need to resolve at the macro invocation. Consider the hygienically equivalent declarative macro:

#![feature(decl_macro)]

macro m($i: ident) { // The token that substitutes `$i` resolves at the macro invocation
    mod foo {
        use $i::*;
    }
}

m!(super); // When resolved here, "super" doesn't make sense since the crate root has no parent.

In other words, since the module foo is hygienic (i.e. it is at the def site), it can only be seen when resolving at the macro definition and is irrelevant to the resolution of the super.

If you give the mod foo a span from the macro invocation, it will instead only be relevant when resolving at the macro invocation. Thus, the super from parse will resolve as you expect, and a super with Span::def_site() would resolve to whatever super is at the root of the macro definition.

Using Span::default means you can't import from yourself

This is actually "just" another manifestation of this:

fn f() {
    mod foo {
        pub fn g();
    }
    use ??::foo::g; // How to import `g`?
}

In your example*, super gets resolved at the macro definition. self at the macro definition here would resolve to mod foo, so super resolves to the parent of foo at the macro definition, i.e. proc-macro crate root.

*reproduced for convenience:

struct A;
mod foo { use super::A; }

I have seen fn proposed for import inside a function. Perhaps we could use macro analagously?

pub fn g() {} // (1)
macro m() {
    pub fn g() {} // (2)

    mod bar {
        pub fn g() {} // (3)
        use self::g as g3; // Resolves to (3) today
        use super::g as g1; // Resolves to (1) today
        use macro::g as g2; // Would resolve to (2)
    }
}

// ...

m!(); //^ The above statements are true wherever `m!()` is invoked.

An alternative would be to allow self/super to resolve to a procedural and/or declarative macro
definition, so that use super::g; from the above example would resolve to (2).
The main downside of this is asymmetry with normal fns.

Using Span::default means you can't import generated structs

If a struct has a span at the macro definition, it will only be relevant to things that resolve at the macro definition. For example,

macro m() {
    pub struct A; // (1)
    A; // resolves to (1)
}

fn f() {
    m!();
    struct A; // (2)
    A; // resolves to (2)
}

With declarative macros, a solution is to add an argument:

macro m($i:ident) {
    pub struct $i; // (1)
    $i; // resolves to (1)
    A; //~ ERROR A is not in scope
}

fn f() {
    m!(A);
    A; // resolves to (1)
}

With procedural macros, a solution is to use a span at the macro invocation.

Using Span::default means you can't use external crates

Since the only phase 1 items in a proc-macro crate (i.e. the only items usable when compiling for the target architecture) are proc macros, there aren't any extern crates at the crate root when resolving
at a proc macro definition site.

You can use external crates, but you have to declare them in each macro expansion.

If you want to use an extern crate, you need to declare it in a submodule in each macro expansion, analogous to the following:

// Say I want to use `extern crate foo` in this function without declaring it anywhere outside.
fn f() {
    extern crate foo; // I could declare it here, but I can't `use` it (yet...)
    mod foo {
        extern crate foo; // If I want `use`, I need to declare it in a module.
    }
}

While use macro::*; as discussed earlier ameliorates this somewhat, it's still a serious ergonomics issue.

The solution is to support adding "phase 1" (target architecture) items to proc macro crates.
The first (and maybe only) step here is to allow proc-macro crates to have target dependencies.

Then,

#[phase_1] // With `extern crate` disappearing, we may never introduce this...
extern crate foo; // (1) This declares a target dependency `foo`

#[proc_macro]
fn m(input: TokenStream) -> TokenStream {
    use foo; // This doesn't resolve
    quote! { // n.b. `quote` creates tokens with `Span::def_site()`.
        use foo; // This resolves to (1)
    }
}

In the near-near (day or two) term, we could add an implicit #[phase_1] extern crate std; to every proc-macro crate so that generating use std::... would always work.

Using Span::default precludes working with "non hygienic macros"

Yeah, this is a limitation. ~~I believe the interaction in this case can be improved, but at the cost of some (maybe significant) implementation complexity. If someone is interested in working on this, I can mentor / write notes.~~ Fixed in #46551.

That being said, ideally people would migrate declarative and procedural macros at the same time where possible to minimize macros 1.0/2.0 interaction. While I tried to make the interaction as nice as possible, backwards compatability and various "sanity-check" coherence conditions bound how nice it can be.

Overall for our use case we found that 100% of the time we should be using Span::call_site instead of Span::default to get things working. Whether or not that's what we wanted hygienically we're not sure!

Generally speaking (and summarizing @federicomenaquintero), if a name is part of the "public API" of the macro (i.e. the macro defines the name for the invoker to use, the macro requires the name to be in scope at the invocation, etc.), the span should be at the macro invocation. Otherwise, the span should be at the macro definition.

Should quote and syn move to using Span::call_site by default instead of Span::default?

Ideally quote and syn would let you specify a span so the user could decide.

@federicomenaquintero

Today, quote! is your quote_hygienically!; to quote_unhygienically! { ... }, you need to deep_replace_spans(quote! { ... }, Span::call_site()); (and write deep_replace_spans(TokenStream, Span) -> TokenStream).

IIRC @SergioBenitez has proposed spanned_quote!(span, { ... }) s.t. quote_hygienically! { ... } is spanned_quote!(Span::def_site(), { ... }) and quote_unhygienically! { ... } is spanned_quote!(Span::call_site(), { ... }) (wrt hygiene, at least).

federicomenaquintero · 2017-11-15T17:03:26Z

Thanks, @jseyfried, this is very enlightening. I like the idea of spanned_quote!(). With that in place, I assume the plain quote!{} would be removed?

Maybe the following is more of a gnome-class issue than something directly for this issue, but anyway:

In gnome-class we are given a name from the user, say Foo, and generate some other identifiers from it:

        let container_name = |suffix: &str| {
            // InstanceName is an Ident; comes from the caller
            let mut i = Ident::from(format!("{}{}", InstanceName.as_ref(), suffix));
            i.span.0 = Span::call_site();
            return i
        };

        let InstanceNameFfi  = container_name("Ffi");
        let ModuleName       = container_name("Mod");
        let ClassName        = container_name("Class");
        let PrivateClassName = container_name("ClassPrivate");
        let InstanceExt      = container_name("Ext");

Then we use the generated InstanceNameFfi, ClassName, etc. throughout the macro. A couple of those are intended to be visible by the user.

However, PrivateClassName is not intended to be visible - that one should be hygienic.

Right now we depend on @alexcrichton's hacked quote crate, which generates Span::call_site() unconditionally. The container_name lambda above explicitly sets the span of the generated identifier to call_site, too. For the names I need to be hygienic, I guess I can have my own function that doesn't set the spans on the synthesized Idents.

If we had spanned_quote!(), which I assume would do your deep_replace_spans(), wouldn't that change the spans of all my synthesized idents, though?

jseyfried · 2017-11-15T20:56:02Z

@federicomenaquintero

With that in place, I assume the plain quote! {} would be removed?

Perhaps, I think it could still be nice to have quote! {} by itself though, so that e.g.

macro m($($input:tt)*) {
    tokens ... $($input)* ...
}
// is equivalent to
#[proc_macro]
fn m(input: TokenStream) -> TokenStream {
    quote! { tokens ... $input ... }
}

If we had spanned_quote!(), which I assume would do your deep_replace_spans(), wouldn't that change the spans of all my synthesized idents, though?

Good question, I forgot to mention this -- the span in spanned_quote! would only apply to quoted tokens; any interpolated/unquoted tokens would retain their spans. This is how quote! behaves today (otherwise, the above equivalence wouldn't hold).

This means we (I) should prioritize adding spanned_quote! -- quote!+deep_replace_spans isn't sufficient.

@nrc

Rename `Span::default` -> `Span::def_site` I think the explicitness here is warranted. c.f. #45934 r? @nrc

alexcrichton · 2017-11-20T14:55:42Z

Thanks so much for the explanation @jseyfried!

I definitely agree that spanned_quote! is probably needed here, but I'd also just like to confirm, do you agree that quote! should exist and use Span::def_site by default?

jseyfried · 2017-11-20T20:51:09Z

@alexcrichton No prob! Yes, I agree that quote! should exist and use Span::def_site (or something with the same hygiene but better diagnostics) by default.

alexcrichton · 2017-11-21T15:39:08Z

According to @jseyfried's epic comment it sounds like everything here is working as intended. As a result I'm going to classify this as a diagnostics bug as I think we can probably do better in terms of what diagnostics are printed by the compiler in these situations.

In the meantime I've opened dtolnay/quote#55 for the quote crate in the ecosystem so gnome-class can get off a fork :)

jseyfried · 2017-12-07T05:23:00Z

Using Span::default precludes working with "non hygienic macros"

Yeah, this is a limitation. I believe the interaction in this case can be improved, but at the cost of some (maybe significant) implementation complexity.

Fixing this was more straightforward than I thought; implemented in #46551.

@alexcrichton after #46551 lands, the example in your original comment will work.

Generally speaking, if you invoke an unhygienic macro at a hygienic macro's def site (e.g. quote! it into existence), expanded tokens from the unhygienic macro definition will resolve at the hygienic macro's def site (i.e. they can see other quote!ed things in the hygienic macro).

alexcrichton · 2017-12-07T14:58:13Z

@jseyfried awesome!

Enselic · 2023-09-24T17:20:27Z

Triage: Closing as fixed because

it seems fixed based on the last three comments.
There is a tracking issue for Span::def_site() now.

alexcrichton added the A-macros-2.0 Area: Declarative macros 2.0 (#39412) label Nov 11, 2017

alexcrichton mentioned this issue Nov 11, 2017

Ensure all quote! tokens have the default Span dtolnay/quote#51

Closed

alexcrichton added a commit to alexcrichton/quote that referenced this issue Nov 11, 2017

Always generate tokens with the call_site span

a0bc6fe

cc rust-lang/rust#45934

jseyfried self-assigned this Nov 11, 2017

jseyfried mentioned this issue Nov 15, 2017

Rename Span::default -> Span::def_site #45999

Merged

bors added a commit that referenced this issue Nov 18, 2017

Auto merge of #45999 - jseyfried:def_site_span, r=nrc

859c716

Rename `Span::default` -> `Span::def_site` I think the explicitness here is warranted. c.f. #45934 r? @nrc

alexcrichton mentioned this issue Nov 21, 2017

Add a quote_spanned! macro dtolnay/quote#55

Merged

alexcrichton added the A-diagnostics Area: Messages for errors, warnings, and lints label Nov 21, 2017

alexcrichton mentioned this issue Nov 21, 2017

Fix hygiene in various macros rust-lang/stdarch#204

Merged

jseyfried mentioned this issue Dec 17, 2017

decl_macro: imports don't work in cross-crate invocations #42337

Closed

jseyfried mentioned this issue Jan 2, 2018

Tracking issue for RFC 1566: Procedural macros #38356

Closed

31 tasks

This was referenced Jan 3, 2018

Span::resolved_at and Span::located_at to combine behavior of two spans #47149

Merged

Remove impl Default for Span dtolnay/proc-macro2#48

Closed

rustc can no longer resolve my imports inside quote!{}. dtolnay/syn#303

Closed

dtolnay changed the title ~~Macros 2.0: Span::default() vs Span::call_site()~~ Macros 2.0: Span::def_site() vs Span::call_site() Jan 16, 2018

dtolnay mentioned this issue Jan 16, 2018

Using fn params in generated code dtolnay/syn#331

Closed

jkordish added C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jan 25, 2018

federicomenaquintero mentioned this issue Jan 26, 2018

Start macro expansion chapter rust-lang/rustc-dev-guide#26

Merged

ngg mentioned this issue Mar 7, 2018

"nightly" feature of proc-macro2 is not working tokio-rs/prost#88

Closed

alexcrichton mentioned this issue Mar 7, 2018

Use proc_macro2::Span::call_site for all quotes TeXitoi/structopt#77

Merged

alexcrichton mentioned this issue Oct 1, 2018

Tracking issue for Span::def_site() #54724

Open

Enselic closed this as completed Sep 24, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Macros 2.0: Span::def_site() vs Span::call_site() #45934

Macros 2.0: Span::def_site() vs Span::call_site() #45934

alexcrichton commented Nov 11, 2017

federicomenaquintero commented Nov 11, 2017

jseyfried commented Nov 15, 2017 •

edited

Loading

federicomenaquintero commented Nov 15, 2017

jseyfried commented Nov 15, 2017 •

edited

Loading

alexcrichton commented Nov 20, 2017

jseyfried commented Nov 20, 2017

alexcrichton commented Nov 21, 2017

jseyfried commented Dec 7, 2017 •

edited

Loading

alexcrichton commented Dec 7, 2017

Enselic commented Sep 24, 2023

Macros 2.0: Span::def_site() vs Span::call_site() #45934

Macros 2.0: Span::def_site() vs Span::call_site() #45934

Comments

alexcrichton commented Nov 11, 2017

Examples of odd errors

Using parse breaks super

Using Span::default means you can't import from yourself

Using Span::default means you can't import generated structs

Using Span::default means you can't use external crates

Using Span::default means you can't import from yourself

Using Span::default precludes working with "non hygienic macros"

Conclusions

federicomenaquintero commented Nov 11, 2017

jseyfried commented Nov 15, 2017 • edited Loading

federicomenaquintero commented Nov 15, 2017

jseyfried commented Nov 15, 2017 • edited Loading

alexcrichton commented Nov 20, 2017

jseyfried commented Nov 20, 2017

alexcrichton commented Nov 21, 2017

jseyfried commented Dec 7, 2017 • edited Loading

alexcrichton commented Dec 7, 2017

Enselic commented Sep 24, 2023

Using `parse` breaks `super`

Using `Span::default` means you can't import from yourself

Using `Span::default` means you can't import generated structs

Using `Span::default` means you can't use external crates

Using `Span::default` means you can't import from yourself

Using `Span::default` precludes working with "non hygienic macros"

jseyfried commented Nov 15, 2017 •

edited

Loading

jseyfried commented Nov 15, 2017 •

edited

Loading

jseyfried commented Dec 7, 2017 •

edited

Loading