Tracking issue for proc_macro::Span inspection APIs #54725

Open
alexcrichton opened this issue Oct 1, 2018 · 115 comments
Labels
A-macros Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..) A-proc-macros Area: Procedural macros B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. Libs-Tracked Libs issues that are tracked on the team's project board. S-tracking-design-concerns Status: There are blocking ❌ design concerns. T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@alexcrichton
Member

alexcrichton commented Oct 1, 2018

This issue is intended to track a number of unstable APIs which are used to inspect the contents of a Span for information like the file name, byte position, manufacturing new spans, combining them, etc.

This issue tracks the proc_macro_span unstable feature.

Public API

Already stabilized:

impl Span {
    pub fn source_text(&self) -> Option<String>;
}

impl Group {
    pub fn span_open(&self) -> Span;
    pub fn span_close(&self) -> Span;
}

To be stabilized, probably in their current form:

impl Span {
    pub fn line(&self) -> usize;
    pub fn column(&self) -> usize;

    pub fn start(&self) -> Span;
    pub fn end(&self) -> Span;
}

To be stabilized after some (re)design or discussion:

impl Span {
    pub fn source_file(&self) -> SourceFile;

    pub fn byte_range(&self) -> Range<usize>;
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct SourceFile { .. }

impl !Send for SourceFile {}
impl !Sync for SourceFile {}

impl SourceFile {
    pub fn path(&self) -> PathBuf;
    pub fn is_real(&self) -> bool;
}

Things that require more discussion:

impl Span {
    pub fn eq(&self, other: &Span) -> bool;
    pub fn join(&self, other: Span) -> Option<Span>;
    pub fn parent(&self) -> Option<Span>;
    pub fn source(&self) -> Span;
}

impl Literal {
    pub fn subspan<R: RangeBounds<usize>>(&self, range: R) -> Option<Span>;
}
@alexcrichton alexcrichton added A-macros Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..) T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. labels Oct 1, 2018
alexcrichton added a commit to alexcrichton/rust that referenced this issue Oct 1, 2018
@softprops

I wanted to shed some light on the pain this is causing with tarpaulin.

Tarpaulin has worked amazingly well for me and my company as a replacement for kcov, which has historically been a pain to get accurate, reliable and correct results from. At the moment tarpaulin stands as the most promising go-to option for code coverage in Rust, and it feels like the only first-class tooling option.

Having one of those when choosing to invest in a technology is important for many companies' adoption stories, for checking off code quality checkmarks. When they see that Rust doesn't have a reasonable code quality story that works on stable Rust, that can result in a "pass" rather than an "I'll take it". There are currently some workarounds for making this work-ish on stable, but it feels very much like the situation serde was in a year or so ago, when you wanted to show all your friends how amazing serde was but were embarrassed to show what it took to make it work on stable because of a macro stabilization blocker.

@JakeTherston

With procedural macros having reached a point where they're very useful on stable, I expect many users will find themselves needing access to this information. Would it be reasonable to only stabilize parts of the Span API that are not too risky? Perhaps exposing a function that optionally returns the path of the file where a macro is invoked if such a file exists?

kevinmehall added a commit to kevinmehall/rust-peg that referenced this issue Nov 3, 2018
This unstable feature (rust-lang/rust#54725) is the last thing that we
require in Nightly. Removing it will cause a significant regression in
error messages, but that can be improved after switching to parsing the
grammar as tokens rather than as a string literal.
@matklad
Member

matklad commented Feb 1, 2019

I have a concern about exposing LineColumn information. It looks like it could play badly with incremental compilation, especially in the IDE context.

My understanding is that if one adds a blank line to the start of the file, the line_column information of the input spans passed to a proc macro changes. That means an IDE would have to re-expand procedural macros even after insignificant whitespace changes.

I would feel more confident if proc macros were strictly a pure function from the input token stream to the output token stream. This can be achieved, for example, by making line-column information relative to the start of the macro invocation (as opposed to relative to the start of the file).

I don't feel that exposing absolute positions is necessarily the end of the world: an IDE can track whether a macro actually makes use of the info, or it can supply the macro with fake line/column values. But it's hard to tell whether such hacks would work well in practice, and it's hard to experiment with them for lack of an IDE that handles proc macros incrementally.
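
To make the concern concrete, here is a minimal standalone sketch of the difference between absolute and invocation-relative positions. It does not use the real proc_macro API; `LineColumn`, `line_column` and `relative_to` are hypothetical names invented for this illustration, and the 1-based line / 0-based column convention is an assumption.

```rust
/// A 1-based line and 0-based column (an assumed convention for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct LineColumn {
    line: usize,
    column: usize,
}

/// Absolute line/column of byte `offset` within `src` (column counted in chars).
fn line_column(src: &str, offset: usize) -> LineColumn {
    let before = &src[..offset];
    let line = before.matches('\n').count() + 1;
    let column = match before.rfind('\n') {
        Some(nl) => before[nl + 1..].chars().count(),
        None => before.chars().count(),
    };
    LineColumn { line, column }
}

/// Position of `token` relative to the start of the invocation at `call`.
/// The line becomes an offset from the invocation's first line; the column is
/// made relative only when the token sits on that first line.
fn relative_to(call: LineColumn, token: LineColumn) -> LineColumn {
    let line = token.line - call.line;
    let column = if line == 0 {
        token.column - call.column
    } else {
        token.column
    };
    LineColumn { line, column }
}
```

Prepending a blank line to the file changes what `line_column` reports for every token, while `relative_to` gives the same answer before and after the edit, so a macro that only sees relative positions would not need to be re-expanded.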

@davidlattimore
Contributor

If the parser allocated IDs to every AST node, then, and this is the hard part, when an edit was made to the source, the parser tried to keep those IDs the same in all the non-edited code and only allocate new IDs for new nodes, that would allow spans to be kept completely separate from the AST. Those IDs could be passed through macro expansion without causing unnecessary invalidations. If something needed a span later on, it could then go back and ask the parser for the span for that particular AST node ID. I feel like having an incremental parser is important, not because parsing is the bottleneck, but because it underpins everything else.
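
A rough sketch of the idea, under heavy simplification: spans live in a side table keyed by node ID, and an edit updates the table rather than the IDs. `NodeId`, `SpanTable` and `apply_insertion` are hypothetical names for this illustration only; nothing here reflects how rustc or rust-analyzer actually store spans.

```rust
use std::collections::HashMap;
use std::ops::Range;

/// Hypothetical stable identifier for an AST node.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct NodeId(u32);

/// Side table owned by the parser: the only place byte ranges are stored.
#[derive(Default)]
struct SpanTable {
    spans: HashMap<NodeId, Range<usize>>,
}

impl SpanTable {
    fn record(&mut self, id: NodeId, span: Range<usize>) {
        self.spans.insert(id, span);
    }

    /// Look a span up lazily, only when something (e.g. a diagnostic) needs it.
    fn span_of(&self, id: NodeId) -> Option<Range<usize>> {
        self.spans.get(&id).cloned()
    }

    /// Account for `len` bytes inserted at byte `at`. IDs are untouched, so
    /// anything keyed by `NodeId` alone is not invalidated by the edit.
    fn apply_insertion(&mut self, at: usize, len: usize) {
        for span in self.spans.values_mut() {
            if span.start >= at {
                span.start += len;
                span.end += len;
            } else if span.end > at {
                span.end += len;
            }
        }
    }
}
```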

@matklad
Member

matklad commented Feb 2, 2019

@davidlattimore this is fascinating, but slightly off-topic for the issue. I've created a thread on internals: https://internals.rust-lang.org/t/macros-vs-incremental-parsing/9323

@est31
Member

est31 commented May 30, 2019

The column!() macro as well as std::panic::Location::column are returning 1-based columns while the span available from the proc-macro crate is 0-based according to its docs. Is this inconsistency intended?
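
For illustration, a small sketch of the mismatch: `column!()` and `std::panic::Location::column` report 1-based columns, so a 0-based column from a span would need an adjustment before the two can be compared. `to_one_based` and `caller_column` are hypothetical helpers written for this example.

```rust
use std::panic::Location;

/// Convert a 0-based column (as the proc-macro span docs describe it in the
/// comment above) to the 1-based convention used by `column!()`.
fn to_one_based(column0: usize) -> usize {
    column0 + 1
}

/// Column of the call site as reported by `std::panic::Location` (1-based).
#[track_caller]
fn caller_column() -> u32 {
    Location::caller().column()
}
```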

@est31
Member

est31 commented May 30, 2019

This thread has more discussion about 1-based columns: #46762 (comment)

@est31
Member

est31 commented May 30, 2019

Another open question is how this API relates to #47389 which is about minimizing span information throughout the compiler. Should stabilization be blocked until a design for #47389 is found? Is it too late already as we have set_span functionality? @michaelwoerister what do you think?

@michaelwoerister
Member

#47389 is mostly concerned about data types that are used later in the compilation pipeline, such as type information and MIR. Exposing things at the proc-macro level should not be too bad.

@est31
Member

est31 commented Jun 1, 2019

But rust-analyzer might one day expand the scope of incremental compilation to the parsing stage, right?

@m-ou-se
Member

m-ou-se commented Jun 21, 2023

IMO the source_file() API could be as simple as fn path(&self) -> Option<Path>

It's dead simple and gives us the ecosystem win everyone is looking for, and is probably the least contentious version of this

Would that give the mapped or unmapped path? See #54725 (comment)

@Qix-

Qix- commented Jun 21, 2023

@dtolnay and I briefly discussed this moments ago, and we think that SourceFile should have both a method for the mapped path and for the actual on-disk path.

  • virtual_path(&self) -> PathBuf
  • real_path(&self) -> PathBuf
  • #[deprecated] path(&self) -> PathBuf { self.virtual_path() }

These would make sense to at least me. Having more methods for this sort of thing is always better than weird semantics that require documentation to understand.
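
A minimal sketch of how the two accessors could differ, assuming the mapped path comes from a prefix remapping in the spirit of rustc's `--remap-path-prefix`. The `SourceFile` below is a hypothetical stand-in, not the real `proc_macro::SourceFile`, and its fields are invented for the example.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in for `proc_macro::SourceFile`; not the real type.
struct SourceFile {
    /// Path as it exists on the build machine's disk.
    on_disk: PathBuf,
    /// Optional `(from, to)` prefix remapping (an assumption of this sketch).
    remap: Option<(PathBuf, PathBuf)>,
}

impl SourceFile {
    /// Path after remapping, i.e. what diagnostics would display.
    fn virtual_path(&self) -> PathBuf {
        if let Some((from, to)) = &self.remap {
            if let Ok(rest) = self.on_disk.strip_prefix(from) {
                return to.join(rest);
            }
        }
        self.on_disk.clone()
    }

    /// Path that can actually be opened on the build machine.
    fn real_path(&self) -> PathBuf {
        self.on_disk.clone()
    }
}
```

With separate methods, a macro that wants to read the file calls `real_path`, and one that only wants to print a location calls `virtual_path`, so neither has to guess which kind of path it was given.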

@sam0x17

sam0x17 commented Jun 21, 2023

The two methods proposal seems reasonable 💯

I think we should decide that one of them is source_path and one of them has a longer name. My assumption would be source_path would be a real file path that you can read, and mapped_source_path would be the other one.

I don't care at all about the naming, though. Let's just stabilize something that gives us all access to these goodies.

TaKO8Ki added a commit to TaKO8Ki/rust that referenced this issue Jun 27, 2023
Implement proposed API for `proc_macro_span`

As proposed in [rust-lang#54725 (comment)](rust-lang#54725 (comment)). I have omitted the byte-level API as it's already available as [`Span::byte_range`](https://doc.rust-lang.org/nightly/proc_macro/struct.Span.html#method.byte_range).

`@rustbot` label +A-proc-macros

r? `@m-ou-se`
@Caellian

Caellian commented Oct 13, 2023

Source path

worried that exposing source_file before rust-lang/rfcs#3200 will lead to more subtly broken proc-macros

I don't see relative manual imports breaking, but even if they do I don't think postponing stabilization of a mostly agreed upon feature over an RFC that hasn't been accepted yet over fear that people might use it wrong is warranted.

  • #[deprecated] path(&self) -> PathBuf { self.virtual_path() }

The path method isn't stable yet, so deprecation isn't needed (it can just be replaced). A bare path function name is ambiguous if it can have two meanings. Though it's possible to determine the manifest-relative path using CARGO_MANIFEST_DIR.

mapped_source_path would be the other one.

I don't think that's clear enough. Anyway, to push things along I've created a POLL where people can pick multiple names they think convey the meaning of the description above them and I hope we'll have an agreement on naming by the end of the year.

If is_real is false, I'd expect the method for the project-manifest-relative path to return None.

Other comments

eq: Two spans are equal if their byte ranges are equal and their absolute system paths are equal. Tokens shouldn't use spans for equality though as that would provide better composability (can be additionally checked if needed). Why isn't this impl PartialEq + Eq?

join: I don't see any signature other than fn join(&self, other: &Self) -> Option<Self> making sense - if they don't overlap it's None, otherwise minmax of both.

This issue: It is super old, there are over 100 comments already, and it's really hard to catch up on the discussion at this point. I can't see any drawbacks in the "To be stabilized, probably in their current form" section, and it's been a long time since the last comment.
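
The eq and join semantics described above can be sketched on a simplified model. `ModelSpan` is a hypothetical type (a file plus a byte range), not the real `proc_macro::Span`, and returning `None` for spans in different files is an extra assumption of this sketch.

```rust
use std::ops::Range;
use std::path::PathBuf;

/// Hypothetical span model: equality is "same file and same byte range",
/// which is what the derived `PartialEq`/`Eq` give here.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ModelSpan {
    file: PathBuf,
    bytes: Range<usize>,
}

impl ModelSpan {
    /// `None` if the spans are in different files or do not overlap;
    /// otherwise the span from the smaller start to the larger end.
    fn join(&self, other: &Self) -> Option<Self> {
        if self.file != other.file {
            return None;
        }
        let overlaps =
            self.bytes.start < other.bytes.end && other.bytes.start < self.bytes.end;
        if !overlaps {
            return None;
        }
        Some(ModelSpan {
            file: self.file.clone(),
            bytes: self.bytes.start.min(other.bytes.start)
                ..self.bytes.end.max(other.bytes.end),
        })
    }
}
```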

Use cases

I mostly plan on doing evil things with these:

  • Partial/fuzzy type inference and transformation.
  • Generating Deref-ed repr(C) struct for composed repr(C) structs so that Deref allows access to all nested fields (e.g. SVG spec Attribute bundles).

@Nemo157
Member

Nemo157 commented Oct 13, 2023

I don't see relative manual imports breaking

They are already broken, if you change the file they read without changing any source code cargo doesn't notice that and won't rebuild the crate. They are also relying on a limitation of the current proc-macro implementation in rustc: it does not cache the output as part of the incremental cache. Once caching is implemented they should not be rerun even when the source changes if that doesn't affect their input.
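
One commonly used mitigation, sketched here as an assumption rather than something proposed in this thread, is for the macro to emit an `include_bytes!` of the file it read, so that the compiler records the file as an input of the crate. `tracking_item` is a hypothetical helper that builds the source text of such an item; a real macro would emit it as tokens, and the path should be absolute.

```rust
use std::path::Path;

/// Build the source text of an item a macro could append to its expansion so
/// that the compiler treats `path` as an input of the crate being compiled.
fn tracking_item(path: &Path) -> Option<String> {
    // `include_bytes!` takes a string literal, so require valid UTF-8.
    let path = path.to_str()?;
    // `{:?}` on a `str` produces a quoted, escaped literal.
    Some(format!("const _: &[u8] = include_bytes!({:?});", path))
}
```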

@menasheofd

Hello! I'm interested in writing a proc macro that utilizes the feature of obtaining the caller's file location. Is there any new update on this issue?

@sam0x17

sam0x17 commented Mar 12, 2024

Hello! I'm interested in writing a proc macro that utilizes the feature of obtaining the caller's file location. Is there any new update on this issue?

The capabilities remain the same on stable. There are some very cursed patterns you can use if there is a unique string you know will only appear in one file in the crate, like my very cursed macro-gpt does here: https://github.com/sam0x17/macro-gpt/blob/main/src/lib.rs#L239-L272

In practice this works great, but it is intended more as an IDE-extension-living-in-a-proc-macro than a real proc macro you would keep in your source code. In fact it is designed so that when you hit CTRL+S in your editor, the invocation disappears and is replaced with the expansion from gpt.
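
The general shape of that pattern can be sketched as follows. This is not taken from the linked code; `files_containing` and `locate_invocation` are hypothetical helpers that scan a source directory for a marker string and accept the result only when exactly one file matches.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect every `.rs` file under `dir` whose contents contain `marker`.
fn files_containing(dir: &Path, marker: &str) -> io::Result<Vec<PathBuf>> {
    let mut hits = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            hits.extend(files_containing(&path, marker)?);
        } else if path.extension().map_or(false, |ext| ext == "rs") {
            // Skip files that cannot be read as UTF-8 instead of failing.
            if let Ok(text) = fs::read_to_string(&path) {
                if text.contains(marker) {
                    hits.push(path);
                }
            }
        }
    }
    Ok(hits)
}

/// The file holding the invocation, but only when the marker is unambiguous.
fn locate_invocation(src_dir: &Path, marker: &str) -> io::Result<Option<PathBuf>> {
    let mut hits = files_containing(src_dir, marker)?;
    Ok(if hits.len() == 1 { hits.pop() } else { None })
}
```

Returning `None` when the marker appears in zero or several files keeps the macro from silently picking the wrong file.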

@dbsxdbsx

dbsxdbsx commented Apr 10, 2024

I came here because I need to write a procedural macro that can get the path of the file in which the macro is called. ChatGPT gave me code like this:

// Nightly only: requires #![feature(proc_macro_span)] at the crate root.
use std::path::PathBuf;

#[proc_macro]
pub fn example_macro(input: proc_macro::TokenStream) -> proc_macro::TokenStream {
    // Get the `Span` where the macro was called
    let span = proc_macro::Span::call_site();

    // Use the `Span` to get the `SourceFile`
    let source_file = span.source_file();

    // Get the path of the `SourceFile`
    let path: PathBuf = source_file.path();

    // Now you can use the file path for further processing
    // ...

    // Return the original input for this example
    input
}

Then the docs for fn source_file pointed me here. Am I in the right place? If so, since the API is unstable, is there a workaround?

@wyatt-herkamp
Contributor

wyatt-herkamp commented Apr 10, 2024

I came here because I need to write a procedural macro that can get the path of the file in which the macro is called. ChatGPT gave me code like this:

// Nightly only: requires #![feature(proc_macro_span)] at the crate root.
use std::path::PathBuf;

#[proc_macro]
pub fn example_macro(input: proc_macro::TokenStream) -> proc_macro::TokenStream {
    // Get the `Span` where the macro was called
    let span = proc_macro::Span::call_site();

    // Use the `Span` to get the `SourceFile`
    let source_file = span.source_file();

    // Get the path of the `SourceFile`
    let path: PathBuf = source_file.path();

    // Now you can use the file path for further processing
    // ...

    // Return the original input for this example
    input
}

Then the docs for fn source_file pointed me here. Am I in the right place? If so, since the API is unstable, is there a workaround?

source_file is unstable. Yes, a few workarounds exist, such as what @sam0x17 described in #54725 (comment).
However, if you need something a little less sketchy, I would just have your macro take the source file location as an input in its attributes. It would be less likely to break.

You could also look at
https://docs.rs/proc-macro2/latest/proc_macro2/#unstable-features
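
A minimal sketch of the "pass the location in" approach, assuming the user gives a path relative to the crate root and the macro resolves it against `CARGO_MANIFEST_DIR`, which Cargo sets while compiling the invoking crate. `resolve_against` and `resolve_in_crate` are hypothetical helper names.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Resolve a user-supplied path against `base`; absolute paths are kept as-is.
fn resolve_against(base: &Path, user_path: &str) -> PathBuf {
    let user_path = Path::new(user_path);
    if user_path.is_absolute() {
        user_path.to_path_buf()
    } else {
        base.join(user_path)
    }
}

/// Resolve against `CARGO_MANIFEST_DIR`; `None` when not built through Cargo.
fn resolve_in_crate(user_path: &str) -> Option<PathBuf> {
    let manifest_dir = env::var_os("CARGO_MANIFEST_DIR")?;
    Some(resolve_against(Path::new(&manifest_dir), user_path))
}
```

Anchoring on the manifest directory avoids depending on the compiler's working directory, though it does not help with files that live outside the published crate.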

@sam0x17

sam0x17 commented Apr 10, 2024

However, if you need something a little less sketchy I would just have your macro take an input of the source file location in the attributes. It would be less likely to break.

Although in a multi-crate workspace this tends to break as well when you deploy to crates.io if any of the files are up one or more directories from the invocation, as I frequently encounter in my docify crate

@dbsxdbsx

@wyatt-herkamp, @sam0x17, thanks! This workaround is feasible in my case, which involves wrapping a trait in a function-like macro, and I modified it a little into this:

// Assumes the proc-macro2 crate and syn with its `full` and `visit` features.
use proc_macro2::{Ident, TokenTree};
use syn::{visit::Visit, Macro};

struct TraitVisitor {
    trait_name: String,
    found: Option<Macro>,
}

impl<'ast> Visit<'ast> for TraitVisitor {
    fn visit_macro(&mut self, mac: &'ast Macro) {
        if self.found.is_some() {
            return;
        }
        // Only consider invocations whose path ends in `trait_variable`.
        let Some(last_seg) = mac.path.segments.last() else {
            return;
        };
        if last_seg.ident != "trait_variable" {
            return;
        }

        // Collect the top-level identifiers of the macro body.
        let idents: Vec<Ident> = mac
            .tokens
            .clone()
            .into_iter()
            .filter_map(|tt| match tt {
                TokenTree::Ident(ident) => Some(ident),
                _ => None,
            })
            .collect();

        // Look for the `trait` keyword immediately followed by the desired name.
        // `windows(2)` avoids the underflow of `idents.len() - 1` on an empty body.
        if idents.windows(2).any(|w| w[0] == "trait" && w[1] == self.trait_name) {
            println!("found trait: {:?}", self.trait_name);
            self.found = Some(mac.clone());
        }
    }
}

@TheCataliasTNT2k

Would be great to get the source file...

@Decodetalkers

Hmm, so without this feature we cannot write something like include_str!(), right?

@kanashimia

Hmm, so without this feature we cannot write something like include_str!(), right?

No, this issue is unrelated. See two proposals mentioned here for doing that: #54725 (comment)

@workingjubilee
Contributor

I have a concern about exposing LineColumn information. It looks like it could play badly with incremental compilation, especially in the IDE context.

My understanding is that if one adds a blank line to the start of the file, the line_column information of the input spans passed to a proc macro changes. That means an IDE would have to re-expand procedural macros even after insignificant whitespace changes.

I believe that rustc now uses a span model that is compatible with this ask, but currently doesn't use it during macro expansion, ironically.
