Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error reporting from attribute macros regressed in 1.46.0 #76360

Open
ahl opened this issue Sep 5, 2020 · 19 comments
Open

Error reporting from attribute macros regressed in 1.46.0 #76360

ahl opened this issue Sep 5, 2020 · 19 comments
Assignees
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-proc-macros Area: Procedural macros ICEBreaker-Cleanup-Crew Helping to "clean up" bugs with minimal examples and bisections P-high High priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-diagnostics Working group: Diagnostics

Comments

@ahl
Copy link
Contributor

ahl commented Sep 5, 2020

We take particular care to make sure that compilation errors associated with the items to which our attribute macros are applied are comprehensible. To that end we have tests to check the output of a variety of expected illegal programs to make sure developers could reasonably be expected to understand and correct those errors. In 1.46.0 we noticed a change in that behavior. While there were some beneficial changes to the way errors qualify structs, traits, etc. there seems to be a regression where 1.46.0 shows less information than in 1.45.0

I've put together a small repo that demonstrates the problem: https://github.com/ahl/span_regression

The first example is this macro:

#[proc_macro_attribute]
pub fn identity(
    _attr: proc_macro::TokenStream,
    item: proc_macro::TokenStream,
) -> proc_macro::TokenStream {
    item
}

Now consider this test program:

struct Thing {
    x: i32,
    y: i32,
}

#[identity]
fn doit() -> Thing {
    Thing {
        "howdy".to_string(),
    }
}

Under 1.45.0 one of the errors produced looks like this:

error[E0063]: missing fields `x`, `y` in initializer of `Thing`
  --> examples/test.rs:10:5
   |
10 |     Thing {
   |     ^^^^^ missing `x`, `y`

Under 1.46.0 that has changed to this:

error[E0063]: missing fields `x`, `y` in initializer of `Thing`

Note that under 1.46.0 no code is underlined.

Now consider a slightly more complicated macro:

#[proc_macro_attribute]
pub fn nested(
    _attr: proc_macro::TokenStream,
    item: proc_macro::TokenStream,
) -> proc_macro::TokenStream {
    let mut ret = proc_macro::TokenStream::new();
    let pre = vec![
        proc_macro::TokenTree::Ident(proc_macro::Ident::new("fn", proc_macro::Span::call_site())),
        proc_macro::TokenTree::Ident(proc_macro::Ident::new("foo", proc_macro::Span::call_site())),
        proc_macro::TokenTree::Group(proc_macro::Group::new(
            proc_macro::Delimiter::Parenthesis,
            proc_macro::TokenStream::new(),
        )),
        proc_macro::TokenTree::Group(proc_macro::Group::new(proc_macro::Delimiter::Brace, item)),
    ];
    ret.extend(pre);
    ret
}

(Note I had been using quote! but wanted to make sure that wasn't causing the problem)

With a similar example as above on 1.45.0:

error: expected identifier, found `"yo"`
  --> examples/test.rs:20:9
   |
19 |     Thing {
   |     ----- while parsing this struct
20 |         "yo".to_string(),
   |         ^^^^ expected identifier

With 1.46.0:

error[E0308]: mismatched types
  --> examples/test.rs:17:1
   |
17 | #[nested]
   | ^^^^^^^^^
   | |
   | expected struct `Thing`, found `()`
   | expected `Thing` because of return type

Rather than pointing to the offending code, the rustc error now points (unhelpfully) to the macro itself.

The general improvements made to error reporting in 1.46.0 (simpler naming, reduced duplicate errors) is greatly appreciated. This was one small regression I saw amongst the otherwise monotonic improvements.

@ahl ahl added the C-bug Category: This is a bug. label Sep 5, 2020
@jonas-schievink jonas-schievink added A-proc-macros Area: Procedural macros regression-from-stable-to-stable Performance or correctness regression from one stable version to another. labels Sep 5, 2020
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Sep 5, 2020
@LeSeulArtichaut LeSeulArtichaut added the E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc label Sep 5, 2020
@LeSeulArtichaut
Copy link
Contributor

If I remember correctly, we should still be able to use cargo-bisect-rustc to bisect a regression in 1.46.0.
@rustbot ping cleanup

@rustbot rustbot added the ICEBreaker-Cleanup-Crew Helping to "clean up" bugs with minimal examples and bisections label Sep 5, 2020
@rustbot
Copy link
Collaborator

rustbot commented Sep 5, 2020

Hey Cleanup Crew ICE-breakers! This bug has been identified as a good
"Cleanup ICE-breaking candidate". In case it's useful, here are some
instructions for tackling these sorts of bugs. Maybe take a look?
Thanks! <3

cc @AminArria @camelid @chrissimpkins @contrun @DutchGhost @elshize @ethanboxx @h-michael @HallerPatrick @hdhoang @hellow554 @imtsuki @kanru @KarlK90 @LeSeulArtichaut @MAdrianMattocks @matheus-consoli @mental32 @nmccarty @Noah-Kennedy @pard68 @PeytonT @pierreN @Redblueflame @RobbieClarken @RobertoSnap @robjtede @SarthakSingh31 @senden9 @shekohex @sinato @spastorino @turboladen @woshilapin @yerke

@LeSeulArtichaut LeSeulArtichaut added A-diagnostics Area: Messages for errors, warnings, and lints P-high High priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Sep 5, 2020
@LeSeulArtichaut
Copy link
Contributor

@estebank
Copy link
Contributor

estebank commented Sep 5, 2020

Cc @petrochenkov

@Aaron1011
Copy link
Member

Aaron1011 commented Sep 5, 2020

This is caused by #73345 - since we no longer pass a Nonterminal to an attribute macro, the pretty-print/reparse check get used (and fails), which causes us to lose spans.

This is a very interesting corner case - the span_regression error is testing the behavior of a proc-macro called on a syntactically invalid Item. It looks like this relies on the parser managing to recover an Item, despite the fact that the body is syntactically invalid.

I think there are two possible ways to fix this:

  1. Disable the pretty-print/reparse check when we recovered from parse errors during the attribute target - this would restore the 1.45.0 behavior of the span_regression repository. However, this would effectively mean that the parser recovery behavior is now part of the stable surface of the compiler, instead of just a trick for producing more meaningful error messages. Any future changes to parser recovery could cause us to start invoking an attribute macro, if we were able to parse enough of the target to produce an AST struct.
  2. Skip invoking attribute macros if there were any errors during the parsing of the attribute target, regardless of recovery behavior. Since attributes produce an arbitrary TokenStream as output, this would require us to remove the target item entirely (since we can't know what the attribute macro would have produced). Effectively, custom attribute macros would disable parser recovery.

I'm leaning towards option 2 (assuming that it doesn't cause significant regressions in practice). Bang-proc-macros are the way to invoke a proc-macro on an arbitrary TokenStream - however, attributes macros are (syntactically) just another attribute, so it doesn't seem to make sense to invoke them on something that's not a syntactically valid attribute target. While the span_regression example is 'good enough' for recovery to be able to kick in, I think going with approach 1 would cause us to end up with an unprincipled list of a special cases where we invoke an attribute macro on a syntactically invalid target.

On the other hand, it's possible that option 2 could lead to a large number of spurious errors in practice - if the user introduces a syntax error when modifying a commonly-used struct (which has an attribute macro applied), removing this struct entirely could cause many spurious resolution errors. Invoking the attribute macro might lead to better (but still invalid) tokens being produced, leading to cleaner error messages.

@Aaron1011
Copy link
Member

Aaron1011 commented Sep 5, 2020

@ahl Thanks for providing such a nicer reproducer!

@Aaron1011 Aaron1011 removed the E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc label Sep 5, 2020
@ahl
Copy link
Contributor Author

ahl commented Sep 5, 2020

@Aaron1011 thanks for the quick diagnosis.

I agree with your assessment that both options have downsides (I think in your final paragraph you're describing option 2 rather than option 1 vis-a-vis removing the struct entirely).

Option 1 seems like a bit of a mess; option 2 seems more "philosophically pure" while being a little less pragmatic. I agree that option 2 is probably preferable and that a crisp semantic like "only valid items will cause the invocation of attribute macros" is a very clear delineation between proc_macro and proc_macro_attribute.

@estebank
Copy link
Contributor

estebank commented Sep 6, 2020

@Aaron1011 I agree with your assessment and think that the second option is the obviously correct choice for us. This might cause some weird errors for proc macros with pseudo DSLs that are no longer turned into valid code, but given that we are already recovering the parse most of the body will be treated as empty anyways, so it should be fine™.

@camelid
Copy link
Member

camelid commented Sep 6, 2020

Could we also add something where the compiler emits one of its “this is a bug” messages if an error doesn’t show span information? Or is that not feasible?

@Aaron1011
Copy link
Member

@camelid: Right now, we deliberately create these 'dummy spans' due to issue #43081. PR #76130 makes progress towards eliminating these cases, but there's still more work to be done.

In the future, I think it would definitely make sense to emit a warning, and direct users to open a bug report. However, I don't think we'd want to emit an actual ICE - it would just create noise, since the query stack and call stack aren't useful for debugging missing spans.

@camelid
Copy link
Member

camelid commented Sep 6, 2020

@Aaron1011 Okay, thanks!

@Aaron1011
Copy link
Member

The pretty-print/reparse check has been removed on the latest Nighty, leading to a better error message:

error: expected identifier, found `"howdy"`
  --> examples/test.rs:11:9
   |
10 |     Thing {
   |     ----- while parsing this struct
11 |         "howdy".to_string(),
   |         ^^^^^^^ expected identifier

error: expected identifier, found `"yo"`
  --> examples/test.rs:18:9
   |
17 |     Thing {
   |     ----- while parsing this struct
18 |         "yo".to_string(),
   |         ^^^^ expected identifier

error[E0063]: missing fields `x`, `y` in initializer of `Thing`
  --> examples/test.rs:10:5
   |
10 |     Thing {
   |     ^^^^^ missing `x`, `y`

error[E0063]: missing fields `x`, `y` in initializer of `Thing`
  --> examples/test.rs:17:5
   |
17 |     Thing {
   |     ^^^^^ missing `x`, `y`

error: aborting due to 4 previous errors

However, we should still come to a decision about whether or not proc-macros should be invoked when we've managed to recover a syntactically invalid attribute target.

@dtolnay
Copy link
Member

dtolnay commented Oct 25, 2021

From the discussion above, three people have already voiced in favor of option 2 but I don't think I understand what downside is being claimed for option 1:

  1. Disable the pretty-print/reparse check when we recovered from parse errors during the attribute target - this would restore the 1.45.0 behavior of the span_regression repository. However, this would effectively mean that the parser recovery behavior is now part of the stable surface of the compiler, instead of just a trick for producing more meaningful error messages. Any future changes to parser recovery could cause us to start invoking an attribute macro, if we were able to parse enough of the target to produce an AST struct.

"Parser recovery behavior is now part of the stable surface of the compiler" — I don't see how, since I thought error diagnostics get emitted whenever a recovery takes place. Rustc sees invalid syntax, thinks it knows roughly what you mean or what the surrounding code means, emits diagnostics to that effect, and continues on with a salvaged input. Anything later in the compilation process, including name resolution or attribute macros being run, is a trick for producing more meaningful error messages, because compilation of the enclosing crate is already guaranteed not to succeed due to the errors emitted at the parser recovery.

Under option 1 how would anyone end up with a stable working crate that would be broken by future changes to parser recovery?

I think going with approach 1 would cause us to end up with an unprincipled list of a special cases where we invoke an attribute macro on a syntactically invalid target.

This is saying we run the risk of an unprincipled list of special cases on which contemporary Rust versions produce good diagnostics, as an argument to produce poor diagnostics consistently in all situations instead. It's hard for me to see this as an argument against option 1.

On the other hand, it's possible that option 2 could lead to a large number of spurious errors in practice - if the user introduces a syntax error when modifying a commonly-used struct (which has an attribute macro applied), removing this struct entirely could cause many spurious resolution errors.

This is the clear and succinct argument in favor of option 1.

@dtolnay
Copy link
Member

dtolnay commented Oct 25, 2021

I ended up here from #90256, in which I reported that the 1.51+ behavior of passing original syntactically invalid input into attribute macros after a parser recovery is a diagnostics regression from older compilers that would only pass syntactically valid (possibly recovered) input to macros.

Doing recovery, emitting diagnostics about the recovery, and then running macros on the original unrecovered input means that macros must reimplement their own recovery independent and inconsistent with rustc — so rustc will diagnose what it thinks you meant, and the macro will diagnose what it thinks, and probably they won't align. The correct behavior in my opinion is for rustc to perform its high-effort syntax recovery and report diagnostics on it as already done, then pass the recovered (pretty printed if necessary) input for the attribute to proceed on. I believe this is consistent with the option 1 proposal.

@dtolnay dtolnay added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. and removed C-bug Category: This is a bug. labels Oct 25, 2021
@Aaron1011
Copy link
Member

The correct behavior in my opinion is for rustc to perform its high-effort syntax recovery and report diagnostics on it as already done, then pass the recovered (pretty printed if necessary

A lot of work has been put in to get rid of pretty-printing/reparsing, mainly because it requires throwing away hygiene information. While you're correct that this would only apply when compilation is guaranteed to fail, I think that spurious hygiene-related errors ( e.g. an 'not found' error that goes away when you fix a seemingly unrelated syntax error) would be extremely confusing.

It might not be too difficult to perform some kind of 'token-based recovery' (deleting the tokens for the syntactically invalid node being recovered). I'll look into that.

@dtolnay
Copy link
Member

dtolnay commented Oct 25, 2021

A token-based recovery / deleting tokens of the nearest invalid node sounds awesome! Thanks.

@Aaron1011
Copy link
Member

Under option 1 how would anyone end up with a stable working crate that would be broken by future changes to parser recovery?

For the sake of completeness - a macro could perform a side effect using the input token stream. However, I do t think that's a use case we want to support (and such a macro would already need to handle being invoked multiple times across different compilation sessions).

@pnkfelix
Copy link
Member

Discussed at P-high review

Doing recovery, emitting diagnostics about the recovery, and then running macros on the original unrecovered input means that macros must reimplement their own recovery independent and inconsistent with rustc — so rustc will diagnose what it thinks you meant, and the macro will diagnose what it thinks, and probably they won't align.

This seems like a real problem. I'm asking @estebank and @compiler-errors to try to dedicate some brain cycles to looking over the suggestions here and trying to drive progress towards a solution.

@pnkfelix
Copy link
Member

pnkfelix commented Mar 3, 2023

2023 Q1 P-high triage: Assigning to WG-diagnostics to drive aforementioned followup work.

@pnkfelix pnkfelix added the WG-diagnostics Working group: Diagnostics label Mar 3, 2023
@estebank estebank self-assigned this Mar 3, 2023
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Mar 16, 2023
Do not provide suggestions when the spans come from expanded code that doesn't point at user code

Hide invalid proc-macro suggestions and track spans
coming from proc-macros pointing at attribute.

Effectively, unless the proc-macro keeps user spans,
suggestions will not be produced for the code they
produce.

r? `@ghost`

Fix rust-lang#107113, fix rust-lang#107976, fix rust-lang#107977, fix rust-lang#108748, fix rust-lang#106720, fix rust-lang#90557.

Could potentially address rust-lang#50141, rust-lang#67373, rust-lang#55146, rust-lang#78862, rust-lang#74043, rust-lang#88514, rust-lang#83320, rust-lang#91520, rust-lang#104071. CC rust-lang#50122, rust-lang#76360.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Mar 17, 2023
Do not provide suggestions when the spans come from expanded code that doesn't point at user code

Hide invalid proc-macro suggestions and track spans
coming from proc-macros pointing at attribute.

Effectively, unless the proc-macro keeps user spans,
suggestions will not be produced for the code they
produce.

r? ``@ghost``

Fix rust-lang#107113, fix rust-lang#107976, fix rust-lang#107977, fix rust-lang#108748, fix rust-lang#106720, fix rust-lang#90557.

Could potentially address rust-lang#50141, rust-lang#67373, rust-lang#55146, rust-lang#78862, rust-lang#74043, rust-lang#88514, rust-lang#83320, rust-lang#91520, rust-lang#104071. CC rust-lang#50122, rust-lang#76360.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-proc-macros Area: Procedural macros ICEBreaker-Cleanup-Crew Helping to "clean up" bugs with minimal examples and bisections P-high High priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-diagnostics Working group: Diagnostics
Projects
None yet
Development

No branches or pull requests

9 participants