-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Detect NulInCStr
error earlier.
#119172
Detect NulInCStr
error earlier.
#119172
Conversation
@fee1-dead: some additional context in this zulip thread. In short, the delayed C NUL str check is inconsistent with all other string literal checks. If it ships in its current state, we're stuck with that behaviour permanently. If we move it earlier right now before it ships, we have the option to delay it (and all other string literal checks) later on (as implemented in #118699). So if we do this in the next few days, we avoid a one-way door shutting. |
Ugh, the refusal of git and GitHub to show "binary" files is really annoying for this one. |
Currently on my phone right now, but do we have a test for |
This comment has been minimized.
This comment has been minimized.
To clarify: the test contains 15 C string literals. Prior to this commit, only the first five have |
I don't think so. Is that interesting? It would produce output like this:
|
I meant something like this: macro_rules! hi {
(c $t:tt) => { $t };
}
fn main() {
println!(hi!(c"hello!\0"))
} do we already have that in our test suite? |
I don't see anything in |
@rustbot labels +I-lang-nominated Nominating this for us to discuss as there is some time pressure here. If we want to save this space, we should do it before C string literals stabilize in Rust 1.76. |
Following up on this: the decision made was to defer the stabilization by a release, so that we don't need to rush this change in over the holidays. |
This comment was marked as resolved.
This comment was marked as resolved.
07d1088
to
16a7e28
Compare
rust-analyzer is developed in its own repository. If possible, consider making this change to rust-lang/rust-analyzer instead. cc @rust-lang/rust-analyzer |
The rust-analyzer problem has been fixed, and I have fixed the conflicts. @fee1-dead, this is ready for review. Unfortunately, git and GitHub both refuse to show the changed test file because it contains NUL chars and so is "binary". Here's the before:
and the after:
as rendered by vim. |
This comment has been minimized.
This comment has been minimized.
By making it an `EscapeError` instead of a `LitError`. This makes it like the other errors produced when checking string literals contents, e.g. for invalid escape sequences or bare CR chars. NOTE: this means these errors are issued earlier, before expansion, which changes behaviour. It will be possible to move the check back to the later point if desired. If that happens, it's likely that all the string literal contents checks will be delayed together. One nice thing about this: the old approach had some code in `report_lit_error` to calculate the span of the nul char from a range. This code used a hardwired `+2` to account for the `c"` at the start of a C string literal, but this should have changed to a `+3` for raw C string literals to account for the `cr"`, which meant that the caret in `cr"` nul error messages was one short of where it should have been. The new approach doesn't need any of this and avoids the off-by-one error.
16a7e28
to
9018d2c
Compare
That's exactly the point of this PR: to unblock stabilization of C-string literals. Can you unblock? |
@bors r+ |
…etrochenkov Detect `NulInCStr` error earlier. By making it an `EscapeError` instead of a `LitError`. This makes it like the other errors produced when checking string literals contents, e.g. for invalid escape sequences or bare CR chars. NOTE: this means these errors are issued earlier, before expansion, which changes behaviour. It will be possible to move the check back to the later point if desired. If that happens, it's likely that all the string literal contents checks will be delayed together. One nice thing about this: the old approach had some code in `report_lit_error` to calculate the span of the nul char from a range. This code used a hardwired `+2` to account for the `c"` at the start of a C string literal, but this should have changed to a `+3` for raw C string literals to account for the `cr"`, which meant that the caret in `cr"` nul error messages was one short of where it should have been. The new approach doesn't need any of this and avoids the off-by-one error. r? `@fee1-dead`
@nnethercote Just want to double check if you can post a reference PR update as mentioned at https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/rfc.203349.3A.20mixed.20utf8.20literals/near/409285306 to update the grammar and remove the sentence about post-lexing validation? |
…etrochenkov Detect `NulInCStr` error earlier. By making it an `EscapeError` instead of a `LitError`. This makes it like the other errors produced when checking string literals contents, e.g. for invalid escape sequences or bare CR chars. NOTE: this means these errors are issued earlier, before expansion, which changes behaviour. It will be possible to move the check back to the later point if desired. If that happens, it's likely that all the string literal contents checks will be delayed together. One nice thing about this: the old approach had some code in `report_lit_error` to calculate the span of the nul char from a range. This code used a hardwired `+2` to account for the `c"` at the start of a C string literal, but this should have changed to a `+3` for raw C string literals to account for the `cr"`, which meant that the caret in `cr"` nul error messages was one short of where it should have been. The new approach doesn't need any of this and avoids the off-by-one error. r? `@fee1-dead`
…etrochenkov Detect `NulInCStr` error earlier. By making it an `EscapeError` instead of a `LitError`. This makes it like the other errors produced when checking string literals contents, e.g. for invalid escape sequences or bare CR chars. NOTE: this means these errors are issued earlier, before expansion, which changes behaviour. It will be possible to move the check back to the later point if desired. If that happens, it's likely that all the string literal contents checks will be delayed together. One nice thing about this: the old approach had some code in `report_lit_error` to calculate the span of the nul char from a range. This code used a hardwired `+2` to account for the `c"` at the start of a C string literal, but this should have changed to a `+3` for raw C string literals to account for the `cr"`, which meant that the caret in `cr"` nul error messages was one short of where it should have been. The new approach doesn't need any of this and avoids the off-by-one error. r? ``@fee1-dead``
…iaskrgr Rollup of 7 pull requests Successful merges: - rust-lang#119172 (Detect `NulInCStr` error earlier.) - rust-lang#119833 (Make tcx optional from StableMIR run macro and extend it to accept closures) - rust-lang#119955 (Modify GenericArg and Term structs to use strict provenance rules) - rust-lang#120021 (don't store const var origins for known vars) - rust-lang#120038 (Don't create a separate "basename" when naming and opening a MIR dump file) - rust-lang#120057 (Don't ICE when deducing future output if other errors already occurred) - rust-lang#120073 (Remove spastorino from users_on_vacation) r? `@ghost` `@rustbot` modify labels: rollup
…iaskrgr Rollup of 8 pull requests Successful merges: - rust-lang#119172 (Detect `NulInCStr` error earlier.) - rust-lang#119833 (Make tcx optional from StableMIR run macro and extend it to accept closures) - rust-lang#119967 (Add `PatKind::Err` to AST/HIR) - rust-lang#119978 (Move async closure parameters into the resultant closure's future eagerly) - rust-lang#120021 (don't store const var origins for known vars) - rust-lang#120038 (Don't create a separate "basename" when naming and opening a MIR dump file) - rust-lang#120057 (Don't ICE when deducing future output if other errors already occurred) - rust-lang#120073 (Remove spastorino from users_on_vacation) r? `@ghost` `@rustbot` modify labels: rollup
…iaskrgr Rollup of 8 pull requests Successful merges: - rust-lang#119172 (Detect `NulInCStr` error earlier.) - rust-lang#119833 (Make tcx optional from StableMIR run macro and extend it to accept closures) - rust-lang#119967 (Add `PatKind::Err` to AST/HIR) - rust-lang#119978 (Move async closure parameters into the resultant closure's future eagerly) - rust-lang#120021 (don't store const var origins for known vars) - rust-lang#120038 (Don't create a separate "basename" when naming and opening a MIR dump file) - rust-lang#120057 (Don't ICE when deducing future output if other errors already occurred) - rust-lang#120073 (Remove spastorino from users_on_vacation) r? `@ghost` `@rustbot` modify labels: rollup
Rollup merge of rust-lang#119172 - nnethercote:earlier-NulInCStr, r=petrochenkov Detect `NulInCStr` error earlier. By making it an `EscapeError` instead of a `LitError`. This makes it like the other errors produced when checking string literals contents, e.g. for invalid escape sequences or bare CR chars. NOTE: this means these errors are issued earlier, before expansion, which changes behaviour. It will be possible to move the check back to the later point if desired. If that happens, it's likely that all the string literal contents checks will be delayed together. One nice thing about this: the old approach had some code in `report_lit_error` to calculate the span of the nul char from a range. This code used a hardwired `+2` to account for the `c"` at the start of a C string literal, but this should have changed to a `+3` for raw C string literals to account for the `cr"`, which meant that the caret in `cr"` nul error messages was one short of where it should have been. The new approach doesn't need any of this and avoids the off-by-one error. r? ```@fee1-dead```
Which moved the checking for NUL chars in C string literals earlier.
Thanks for the reminder, done in rust-lang/reference#1450. |
Which moved the checking for NUL chars in C string literals earlier.
Which moved the checking for NUL chars in C string literals earlier.
Which moved the checking for NUL chars in C string literals earlier.
…etrochenkov Detect `NulInCStr` error earlier. By making it an `EscapeError` instead of a `LitError`. This makes it like the other errors produced when checking string literals contents, e.g. for invalid escape sequences or bare CR chars. NOTE: this means these errors are issued earlier, before expansion, which changes behaviour. It will be possible to move the check back to the later point if desired. If that happens, it's likely that all the string literal contents checks will be delayed together. One nice thing about this: the old approach had some code in `report_lit_error` to calculate the span of the nul char from a range. This code used a hardwired `+2` to account for the `c"` at the start of a C string literal, but this should have changed to a `+3` for raw C string literals to account for the `cr"`, which meant that the caret in `cr"` nul error messages was one short of where it should have been. The new approach doesn't need any of this and avoids the off-by-one error. r? ```@fee1-dead```
By making it an
EscapeError
instead of aLitError
. This makes it like the other errors produced when checking string literals contents, e.g. for invalid escape sequences or bare CR chars.NOTE: this means these errors are issued earlier, before expansion, which changes behaviour. It will be possible to move the check back to the later point if desired. If that happens, it's likely that all the string literal contents checks will be delayed together.
One nice thing about this: the old approach had some code in
report_lit_error
to calculate the span of the nul char from a range. This code used a hardwired+2
to account for thec"
at the start of a C string literal, but this should have changed to a+3
for raw C string literals to account for thecr"
, which meant that the caret incr"
nul error messages was one short of where it should have been. The new approach doesn't need any of this and avoids the off-by-one error.r? @fee1-dead