Reimplement the lexing of c"…" string literals with backward compatibility in mind #113333

Closed
Tracked by #105723
fmease opened this issue Jul 4, 2023 · 6 comments · Fixed by #113476
Labels
C-bug Category: This is a bug. F-c_str_literals `#![feature(c_str_literals)]` T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@fmease
Member

fmease commented Jul 4, 2023

The lexing of c"…" string literals as implemented in #108801 had to be reverted in #113334 since it isn't backward compatible.
It breaks code that uses a pre-2021 edition (i.e. the 2015 or the 2018 edition).
See #113235 (comment) for details.

@rustbot label C-bug T-compiler F-c_str_literals

@rustbot rustbot added C-bug Category: This is a bug. F-c_str_literals `#![feature(c_str_literals)]` T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 4, 2023
@fmease fmease changed the title Reimplement the lexing of c"…" string literals with backward compatiblity in mind Reimplement the lexing of c"…" string literals with backward compatibility in mind Jul 4, 2023
@fee1-dead fee1-dead self-assigned this Jul 5, 2023
fee1-dead added a commit to fee1-dead-contrib/rust that referenced this issue Jul 6, 2023
…=compiler-errors

Revert the lexing of `c"…"` string literals

Fixes \[after beta-backport\] rust-lang#113235.
Further progress is tracked in rust-lang#113333.

This PR *manually* reverts parts of rust-lang#108801 (since a git-revert would've been too coarse-grained & messy)
and git-reverts rust-lang#111647.

CC `@fee1-dead` (rust-lang#108801) `@klensy` (rust-lang#111647)
r? `@compiler-errors`

`@rustbot` label F-c_str_literals beta-nominated
@fee1-dead fee1-dead removed their assignment Jul 7, 2023
@SUPERCILEX
Contributor

Could c strings be added to 2021 editions+ only?

@dead-claudia

> Could c strings be added to 2021 editions+ only?

@SUPERCILEX On paper, this sounds like a good idea, but in practice, one does not merely do that. Proc macros could use an older edition while the source using them could use a newer edition.

In this case, the newer source code, when parsed, will yield one AST, while the proc macro may break due to those unfamiliar features.

I'm not familiar enough with the compiler to know how it addresses it, so don't take that as authoritative.

@fee1-dead
Member

I think the plan is to only support this on 2021+ editions while treating it as an unknown prefix in 2018 or older, which would allow compatibility with macros that want c"xx" to be treated as two separate tokens.

As for macros across crates, I think it wouldn't be an issue? There should already be mechanisms to prevent this sort of thing from happening. We should definitely add a test case to test this behavior.
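As an illustration of the edition-gated plan described above, here is a minimal sketch of a toy lexer that only recognizes the exact shape `c"…"`. It is not rustc's implementation: `Edition`, `Token`, and `lex_c_prefixed` are hypothetical names made up for this example, and escapes are deliberately unsupported.

```rust
/// Hypothetical edition enum for this sketch (not rustc's own type).
/// Variants are declared oldest first, so the derived ordering is chronological.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Edition {
    E2015,
    E2018,
    E2021,
}

/// Hypothetical token kinds; only what the sketch needs.
#[derive(Debug, PartialEq, Eq)]
pub enum Token {
    Ident(String),
    Str(String),
    CStr(String),
}

/// Lexes input of the exact shape `c"..."` (no escapes, no whitespace).
/// Returns `None` for anything else.
pub fn lex_c_prefixed(src: &str, edition: Edition) -> Option<Vec<Token>> {
    let body = src.strip_prefix("c\"")?.strip_suffix('"')?;
    if body.contains('"') || body.contains('\\') {
        return None; // out of scope for this toy lexer
    }
    if edition >= Edition::E2021 {
        // 2021+: a single C string literal token.
        Some(vec![Token::CStr(body.to_string())])
    } else {
        // 2015/2018: keep the old behavior of two separate tokens,
        // so macros that match `c` followed by a string keep working.
        Some(vec![
            Token::Ident("c".to_string()),
            Token::Str(body.to_string()),
        ])
    }
}
```

With this sketch, `lex_c_prefixed("c\"xx\"", Edition::E2018)` yields an identifier followed by a string token, while the same input under `Edition::E2021` yields one `CStr` token.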

@nagisa
Member

nagisa commented Jul 16, 2023

We do already recombine some tokens in the parser (I believe & and & is an example of this? Or maybe the other way around where && is contextually interpreted as a double reference? I don’t remember exact details.)

Seems like it wouldn’t be terribly hard to do it for c"" either, as long as care is taken to ensure that no whitespace is between the two tokens.

(edit: this is in response to #113235 (comment))
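The recombination idea above could look roughly like the following sketch, which glues an identifier `c` and a directly adjacent string token into one C string token. The types `Kind` and `Spanned` and the function `glue_c_strings` are hypothetical and exist only for this example; the byte offsets `lo` and `hi` stand in for whatever span information a real parser would consult to rule out whitespace.

```rust
/// Hypothetical token kinds; not rustc's token type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Kind {
    Ident(String),
    Str(String),
    CStr(String),
}

/// A token with a half-open byte span `lo..hi` in the source text.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Spanned {
    pub kind: Kind,
    pub lo: usize,
    pub hi: usize,
}

/// Glues `c` + string literal into a single `CStr` token, but only when
/// the string starts exactly where the identifier ends (no whitespace).
pub fn glue_c_strings(tokens: Vec<Spanned>) -> Vec<Spanned> {
    let mut out: Vec<Spanned> = Vec::with_capacity(tokens.len());
    for tok in tokens {
        // Decide first, copying out what is needed, so no borrow of
        // `out` or `tok` is held while `out` is modified below.
        let glued = match (out.last(), &tok.kind) {
            (Some(prev), Kind::Str(s))
                if prev.hi == tok.lo
                    && matches!(&prev.kind, Kind::Ident(name) if name == "c") =>
            {
                Some((prev.lo, s.clone()))
            }
            _ => None,
        };
        match glued {
            Some((lo, s)) => {
                out.pop(); // drop the `c` identifier
                out.push(Spanned { kind: Kind::CStr(s), lo, hi: tok.hi });
            }
            None => out.push(tok),
        }
    }
    out
}
```

In this sketch the adjacency check `prev.hi == tok.lo` is what distinguishes `c"xx"` from `c "xx"`; every consumer of the token stream would have to run such a pass, which is the drawback raised elsewhere in the thread.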

@fmease
Member Author

fmease commented Jul 16, 2023

Yup, there's break_and_eat (and its various convenience methods). However, as already mentioned, every user of the lexer (ra-ap-rustc_lexer to be more precise) would need to remember this forgettable step.

FYI, there's already a PR that would fix this issue (by mainly modifying the lexer, not the parser; at the time of this writing), namely #113476 (although it doesn't mention this issue).

@dead-claudia

dead-claudia commented Jul 17, 2023

> We do already recombine some tokens in the parser (I believe & and & is an example of this? Or maybe the other way around where && is contextually interpreted as a double reference? I don’t remember exact details.)

IIRC from the spec, it's && being contextually interpreted as a double reference, much like >> being interpreted as closing two generic argument lists.
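To make the `&&` case concrete, here is a minimal sketch of splitting a compound token on demand, in the spirit of the "break and eat" approach mentioned earlier in the thread. It is a toy analogue under stated assumptions, not rustc's code: `Tok`, `eat_and`, and `count_leading_refs` are hypothetical names for this example.

```rust
use std::collections::VecDeque;

/// Hypothetical token kinds; only what the sketch needs.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Tok {
    And,    // `&`
    AndAnd, // `&&`
    Ident(String),
}

/// Consumes one `&` from the front of the stream.
/// If the front token is `&&`, it is split: one `&` is consumed
/// and the other `&` is left in place for the next call.
pub fn eat_and(tokens: &mut VecDeque<Tok>) -> bool {
    match tokens.front().cloned() {
        Some(Tok::And) => {
            tokens.pop_front();
            true
        }
        Some(Tok::AndAnd) => {
            tokens[0] = Tok::And;
            true
        }
        _ => false,
    }
}

/// Counts the leading reference sigils of a type such as `&&T`.
pub fn count_leading_refs(tokens: &mut VecDeque<Tok>) -> usize {
    let mut n = 0;
    while eat_and(tokens) {
        n += 1;
    }
    n
}
```

Here the lexer is free to always produce `&&` as one token, and the splitting happens only in positions where a single `&` is expected.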

6 participants