Stabilize reserved prefixes #88140
Comments
@rfcbot fcp merge I propose that we stabilize reserved prefixes. Note that they will not be exposed to stable completely until Rust 2021 is stabilized.
Team member @nikomatsakis has proposed to merge this. The next step is review by the rest of the tagged team members: Concerns:
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me.
@rfcbot concern no-reference-PR We need to describe these changes in the Rust reference!
This affects
This FCP needs to start today to make it in time for 1.56.
@nikomatsakis I don't think that needs to block the start of the FCP, right? Can you resolve your concern? We can handle the reference while the FCP is in progress.
@rfcbot resolve no-reference-PR I'm going to mark this as resolved for now but try to get this done ASAP
🔔 This is now entering its final comment period, as per the review above. 🔔
The final comment period, with a disposition to merge, as per the review above, is now complete. As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed. The RFC will be merged soon.
AFAICT this is done.
Reserved prefixes stabilization report
Links
Summary
any_identifier#, any_identifier"...", and any_identifier'...' are now reserved syntax, and no longer tokenize.
quote!{ #a#b } is no longer accepted.
match"..." {} is no longer accepted.
Insert whitespace before the #, ", or ' to avoid errors.
Details
To make space for new syntax in the future, we've decided to reserve syntax for prefixed identifiers and literals: prefix#identifier, prefix"string", prefix'c', and prefix#123, where prefix can be any identifier. (Except those prefixes that already have a meaning, such as b'...' (byte literals) and r"..." (raw strings).)
This provides syntax we can expand into in the future without requiring an edition boundary. We may use this for temporary syntax until the next edition, or for permanent syntax if appropriate.
Without an edition, this would be a breaking change, since macros can currently accept syntax such as hello"world", which they will see as two separate tokens: hello and "world". The (automatic) fix is simple though: just insert a space: hello "world". Likewise, prefix#ident should become prefix #ident. Edition migrations will help with this fix.
Other than turning these into a tokenization error, the RFC does not attach a meaning to any prefix yet. Assigning meaning to specific prefixes is left to future proposals, which, thanks to reserving these prefixes, will no longer be breaking changes.
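As a rough illustration of the fix (a sketch, not part of the RFC), the macro below counts the token trees it receives. The name count_tts is a hypothetical helper introduced only for this example. The spaced forms tokenize the same way in every edition, and the existing b and r prefixes keep their meaning. The unspaced forms appear only in comments, since Rust 2021 rejects them during tokenization.

```rust
// Hypothetical helper for illustration: counts the token trees passed to it.
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

fn main() {
    // With whitespace, the identifier and the string literal are two tokens.
    assert_eq!(count_tts!(hello "world"), 2);

    // `prefix #ident` is three tokens: an identifier, `#`, and an identifier.
    assert_eq!(count_tts!(prefix #ident), 3);

    // Prefixes that already have a meaning still form a single literal token.
    assert_eq!(count_tts!(b'x'), 1);
    assert_eq!(count_tts!(b"bytes"), 1);
    assert_eq!(count_tts!(r"raw"), 1);

    // Rejected in Rust 2021 (accepted as two and three tokens in earlier editions):
    // count_tts!(hello"world");
    // count_tts!(prefix#ident);
}
```

Because the spaced forms are valid in all editions, applying the whitespace fix before switching editions does not change what a macro sees.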
Some new prefixes you might see in the future (though we haven't committed to any of them yet):
k#keyword to allow writing keywords that don't exist yet in the current edition. For example, while async is not a keyword in edition 2015, this prefix would've allowed us to accept k#async in edition 2015 without having to wait for edition 2018 to reserve async as a keyword.
f"" as a short-hand for a format string. For example, f"hello {name}" as a short-hand for the equivalent format!() invocation.
s"" for String literals.
c"" or z"" for null-terminated C strings.
How unresolved questions were resolved and other interesting developments
Where and how to enforce prefixes
The biggest question was where to enforce the prefixes and emit errors. We ultimately opted to emit errors in the lexer, which meant that the lexer had to become aware of the current edition. There was an alternative of using "jointness" and enforcing the conditions in the parser. The idea was to leverage the fact that Rust tokens (at least some subset of them) record not only their content but whether they are separated by whitespace from the next token. This was intended to enable compound operators like << to be parsed as two < tokens in some parts of the parser (types) and as a single token elsewhere (expressions), without the lexer having to know what state the parser was in. This same approach could conceptually be used so that the lexer doesn't have to know the edition.
As described in detail in this writeup, however, the jointness approach had several downsides. For example, it meant that lexing of literals was independent of prefix: we might like f"{foo("bar")}" to be lexed as a string, but that is not possible unless the lexer knows that an f string can contain embedded expressions. Similarly, which escape codes the lexer accepts depends on the prefix (e.g. \x for b""). (This is especially relevant for raw strings: whether fr"\" is accepted or not depends on what meaning we assign to fr.) Jointness also had forwards compatibility hazards with macro arm ordering. Finally, the lexer-based approach can be converted to a jointness-based approach later, as it currently gives errors much earlier in the process.
There were also advantages to jointness: it would allow more procedural macro prototyping, and it would mean that the lexer remains independent of edition.
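To make the lexer-based choice concrete, here is a toy sketch of an edition-aware lexer. It does not model rustc's actual lexer: Edition, Token, LexError, and lex are all hypothetical names, and the toy ignores existing prefixes such as b and r, character literals, and escapes. It only shows the shape of the decision: the same input lexes into two tokens under an older edition and produces an error under 2021.

```rust
// Toy model only; every name here is hypothetical and not a rustc API.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Edition {
    E2018,
    E2021,
}

#[derive(Debug, PartialEq)]
enum Token {
    Ident(String),
    Str(String),
    Punct(char),
}

#[derive(Debug, PartialEq)]
enum LexError {
    ReservedPrefix(String),
    UnterminatedString,
}

fn lex(src: &str, edition: Edition) -> Result<Vec<Token>, LexError> {
    let chars: Vec<char> = src.chars().collect();
    let mut i = 0;
    let mut out = Vec::new();
    while i < chars.len() {
        let c = chars[i];
        if c.is_whitespace() {
            i += 1;
        } else if c.is_alphabetic() || c == '_' {
            let start = i;
            while i < chars.len() && (chars[i].is_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            let ident: String = chars[start..i].iter().collect();
            // The edition-dependent step: in 2021, an identifier directly
            // followed by `#`, `"`, or `'` is an error raised by the lexer.
            let next = chars.get(i).copied();
            if edition == Edition::E2021 && matches!(next, Some('#') | Some('"') | Some('\'')) {
                return Err(LexError::ReservedPrefix(ident));
            }
            out.push(Token::Ident(ident));
        } else if c == '"' {
            // Simplified string literal: no escapes, ends at the next quote.
            let start = i + 1;
            i += 1;
            while i < chars.len() && chars[i] != '"' {
                i += 1;
            }
            if i == chars.len() {
                return Err(LexError::UnterminatedString);
            }
            out.push(Token::Str(chars[start..i].iter().collect()));
            i += 1;
        } else {
            out.push(Token::Punct(c));
            i += 1;
        }
    }
    Ok(out)
}

fn main() {
    println!("{:?}", lex("hello\"world\"", Edition::E2018));
    println!("{:?}", lex("hello\"world\"", Edition::E2021));
    println!("{:?}", lex("hello \"world\"", Edition::E2021));
}
```

Under a jointness-based design, lex would take no edition argument and would instead record on each token whether whitespace follows it, leaving the rejection to a later stage.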
Edition used for procedural macro APIs
There are some procedural macro APIs that lex tokens from strings. Those APIs have not traditionally taken a span or other information from which an edition can be derived. Those APIs will be documented with the Edition that they use to do lexing. In the future we may wish to add new APIs that take a Span or other parameter and use that to derive the Edition.