Lexer:
IDENTIFIER_OR_KEYWORD :
      XID_Start XID_Continue*
   | _ XID_Continue+
RAW_IDENTIFIER : r# IDENTIFIER_OR_KEYWORD Except crate, self, super, Self
NON_KEYWORD_IDENTIFIER : IDENTIFIER_OR_KEYWORD Except a strict or reserved keyword
IDENTIFIER :
NON_KEYWORD_IDENTIFIER | RAW_IDENTIFIER
Identifiers follow the specification in Unicode Standard Annex #31 for Unicode version 15.0, with the additions described below. Some examples of identifiers:
foo
_identifier
r#true
Москва
東京
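As a small illustration, the example identifiers above can all be used as variable names. This is only a sketch: the function name `sum_of_examples` and the values are made up for the example, and the `allow` attribute is there only because `Москва` starts with an uppercase letter, which would otherwise trigger a style lint for a local variable.

```rust
/// Hypothetical function that binds each example identifier to a value.
fn sum_of_examples() -> i32 {
    let foo = 1;
    let _identifier = 2;
    // `true` is a keyword, so it needs the `r#` prefix to act as a name.
    let r#true = 3;
    // Silence the naming-style lint for the capitalized name.
    #[allow(non_snake_case)]
    let Москва = 4;
    let 東京 = 5;
    foo + _identifier + r#true + Москва + 東京
}

fn main() {
    println!("{}", sum_of_examples());
}
```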
The profile used from UAX #31 is:
- Start := XID_Start, plus the underscore character (U+005F)
- Continue := XID_Continue
- Medial := empty
with the additional constraint that a single underscore character is not an identifier.
Note: Identifiers starting with an underscore are typically used to indicate an identifier that is intentionally unused, and will silence the unused warning in rustc.
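The note about underscore-prefixed identifiers can be sketched as follows. The helpers `parse_pair` and `first_only` are hypothetical names introduced only for this example.

```rust
/// Hypothetical helper: parses "a, b" into a pair of integers.
fn parse_pair(input: &str) -> Option<(i32, i32)> {
    let (left, right) = input.split_once(',')?;
    let left = left.trim().parse::<i32>().ok()?;
    let right = right.trim().parse::<i32>().ok()?;
    Some((left, right))
}

/// Returns only the first number of the pair.
fn first_only(input: &str) -> Option<i32> {
    // `_second` is bound but never read. The leading underscore marks it
    // as intentionally unused, so rustc does not warn about it.
    let (first, _second) = parse_pair(input)?;
    Some(first)
}

fn main() {
    println!("{:?}", first_only("3, 4"));
}
```

Renaming `_second` to `second` would leave the program's behavior unchanged, but rustc would then report the binding as unused.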
Identifiers may not be a strict or reserved keyword without the r# prefix described below in raw identifiers.
Zero width non-joiner (ZWNJ U+200C) and zero width joiner (ZWJ U+200D) characters are not allowed in identifiers.
Identifiers are restricted to the ASCII subset of XID_Start and XID_Continue in the following situations:
- extern crate declarations
- External crate names referenced in a path
- Module names loaded from the filesystem without a path attribute
- no_mangle attributed items
- Item names in external blocks
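For the ASCII-restricted situations listed above, the IDENTIFIER_OR_KEYWORD rule reduces to a simple character check, because the ASCII subset of XID_Start is the letters `A-Z` and `a-z`, and the ASCII subset of XID_Continue adds the digits and the underscore. The following is a minimal sketch under those assumptions, not how rustc implements the check. The function names are hypothetical, and the sketch does not reject keywords.

```rust
/// ASCII subset of XID_Continue: letters, digits, and underscore.
fn is_ascii_xid_continue(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_'
}

/// Hypothetical check: does `s` match IDENTIFIER_OR_KEYWORD when only the
/// ASCII subset of XID_Start and XID_Continue is permitted?
fn is_ascii_identifier_or_keyword(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        // First alternative: XID_Start XID_Continue*
        Some(c) if c.is_ascii_alphabetic() => chars.all(is_ascii_xid_continue),
        // Second alternative: _ XID_Continue+
        // A lone underscore is rejected because at least one more
        // character is required.
        Some('_') => {
            let rest = chars.as_str();
            !rest.is_empty() && rest.chars().all(is_ascii_xid_continue)
        }
        // Empty input, or a first character that cannot start an identifier.
        _ => false,
    }
}

fn main() {
    for candidate in ["foo", "_identifier", "_", "1abc", "Москва"] {
        println!("{candidate}: {}", is_ascii_identifier_or_keyword(candidate));
    }
}
```

A full implementation of the general rule would need Unicode property tables for XID_Start and XID_Continue, which the standard library does not expose.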
Identifiers are normalized using Normalization Form C (NFC) as defined in Unicode Standard Annex #15. Two identifiers are equal if their NFC forms are equal.
Procedural and declarative macros receive normalized identifiers in their input.
A raw identifier is like a normal identifier, but prefixed by r#. (Note that the r# prefix is not included as part of the actual identifier.)
Unlike a normal identifier, a raw identifier may be any strict or reserved keyword except the ones listed above for RAW_IDENTIFIER.
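A short sketch of both points follows: a keyword used as a name through the r# prefix, and the prefix not being part of the identifier itself. The function names and values are made up for illustration.

```rust
// `match` is a strict keyword, so it can only be used as a function
// name when written as a raw identifier.
fn r#match(haystack: &str, needle: &str) -> bool {
    haystack.contains(needle)
}

/// Hypothetical function showing that `r#count` and `count` are the
/// same identifier.
fn same_identifier() -> i32 {
    // `count` is not a keyword, so the prefix is optional here.
    let r#count = 41;
    // Refers to the binding declared on the previous line.
    count + 1
}

fn main() {
    println!("{}", r#match("raw identifiers", "raw"));
    println!("{}", same_identifier());
}
```

Writing `r#crate`, `r#self`, `r#super`, or `r#Self` instead would be rejected, since those keywords are excluded from RAW_IDENTIFIER.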