don't allow U+0387 (·) in identifiers #28167
Merged
As discussed in #25157 and on discourse, it doesn't make sense to accept U+0387 (·) in identifiers, despite the fact that it is included in UAX#31 for legacy reasons.

Julia NFC-normalizes identifiers (#5434), and U+0387 NFC-normalizes to U+00b7 (middle-dot ·, i.e. \cdotp), but we don't allow U+00b7 in identifiers. It makes no sense to treat them differently (the code for this stems from #6805 by @JeffBezanson). At some point in the future, we may want to normalize both to U+22c5 (\cdot ⋅) (see #25157).

The other special cases that we included from Other_ID_Continue are in category No, which we already allow anyway, which is why I deleted the whole Other_ID_Continue line.
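
For anyone who wants to check the normalization claim without building Julia, here is a minimal sketch using Python's standard `unicodedata` module. It is not Julia's parser and says nothing about which characters Julia accepts; it only shows how NFC treats the three code points mentioned above. The constant and helper names are made up for this illustration.

```python
import unicodedata

# Code points discussed in this PR (names here are illustrative only).
ANO_TELEIA = "\u0387"    # the character this PR stops accepting
MIDDLE_DOT = "\u00b7"    # \cdotp
DOT_OPERATOR = "\u22c5"  # \cdot


def nfc(s):
    """Return the NFC-normalized form of s."""
    return unicodedata.normalize("NFC", s)


# Print each code point, its Unicode name, and the code point NFC maps it to.
for ch in (ANO_TELEIA, MIDDLE_DOT, DOT_OPERATOR):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> NFC U+{ord(nfc(ch)):04X}")

# Two identifiers that differ only in U+0387 vs U+00B7 collapse to the
# same string once NFC is applied.
same_after_nfc = nfc("a" + ANO_TELEIA + "b") == nfc("a" + MIDDLE_DOT + "b")
print(same_after_nfc)
```

Since U+0387 turns into U+00b7 under NFC, an identifier containing U+0387 would end up containing a character that is rejected when typed directly, which is the inconsistency this PR removes.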
This is a breaking change. I seriously doubt that it breaks any real-world code, but we should try to get it in for 0.7. cc @mlhetland.