Document valid values of the char type #93493

Merged on Feb 2, 2022 (3 commits)

GKFX
Contributor

@GKFX GKFX commented Jan 30, 2022

As discussed at #93392, the current documentation on what constitutes a valid char isn't very detailed and is partly on the MAX constant rather than the type itself.

This PR expands on that information, stating the actual numerical range, giving examples of what won't work, and mentioning how a char might be a valid Unicode scalar value (USV) but still not be a defined character (terminology checked against Unicode 14.0, table 2-3).
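
To make the "numerical range" and "what won't work" points concrete, here is a minimal sketch. It assumes the valid ranges are 0 to 0xD7FF and 0xE000 to 0x10FFFF, as the match example later in this conversation suggests. The helper names `is_scalar_value` and `to_char` are hypothetical and not part of the PR; only `char::from_u32` is the standard-library API.

```rust
/// Returns true if `n` is a Unicode scalar value, i.e. representable as a `char`.
/// Hypothetical helper for illustration only.
pub fn is_scalar_value(n: u32) -> bool {
    // Everything up to 0x10FFFF except the surrogate gap 0xD800..=0xDFFF.
    matches!(n, 0..=0xD7FF | 0xE000..=0x10FFFF)
}

/// Checked conversion from a raw number to a `char`.
/// Thin wrapper over the standard `char::from_u32`, which returns `None`
/// for surrogates and for values above 0x10FFFF.
pub fn to_char(n: u32) -> Option<char> {
    char::from_u32(n)
}
```

Under these assumptions, `to_char(0x41)` yields `Some('A')`, while `to_char(0xD800)` and `to_char(0x110000)` both yield `None`.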

@rust-highfive
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @m-ou-se (or someone else) soon.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 30, 2022
Comment on lines 311 to 314
/// Unicode is regularly updated. Many USVs are not currently assigned to a
/// character, but may be in the future ("reserved"); some will never be a character
/// ("noncharacters"); and some may be given different meanings by different users
/// ("private use").
Member

I'm not sure about the wording of this paragraph. If I didn't know otherwise, I might assume from the context that "reserved", "noncharacters" and "private use" were also currently invalid for char (but may become valid later).
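
The concern above can be checked directly: reserved, noncharacter, and private-use code points are all valid `char` values today. The sketch below is illustrative only. The constant and function names are hypothetical, and the three code points are example picks (U+0378 is assumed to be unassigned as of Unicode 14.0).

```rust
/// Code points that are valid `char`s even though they are not assigned
/// characters. The code points are illustrative choices, not taken from the PR.
pub const NOT_ASSIGNED_BUT_VALID: [(u32, &str); 3] = [
    (0x0378, "reserved (assumed unassigned as of Unicode 14.0)"),
    (0xFFFF, "noncharacter"),
    (0xE000, "private use"),
];

/// A noncharacter can even be written as a `char` literal.
pub const NONCHARACTER: char = '\u{FFFF}';

/// Returns true if every entry in the table converts to a `char`.
pub fn all_valid_chars() -> bool {
    NOT_ASSIGNED_BUT_VALID
        .iter()
        .all(|&(n, _)| char::from_u32(n).is_some())
}
```

Only surrogates and values above 0x10FFFF are rejected; whether a code point has been assigned a character does not affect whether it is a valid `char`.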

@scottmcm
Member

Idea: USVs are also exactly the set of values that are legal to encode in UTF-8 (and thus to have in a str), right? That might be a nice thing to put somewhere to help motivate the restrictions.

Suggestion: mention that this validity restriction means you don't have to match the gap in a match. Maybe show an example like this, which compiles:

pub fn demo(c: char) -> bool {
    match c {
        '\0' ..= '\u{D7FF}' => false,
        '\u{E000}' ..= '\u{10FFFF}' => true,
    }
}
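
The UTF-8 idea raised in this comment can be sketched as follows. The helper names and byte constants are hypothetical; the byte sequences are what a UTF-8-style encoder would produce for U+D800 and for 0x110000 if it did not reject them. Only `char::encode_utf8` and `std::str::from_utf8` are standard-library APIs.

```rust
/// UTF-8-style bytes for the surrogate U+D800 (not a scalar value).
pub const SURROGATE_D800: [u8; 3] = [0xED, 0xA0, 0x80];
/// UTF-8-style bytes for 0x110000, one past `char::MAX`.
pub const ABOVE_MAX_110000: [u8; 4] = [0xF4, 0x90, 0x80, 0x80];

/// Returns true if `bytes` is accepted as UTF-8, i.e. could back a `str`.
pub fn is_valid_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}

/// Encodes `c` as UTF-8 and checks that the bytes decode back to exactly `c`.
pub fn round_trips_through_utf8(c: char) -> bool {
    let mut buf = [0u8; 4];
    let encoded: &str = c.encode_utf8(&mut buf);
    match std::str::from_utf8(encoded.as_bytes()) {
        Ok(s) => s.chars().eq(std::iter::once(c)),
        Err(_) => false,
    }
}
```

Every `char` round-trips, while the two byte sequences for non-scalar values are rejected, which is consistent with the restrictions on `char` and `str` lining up.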

@scottmcm
Member

scottmcm commented Feb 2, 2022

Thanks! This is a big improvement.

r? @scottmcm
@bors r+

@bors
Contributor

bors commented Feb 2, 2022

📌 Commit d372baf has been approved by scottmcm

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 2, 2022
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Feb 2, 2022
Document valid values of the char type

As discussed at rust-lang#93392, the current documentation on what constitutes a valid char isn't very detailed and is partly on the MAX constant rather than the type itself.

This PR expands on that information, stating the actual numerical range, giving examples of what won't work, and mentioning how a `char` might be a valid USV but still not be a defined character (terminology checked against [Unicode 14.0, table 2-3](https://www.unicode.org/versions/Unicode14.0.0/ch02.pdf#M9.61673.TableTitle.Table.22.Types.of.Code.Points)).
bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 2, 2022
…askrgr

Rollup of 7 pull requests

Successful merges:

 - rust-lang#92758 (librustdoc: impl core::fmt::Write for rustdoc::html::render::Buffer)
 - rust-lang#92788 (Detect `::` -> `:` typo in type argument)
 - rust-lang#93420 (Improve wrapping on settings page)
 - rust-lang#93493 (Document valid values of the char type)
 - rust-lang#93531 (Fix incorrect panic message in example)
 - rust-lang#93559 (Add missing | between print options)
 - rust-lang#93560 (Fix two incorrect "it's" (typos in comments))

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit a3deca4 into rust-lang:master Feb 2, 2022
@rustbot rustbot added this to the 1.60.0 milestone Feb 2, 2022
@GKFX GKFX deleted the char-docs-2 branch January 2, 2024 22:55