Implement full to_{upper,lower}case algorithms #25800
Comments
A couple friends and I dug into this issue as part of a mini-sprint. We'd love a mentor or some guidance to work on this. As we understand it right now:
So what needs to happen then is:
Is that an accurate summary? Also, we noted there are several external but related crates for Unicode, but couldn't find any indication of whether these crates were moving into the standard lib or whether the standard lib was moving out.
@bitborn that all sounds pretty good! I think we may not want to encode the exact return value of each character (that's a lot of space). It'll be a balancing act to figure out how to encode the data on unicode.org in as compact a form as possible while still having a fast lookup for case conversions.
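To make the space/speed trade-off concrete, here is a minimal sketch of one possible encoding: a sorted table that stores only the characters whose mapping expands to several `char`s, searched with a binary search. The names `UPPER_TABLE`, `to_upper_sketch`, and `expand` are hypothetical, the table is a five-entry excerpt of the uppercase data in `SpecialCasing.txt`, and the ASCII fallback is a placeholder for the simple one-to-one mappings. This is not how `std` stores its data.

```rust
/// Hypothetical excerpt of the multi-character uppercase mappings,
/// sorted by source character so it can be binary-searched.
/// Each mapping holds up to three chars and is padded with '\0'.
const UPPER_TABLE: &[(char, [char; 3])] = &[
    ('\u{00DF}', ['S', 'S', '\0']),               // ß -> SS
    ('\u{0149}', ['\u{02BC}', 'N', '\0']),        // ŉ -> ʼN
    ('\u{01F0}', ['J', '\u{030C}', '\0']),        // ǰ -> J + combining caron
    ('\u{FB01}', ['F', 'I', '\0']),               // ﬁ -> FI
    ('\u{FB03}', ['F', 'F', 'I']),                // ﬃ -> FFI
];

/// Look up `c` in the table in O(log n). Characters that are not in the
/// table fall back to ASCII uppercasing, which stands in here for the
/// (much larger) simple one-to-one mapping data.
fn to_upper_sketch(c: char) -> [char; 3] {
    match UPPER_TABLE.binary_search_by(|&(key, _)| key.cmp(&c)) {
        Ok(i) => UPPER_TABLE[i].1,
        Err(_) => [c.to_ascii_uppercase(), '\0', '\0'],
    }
}

/// Drop the '\0' padding and collect the mapping into a String.
fn expand(mapping: [char; 3]) -> String {
    mapping.iter().filter(|&&ch| ch != '\0').collect()
}
```

With this layout, `expand(to_upper_sketch('\u{00DF}'))` gives `"SS"`, and each table entry costs a fixed 16 bytes, so only the exceptional characters pay for the wider representation.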
* Add “complex” mappings to `char::to_lowercase` and `char::to_uppercase`, making them sometimes yield more than one `char`: #25800. `str::to_lowercase` and `str::to_uppercase` are affected as well.
* Add `char::to_titlecase`, since it’s the same algorithm (just different data). However this does **not** add `str::to_titlecase`, as that would require UAX#29 Unicode Text Segmentation, which we decided not to include in `std`: rust-lang/rfcs#1054. I made `char::to_titlecase` immediately `#[stable]`, since it’s so similar to `char::to_uppercase`, which is already stable. Let me know if it should be `#[unstable]` for a while.
* Add a special case for upper-case Sigma in word-final position in `str::to_lowercase`: #26035. This is the only language-independent conditional mapping currently in `SpecialCasing.txt`.
* Stabilize `str::to_lowercase` and `str::to_uppercase`. The `&self -> String` signature on `str` seems straightforward enough, and the only relevant issue I’ve found is #24536 about naming. But `char` already has stable methods with the same name, and deprecating them for a rename doesn’t seem worth it.

r? @alexcrichton
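For illustration, the behaviour described in the first and third items can be observed through the stable `char` and `str` methods in present-day `std`. The helper names `upper`, `lower`, and `lower_str` are introduced only for this sketch, and `char::to_titlecase` is deliberately left out because its availability is not assumed here.

```rust
/// Collect the iterator returned by `char::to_uppercase` into a String.
/// The iterator may yield more than one char.
fn upper(c: char) -> String {
    c.to_uppercase().collect()
}

/// Collect the iterator returned by `char::to_lowercase` into a String.
fn lower(c: char) -> String {
    c.to_lowercase().collect()
}

/// `str::to_lowercase` additionally applies the word-final Sigma rule,
/// which needs context that a single char does not carry.
fn lower_str(s: &str) -> String {
    s.to_lowercase()
}
```

For example, `upper('ß')` is `"SS"` and `lower('İ')` is `"i\u{307}"` (two `char`s each), while `lower_str("ΟΔΥΣΣΕΥΣ")` is `"οδυσσευς"`: the Sigmas inside the word become `σ` and the final one becomes `ς`.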
Right now we always return an iterator over one character, but the iterator is being returned so one day we can return many characters. This has all yet to be implemented, and this issue will track this implementation.