-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update to Unicode 14.0 #89614
Update to Unicode 14.0 #89614
Conversation
The "Alphabetic" property in Unicode 14 grew too big for the bitset representation, panicking "cannot pack 264 into 8 bits". However, we were already choosing the skiplist for that anyway, so this doesn't need to be a hard failure. That panic is now a returned `Err`, and then in `emit_codepoints` we automatically defer to skiplist.
(rust-highfive has picked a reviewer for you, use r? to override) |
For comparison, here's the tool output with version 13 data:
And here's with version 14 data:
|
Sorry! I know for next time. Thank you for all the work you're doing upgrading unicode! |
There some unicode crates around, should they be updated too? |
@klensy sure, other crates will have to be updated individually. e.g. |
@bors r+ |
📌 Commit 459a7e3 has been approved by |
Update to Unicode 14.0 The Unicode Standard [announced Version 14.0](https://home.unicode.org/announcing-the-unicode-standard-version-14-0/) on September 14, 2021, and this pull request updates the generated tables in `core` accordingly. This did require a little prep-work in `unicode-table-generator`. First, rust-lang#81358 had modified the generated file instead of the tool, so that change is now reflected in the tool as well. Next, I found that the "Alphabetic" property in version 14 was panicking when generating a bitset, "cannot pack 264 into 8 bits". We've been using the skiplist for that anyway, so I changed this to fail gracefully. Finally, I confirmed that the tool still created the exact same tables for 13 before moving to 14.
…laumeGomez Rollup of 6 pull requests Successful merges: - rust-lang#75644 (Add 'core::array::from_fn' and 'core::array::try_from_fn') - rust-lang#87528 (stack overflow handler specific openbsd change.) - rust-lang#88436 (std: Stabilize command_access) - rust-lang#89614 (Update to Unicode 14.0) - rust-lang#89664 (Add documentation to boxed conversions) - rust-lang#89700 (Fix invalid HTML generation for higher bounds) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup
Maybe a tidy check should be added that compares the output of the tool to the checked-in file to make sure they're the same? |
Ideally there's some kind of #![autogen_code] metadata and rust analyzer
warns me I am altering autogenerated code.
…On Sat, 9 Oct 2021 at 20:05, bors ***@***.***> wrote:
Merged #89614 <#89614> into master.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#89614 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAGEJCAZ5OOBOPJFBGFV4I3UGCG65ANCNFSM5FQDVQ4A>
.
|
@Mark-Simulacrum would mingw-check be a good place to check the generated code? Or maybe a rust-highfive mention? |
That's a good idea; a rust-highfive ping should be quick to set up and sufficient since the file is not changed often. |
Highfive definitely seems like a good idea, someone can file a PR adding that rule somewhere here - https://github.com/rust-lang/highfive/blob/master/highfive/configs/rust-lang/rust.json#L130 I think running the script on each CI to check it's still in sync might be a bit much, though, seems not worth it to me. If we want something more than highfive, I would just fail CI if the file doesn't match a hardcoded hash (e.g., in tidy). Easy to update when needed and fast, as well as potentially more broadly useful (we have several such files, I think). |
Pkgsrc changes: * Adapt a couple of patches Upstream changes: Version 1.57.0 (2021-12-02) ========================== Language -------- - [Macro attributes may follow `#[derive]` and will see the original (pre-`cfg`) input.][87220] - [Accept curly-brace macros in expressions, like `m!{ .. }.method()` and `m!{ .. }?`.][88690] - [Allow panicking in constant evaluation.][89508] Compiler -------- - [Create more accurate debuginfo for vtables.][89597] - [Add `armv6k-nintendo-3ds` at Tier 3\*.][88529] - [Add `armv7-unknown-linux-uclibceabihf` at Tier 3\*.][88952] - [Add `m68k-unknown-linux-gnu` at Tier 3\*.][88321] - [Add SOLID targets at Tier 3\*:][86191] `aarch64-kmc-solid_asp3`, `armv7a-kmc-solid_asp3-eabi`, `armv7a-kmc-solid_asp3-eabihf` \* Refer to Rust's [platform support page][platform-support-doc] for more information on Rust's tiered platform support. Libraries --------- - [Avoid allocations and copying in `Vec::leak`][89337] - [Add `#[repr(i8)]` to `Ordering`][89507] - [Optimize `File::read_to_end` and `read_to_string`][89582] - [Update to Unicode 14.0][89614] - [Many more functions are marked `#[must_use]`][89692], producing a warning when ignoring their return value. This helps catch mistakes such as expecting a function to mutate a value in place rather than return a new value. Stabilised APIs --------------- - [`[T; N]::as_mut_slice`][`array::as_mut_slice`] - [`[T; N]::as_slice`][`array::as_slice`] - [`collections::TryReserveError`] - [`HashMap::try_reserve`] - [`HashSet::try_reserve`] - [`String::try_reserve`] - [`String::try_reserve_exact`] - [`Vec::try_reserve`] - [`Vec::try_reserve_exact`] - [`VecDeque::try_reserve`] - [`VecDeque::try_reserve_exact`] - [`Iterator::map_while`] - [`iter::MapWhile`] - [`proc_macro::is_available`] - [`Command::get_program`] - [`Command::get_args`] - [`Command::get_envs`] - [`Command::get_current_dir`] - [`CommandArgs`] - [`CommandEnvs`] These APIs are now usable in const contexts: - [`hint::unreachable_unchecked`] Cargo ----- - [Stabilize custom profiles][cargo/9943] Compatibility notes ------------------- Internal changes ---------------- These changes provide no direct user facing benefits, but represent significant improvements to the internals and overall performance of rustc and related tools. - [Added an experimental backend for codegen with `libgccjit`.][87260] [86191]: rust-lang/rust#86191 [87220]: rust-lang/rust#87220 [87260]: rust-lang/rust#87260 [88243]: rust-lang/rust#88243 [88321]: rust-lang/rust#88321 [88529]: rust-lang/rust#88529 [88690]: rust-lang/rust#88690 [88952]: rust-lang/rust#88952 [89337]: rust-lang/rust#89337 [89507]: rust-lang/rust#89507 [89508]: rust-lang/rust#89508 [89582]: rust-lang/rust#89582 [89597]: rust-lang/rust#89597 [89614]: rust-lang/rust#89614 [89692]: rust-lang/rust#89692 [cargo/9943]: rust-lang/cargo#9943 [`array::as_mut_slice`]: https://doc.rust-lang.org/std/primitive.array.html#method.as_mut_slice [`array::as_slice`]: https://doc.rust-lang.org/std/primitive.array.html#method.as_slice [`collections::TryReserveError`]: https://doc.rust-lang.org/std/collections/struct.TryReserveError.html [`HashMap::try_reserve`]: https://doc.rust-lang.org/std/collections/hash_map/struct.HashMap.html#method.try_reserve [`HashSet::try_reserve`]: https://doc.rust-lang.org/std/collections/hash_set/struct.HashSet.html#method.try_reserve [`String::try_reserve`]: https://doc.rust-lang.org/alloc/string/struct.String.html#method.try_reserve [`String::try_reserve_exact`]: https://doc.rust-lang.org/alloc/string/struct.String.html#method.try_reserve_exact [`Vec::try_reserve`]: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.try_reserve [`Vec::try_reserve_exact`]: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.try_reserve_exact [`VecDeque::try_reserve`]: https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.try_reserve [`VecDeque::try_reserve_exact`]: https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.try_reserve_exact [`Iterator::map_while`]: https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.map_while [`iter::MapWhile`]: https://doc.rust-lang.org/std/iter/struct.MapWhile.html [`proc_macro::is_available`]: https://doc.rust-lang.org/proc_macro/fn.is_available.html [`Command::get_program`]: https://doc.rust-lang.org/std/process/struct.Command.html#method.get_program [`Command::get_args`]: https://doc.rust-lang.org/std/process/struct.Command.html#method.get_args [`Command::get_envs`]: https://doc.rust-lang.org/std/process/struct.Command.html#method.get_envs [`Command::get_current_dir`]: https://doc.rust-lang.org/std/process/struct.Command.html#method.get_current_dir [`CommandArgs`]: https://doc.rust-lang.org/std/process/struct.CommandArgs.html [`CommandEnvs`]: https://doc.rust-lang.org/std/process/struct.CommandEnvs.html
Pkgsrc changes: * Adjust line numbers in a number of patches * remove the --disable-dist-src option, so that we produce the rust-src rust component, which we upload to LOCALSRC to allow the rust-src package to build, which is needed for rust-analyzer. * Cargo checksum for vendor/cc no longer needs patching; checksum for vendor/libc updated Upstream changes: Version 1.57.0 (2021-12-02) ========================== Language -------- - [Macro attributes may follow `#[derive]` and will see the original (pre-`cfg`) input.][87220] - [Accept curly-brace macros in expressions, like `m!{ .. }.method()` and `m!{ .. }?`.][88690] - [Allow panicking in constant evaluation.][89508] Compiler -------- - [Create more accurate debuginfo for vtables.][89597] - [Add `armv6k-nintendo-3ds` at Tier 3\*.][88529] - [Add `armv7-unknown-linux-uclibceabihf` at Tier 3\*.][88952] - [Add `m68k-unknown-linux-gnu` at Tier 3\*.][88321] - [Add SOLID targets at Tier 3\*:][86191] `aarch64-kmc-solid_asp3`, `armv7a-kmc-solid_asp3-eabi`, `armv7a-kmc-solid_asp3-eabihf` \* Refer to Rust's [platform support page][platform-support-doc] for more information on Rust's tiered platform support. Libraries --------- - [Avoid allocations and copying in `Vec::leak`][89337] - [Add `#[repr(i8)]` to `Ordering`][89507] - [Optimize `File::read_to_end` and `read_to_string`][89582] - [Update to Unicode 14.0][89614] - [Many more functions are marked `#[must_use]`][89692], producing a warning when ignoring their return value. This helps catch mistakes such as expecting a function to mutate a value in place rather than return a new value. Stabilised APIs --------------- - [`[T; N]::as_mut_slice`][`array::as_mut_slice`] - [`[T; N]::as_slice`][`array::as_slice`] - [`collections::TryReserveError`] - [`HashMap::try_reserve`] - [`HashSet::try_reserve`] - [`String::try_reserve`] - [`String::try_reserve_exact`] - [`Vec::try_reserve`] - [`Vec::try_reserve_exact`] - [`VecDeque::try_reserve`] - [`VecDeque::try_reserve_exact`] - [`Iterator::map_while`] - [`iter::MapWhile`] - [`proc_macro::is_available`] - [`Command::get_program`] - [`Command::get_args`] - [`Command::get_envs`] - [`Command::get_current_dir`] - [`CommandArgs`] - [`CommandEnvs`] These APIs are now usable in const contexts: - [`hint::unreachable_unchecked`] Cargo ----- - [Stabilize custom profiles][cargo/9943] Compatibility notes ------------------- Internal changes ---------------- These changes provide no direct user facing benefits, but represent significant improvements to the internals and overall performance of rustc and related tools. - [Added an experimental backend for codegen with `libgccjit`.][87260] [86191]: rust-lang/rust#86191 [87220]: rust-lang/rust#87220 [87260]: rust-lang/rust#87260 [88243]: rust-lang/rust#88243 [88321]: rust-lang/rust#88321 [88529]: rust-lang/rust#88529 [88690]: rust-lang/rust#88690 [88952]: rust-lang/rust#88952 [89337]: rust-lang/rust#89337 [89507]: rust-lang/rust#89507 [89508]: rust-lang/rust#89508 [89582]: rust-lang/rust#89582 [89597]: rust-lang/rust#89597 [89614]: rust-lang/rust#89614 [89692]: rust-lang/rust#89692 [cargo/9943]: rust-lang/cargo#9943 [`array::as_mut_slice`]: https://doc.rust-lang.org/std/primitive.array.html#method.as_mut_slice [`array::as_slice`]: https://doc.rust-lang.org/std/primitive.array.html#method.as_slice [`collections::TryReserveError`]: https://doc.rust-lang.org/std/collections/struct.TryReserveError.html [`HashMap::try_reserve`]: https://doc.rust-lang.org/std/collections/hash_map/struct.HashMap.html#method.try_reserve [`HashSet::try_reserve`]: https://doc.rust-lang.org/std/collections/hash_set/struct.HashSet.html#method.try_reserve [`String::try_reserve`]: https://doc.rust-lang.org/alloc/string/struct.String.html#method.try_reserve [`String::try_reserve_exact`]: https://doc.rust-lang.org/alloc/string/struct.String.html#method.try_reserve_exact [`Vec::try_reserve`]: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.try_reserve [`Vec::try_reserve_exact`]: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.try_reserve_exact [`VecDeque::try_reserve`]: https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.try_reserve [`VecDeque::try_reserve_exact`]: https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.try_reserve_exact [`Iterator::map_while`]: https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.map_while [`iter::MapWhile`]: https://doc.rust-lang.org/std/iter/struct.MapWhile.html [`proc_macro::is_available`]: https://doc.rust-lang.org/proc_macro/fn.is_available.html [`Command::get_program`]: https://doc.rust-lang.org/std/process/struct.Command.html#method.get_program [`Command::get_args`]: https://doc.rust-lang.org/std/process/struct.Command.html#method.get_args [`Command::get_envs`]: https://doc.rust-lang.org/std/process/struct.Command.html#method.get_envs [`Command::get_current_dir`]: https://doc.rust-lang.org/std/process/struct.Command.html#method.get_current_dir [`CommandArgs`]: https://doc.rust-lang.org/std/process/struct.CommandArgs.html [`CommandEnvs`]: https://doc.rust-lang.org/std/process/struct.CommandEnvs.html
The Unicode Standard announced Version 14.0 on September 14, 2021, and this pull request updates the generated tables in
core
accordingly.This did require a little prep-work in
unicode-table-generator
. First, #81358 had modified the generated file instead of the tool, so that change is now reflected in the tool as well. Next, I found that the "Alphabetic" property in version 14 was panicking when generating a bitset, "cannot pack 264 into 8 bits". We've been using the skiplist for that anyway, so I changed this to fail gracefully. Finally, I confirmed that the tool still created the exact same tables for 13 before moving to 14.