Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid allocations and copying in Vec::leak #89337

Merged
merged 4 commits into from
Oct 15, 2021
Merged

Conversation

mbrubeck
Copy link
Contributor

@mbrubeck mbrubeck commented Sep 28, 2021

The Vec::leak method (#62195) is currently implemented by calling Vec::into_boxed_slice and Box::leak. This shrinks the vector before leaking it, which potentially causes a reallocation and copies the vector's contents.

By avoiding the conversion to Box, we can instead leak the vector without any expensive operations, just by returning a slice reference and forgetting the Vec. Users who want to shrink the vector first can still do so by calling shrink_to_fit explicitly.

Note: This could break code that uses Box::from_raw to “un-leak” the slice returned by Vec::leak. However, the Vec::leak docs explicitly forbid this, so such code is already incorrect.

Don't shrink the Vec (by calling into_boxed_slice) before leaking it.
@mbrubeck mbrubeck added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Sep 28, 2021
@rust-highfive
Copy link
Collaborator

r? @dtolnay

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 28, 2021
@mbrubeck
Copy link
Contributor Author

This was discussed previously at: rust-lang/rfcs#2969 (review)

@cuviper
Copy link
Member

cuviper commented Sep 28, 2021

Note: This could break code that uses Box::from_raw to “un-leak” the slice returned by Vec::leak. However, the Vec::leak docs explicitly forbid this, so such code is already incorrect.

That clause was added in #77709, but saying "there is no way" is a pretty casual way to forbid this IMO.

I think we should make an explicit guarantee about the capacity here, so there would be a way to use Vec::from_raw_parts. Either we do shrink and you can use len == capacity, or we don't shrink and you can call capacity() beforehand. If we don't shrink, the point about data movement is also a useful performance guarantee to document.

@mbrubeck
Copy link
Contributor Author

mbrubeck commented Sep 28, 2021

I pushed a new commit updating the docs to say:

This method does not reallocate or shrink the Vec, so the leaked allocation may include unused capacity that is not part of the returned slice. Unsafe code that later reconstructs or deallocates the Vec (for example, by calling Vec::from_raw_parts) must keep track of the original capacity.

(This replaces the paragraph that said “there is no way to recover the leaked memory.“)

I don't know if we need to call it out in the docs, but of course anyone who wants the len == capacity guarantee can get it by calling shrink_to_fit before leak.

@cuviper
Copy link
Member

cuviper commented Sep 29, 2021

I'm happy with that, but @rust-lang/libs-api should discuss that as a new API guarantee.

@cuviper cuviper added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Sep 29, 2021
@dtolnay dtolnay removed the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Sep 29, 2021
@dtolnay
Copy link
Member

dtolnay commented Sep 29, 2021

I was not involved in the discussion over at rust-lang/rfcs#2969, but I think this is pretty clearly the right behavior, for the reasons described by @m-ou-se in rust-lang/rfcs#2969 (review).

  • Using "leak" to mean "leak A COPY OF, but destroy the argument" rather than "leak the argument" is uncharacteristic.

  • If leak does not shrink, getting shrink behavior is as simple as vec.shrink_to_fit(); vec.leak(), or Box::leak(vec.into_boxed_slice()). If leak does shrink, getting the non-shrinking behavior requires the user to write unsafe code.

I'll add one more:

  • If the user has built a data structure containing pointers into the vector, it's reasonable to want to be able to leak without all those pointers getting invalidated.

@rfcbot fcp merge

@rfcbot
Copy link

rfcbot commented Sep 29, 2021

Team member @dtolnay has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Sep 29, 2021
@rfcbot
Copy link

rfcbot commented Sep 29, 2021

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels Sep 29, 2021
@jplatte
Copy link
Contributor

jplatte commented Sep 30, 2021

I pushed a new commit updating the docs to say:

This method does not reallocate or shrink the Vec, so the leaked allocation may include unused capacity that is not part of the returned slice. Unsafe code that later reconstructs or deallocates the Vec (for example, by calling Vec::from_raw_parts) must keep track of the original capacity.

(This replaces the paragraph that said “there is no way to recover the leaked memory.“)

Maybe that should be prefaced by "As of "? IME it's not that uncommon to look at the "regular" stable docs but use an older compiler / std.

@mbrubeck
Copy link
Contributor Author

IME it's not that uncommon to look at the "regular" stable docs but use an older compiler / std.

Hmm, it's even worse than that. Even if the author of the code is using the most recent toolchain, there's no easy way to prevent users of the code from compiling it with an older toolchain. For example, if a published library contains unsafe code that relies on this guarantee, it would have undefined behavior when any project using the library is built with an older toolchain, where the guarantee does not hold.

This seems bad enough that we might want to avoid making any guarantee at all that doesn't apply across toolchain versions. But if we do document this guarantee, I agree that we need to document when it changed.

@jplatte
Copy link
Contributor

jplatte commented Sep 30, 2021

Well, once 1.56 lands, people will be able to prevent compilation using an older rustc, at least in Cargo projects, through rust-version in Cargo.toml :)

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. to-announce Announce this issue on triage meeting and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Oct 9, 2021
@rfcbot
Copy link

rfcbot commented Oct 9, 2021

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

@RustyYato
Copy link
Contributor

Note: if the len != cap then the leaked slice doesn't have provenance over the unused capacity. So reconstructing via from_raw_parts is unsound. This should be explicitly documented.

cc @RalfJung

@m-ou-se m-ou-se added the relnotes Marks issues that should be documented in the release notes of the next release. label Oct 12, 2021
@m-ou-se
Copy link
Member

m-ou-se commented Oct 12, 2021

Note: if the len != cap then the leaked slice doesn't have provenance over the unused capacity. So reconstructing via from_raw_parts is unsound. This should be explicitly documented.

For now I've removed the note that suggests using from_raw_parts to reconstruct a leaked Vec is okay.

Detailed documentation on whether/when/how a leaked Vec can be 'unleaked' can be added later.

@m-ou-se
Copy link
Member

m-ou-se commented Oct 12, 2021

@bors r+

@bors
Copy link
Contributor

bors commented Oct 12, 2021

📌 Commit df15b28 has been approved by m-ou-se

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 12, 2021
the8472 added a commit to the8472/rust that referenced this pull request Oct 13, 2021
Avoid allocations and copying in Vec::leak

The [`Vec::leak`] method (rust-lang#62195) is currently implemented by calling `Vec::into_boxed_slice` and `Box::leak`.  This shrinks the vector before leaking it, which potentially causes a reallocation and copies the vector's contents.

By avoiding the conversion to `Box`, we can instead leak the vector without any expensive operations, just by returning a slice reference and forgetting the `Vec`.  Users who *want* to shrink the vector first can still do so by calling `shrink_to_fit` explicitly.

**Note:**  This could break code that uses `Box::from_raw` to “un-leak” the slice returned by `Vec::leak`.  However, the `Vec::leak` docs explicitly forbid this, so such code is already incorrect.

[`Vec::leak`]: https://doc.rust-lang.org/stable/std/vec/struct.Vec.html#method.leak
@matthiaskrgr
Copy link
Member

@bors rollup=never likely has some perf effects

@bors
Copy link
Contributor

bors commented Oct 15, 2021

⌛ Testing commit df15b28 with merge 265fef4...

@bors
Copy link
Contributor

bors commented Oct 15, 2021

☀️ Test successful - checks-actions
Approved by: m-ou-se
Pushing 265fef4 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Oct 15, 2021
@bors bors merged commit 265fef4 into rust-lang:master Oct 15, 2021
@rustbot rustbot added this to the 1.57.0 milestone Oct 15, 2021
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (265fef4): comparison url.

Summary: This benchmark run did not return any relevant changes.

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Oct 21, 2021
wip-sync pushed a commit to NetBSD/pkgsrc-wip that referenced this pull request Dec 6, 2021
Pkgsrc changes:
 * Adapt a couple of patches

Upstream changes:

Version 1.57.0 (2021-12-02)
==========================

Language
--------

- [Macro attributes may follow `#[derive]` and will see the original
  (pre-`cfg`) input.][87220]
- [Accept curly-brace macros in expressions, like `m!{ .. }.method()`
  and `m!{ .. }?`.][88690]
- [Allow panicking in constant evaluation.][89508]

Compiler
--------

- [Create more accurate debuginfo for vtables.][89597]
- [Add `armv6k-nintendo-3ds` at Tier 3\*.][88529]
- [Add `armv7-unknown-linux-uclibceabihf` at Tier 3\*.][88952]
- [Add `m68k-unknown-linux-gnu` at Tier 3\*.][88321]
- [Add SOLID targets at Tier 3\*:][86191] `aarch64-kmc-solid_asp3`,
  `armv7a-kmc-solid_asp3-eabi`, `armv7a-kmc-solid_asp3-eabihf`

\* Refer to Rust's [platform support page][platform-support-doc] for more
   information on Rust's tiered platform support.

Libraries
---------

- [Avoid allocations and copying in `Vec::leak`][89337]
- [Add `#[repr(i8)]` to `Ordering`][89507]
- [Optimize `File::read_to_end` and `read_to_string`][89582]
- [Update to Unicode 14.0][89614]
- [Many more functions are marked `#[must_use]`][89692], producing a warning
  when ignoring their return value. This helps catch mistakes such as expecting
  a function to mutate a value in place rather than return a new value.

Stabilised APIs
---------------

- [`[T; N]::as_mut_slice`][`array::as_mut_slice`]
- [`[T; N]::as_slice`][`array::as_slice`]
- [`collections::TryReserveError`]
- [`HashMap::try_reserve`]
- [`HashSet::try_reserve`]
- [`String::try_reserve`]
- [`String::try_reserve_exact`]
- [`Vec::try_reserve`]
- [`Vec::try_reserve_exact`]
- [`VecDeque::try_reserve`]
- [`VecDeque::try_reserve_exact`]
- [`Iterator::map_while`]
- [`iter::MapWhile`]
- [`proc_macro::is_available`]
- [`Command::get_program`]
- [`Command::get_args`]
- [`Command::get_envs`]
- [`Command::get_current_dir`]
- [`CommandArgs`]
- [`CommandEnvs`]

These APIs are now usable in const contexts:

- [`hint::unreachable_unchecked`]

Cargo
-----

- [Stabilize custom profiles][cargo/9943]

Compatibility notes
-------------------

Internal changes
----------------
These changes provide no direct user facing benefits, but represent significant
improvements to the internals and overall performance of rustc
and related tools.

- [Added an experimental backend for codegen with `libgccjit`.][87260]

[86191]: rust-lang/rust#86191
[87220]: rust-lang/rust#87220
[87260]: rust-lang/rust#87260
[88243]: rust-lang/rust#88243
[88321]: rust-lang/rust#88321
[88529]: rust-lang/rust#88529
[88690]: rust-lang/rust#88690
[88952]: rust-lang/rust#88952
[89337]: rust-lang/rust#89337
[89507]: rust-lang/rust#89507
[89508]: rust-lang/rust#89508
[89582]: rust-lang/rust#89582
[89597]: rust-lang/rust#89597
[89614]: rust-lang/rust#89614
[89692]: rust-lang/rust#89692
[cargo/9943]: rust-lang/cargo#9943
[`array::as_mut_slice`]: https://doc.rust-lang.org/std/primitive.array.html#method.as_mut_slice
[`array::as_slice`]: https://doc.rust-lang.org/std/primitive.array.html#method.as_slice
[`collections::TryReserveError`]: https://doc.rust-lang.org/std/collections/struct.TryReserveError.html
[`HashMap::try_reserve`]: https://doc.rust-lang.org/std/collections/hash_map/struct.HashMap.html#method.try_reserve
[`HashSet::try_reserve`]: https://doc.rust-lang.org/std/collections/hash_set/struct.HashSet.html#method.try_reserve
[`String::try_reserve`]: https://doc.rust-lang.org/alloc/string/struct.String.html#method.try_reserve
[`String::try_reserve_exact`]: https://doc.rust-lang.org/alloc/string/struct.String.html#method.try_reserve_exact
[`Vec::try_reserve`]: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.try_reserve
[`Vec::try_reserve_exact`]: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.try_reserve_exact
[`VecDeque::try_reserve`]: https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.try_reserve
[`VecDeque::try_reserve_exact`]: https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.try_reserve_exact
[`Iterator::map_while`]: https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.map_while
[`iter::MapWhile`]: https://doc.rust-lang.org/std/iter/struct.MapWhile.html
[`proc_macro::is_available`]: https://doc.rust-lang.org/proc_macro/fn.is_available.html
[`Command::get_program`]: https://doc.rust-lang.org/std/process/struct.Command.html#method.get_program
[`Command::get_args`]: https://doc.rust-lang.org/std/process/struct.Command.html#method.get_args
[`Command::get_envs`]: https://doc.rust-lang.org/std/process/struct.Command.html#method.get_envs
[`Command::get_current_dir`]: https://doc.rust-lang.org/std/process/struct.Command.html#method.get_current_dir
[`CommandArgs`]: https://doc.rust-lang.org/std/process/struct.CommandArgs.html
[`CommandEnvs`]: https://doc.rust-lang.org/std/process/struct.CommandEnvs.html
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Jan 22, 2022
Pkgsrc changes:
 * Adjust line numbers in a number of patches
 * remove the --disable-dist-src option, so that we produce
   the rust-src rust component, which we upload to LOCALSRC
   to allow the rust-src package to build, which is needed
   for rust-analyzer.
 * Cargo checksum for vendor/cc no longer needs patching;
   checksum for vendor/libc updated

Upstream changes:

Version 1.57.0 (2021-12-02)
==========================

Language
--------

- [Macro attributes may follow `#[derive]` and will see the original
  (pre-`cfg`) input.][87220]
- [Accept curly-brace macros in expressions, like `m!{ .. }.method()`
  and `m!{ .. }?`.][88690]
- [Allow panicking in constant evaluation.][89508]

Compiler
--------

- [Create more accurate debuginfo for vtables.][89597]
- [Add `armv6k-nintendo-3ds` at Tier 3\*.][88529]
- [Add `armv7-unknown-linux-uclibceabihf` at Tier 3\*.][88952]
- [Add `m68k-unknown-linux-gnu` at Tier 3\*.][88321]
- [Add SOLID targets at Tier 3\*:][86191] `aarch64-kmc-solid_asp3`,
  `armv7a-kmc-solid_asp3-eabi`, `armv7a-kmc-solid_asp3-eabihf`

\* Refer to Rust's [platform support page][platform-support-doc] for more
   information on Rust's tiered platform support.

Libraries
---------

- [Avoid allocations and copying in `Vec::leak`][89337]
- [Add `#[repr(i8)]` to `Ordering`][89507]
- [Optimize `File::read_to_end` and `read_to_string`][89582]
- [Update to Unicode 14.0][89614]
- [Many more functions are marked `#[must_use]`][89692], producing a warning
  when ignoring their return value. This helps catch mistakes such as expecting
  a function to mutate a value in place rather than return a new value.

Stabilised APIs
---------------

- [`[T; N]::as_mut_slice`][`array::as_mut_slice`]
- [`[T; N]::as_slice`][`array::as_slice`]
- [`collections::TryReserveError`]
- [`HashMap::try_reserve`]
- [`HashSet::try_reserve`]
- [`String::try_reserve`]
- [`String::try_reserve_exact`]
- [`Vec::try_reserve`]
- [`Vec::try_reserve_exact`]
- [`VecDeque::try_reserve`]
- [`VecDeque::try_reserve_exact`]
- [`Iterator::map_while`]
- [`iter::MapWhile`]
- [`proc_macro::is_available`]
- [`Command::get_program`]
- [`Command::get_args`]
- [`Command::get_envs`]
- [`Command::get_current_dir`]
- [`CommandArgs`]
- [`CommandEnvs`]

These APIs are now usable in const contexts:

- [`hint::unreachable_unchecked`]

Cargo
-----

- [Stabilize custom profiles][cargo/9943]

Compatibility notes
-------------------

Internal changes
----------------
These changes provide no direct user facing benefits, but represent significant
improvements to the internals and overall performance of rustc
and related tools.

- [Added an experimental backend for codegen with `libgccjit`.][87260]

[86191]: rust-lang/rust#86191
[87220]: rust-lang/rust#87220
[87260]: rust-lang/rust#87260
[88243]: rust-lang/rust#88243
[88321]: rust-lang/rust#88321
[88529]: rust-lang/rust#88529
[88690]: rust-lang/rust#88690
[88952]: rust-lang/rust#88952
[89337]: rust-lang/rust#89337
[89507]: rust-lang/rust#89507
[89508]: rust-lang/rust#89508
[89582]: rust-lang/rust#89582
[89597]: rust-lang/rust#89597
[89614]: rust-lang/rust#89614
[89692]: rust-lang/rust#89692
[cargo/9943]: rust-lang/cargo#9943
[`array::as_mut_slice`]: https://doc.rust-lang.org/std/primitive.array.html#method.as_mut_slice
[`array::as_slice`]: https://doc.rust-lang.org/std/primitive.array.html#method.as_slice
[`collections::TryReserveError`]: https://doc.rust-lang.org/std/collections/struct.TryReserveError.html
[`HashMap::try_reserve`]: https://doc.rust-lang.org/std/collections/hash_map/struct.HashMap.html#method.try_reserve
[`HashSet::try_reserve`]: https://doc.rust-lang.org/std/collections/hash_set/struct.HashSet.html#method.try_reserve
[`String::try_reserve`]: https://doc.rust-lang.org/alloc/string/struct.String.html#method.try_reserve
[`String::try_reserve_exact`]: https://doc.rust-lang.org/alloc/string/struct.String.html#method.try_reserve_exact
[`Vec::try_reserve`]: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.try_reserve
[`Vec::try_reserve_exact`]: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.try_reserve_exact
[`VecDeque::try_reserve`]: https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.try_reserve
[`VecDeque::try_reserve_exact`]: https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.try_reserve_exact
[`Iterator::map_while`]: https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.map_while
[`iter::MapWhile`]: https://doc.rust-lang.org/std/iter/struct.MapWhile.html
[`proc_macro::is_available`]: https://doc.rust-lang.org/proc_macro/fn.is_available.html
[`Command::get_program`]: https://doc.rust-lang.org/std/process/struct.Command.html#method.get_program
[`Command::get_args`]: https://doc.rust-lang.org/std/process/struct.Command.html#method.get_args
[`Command::get_envs`]: https://doc.rust-lang.org/std/process/struct.Command.html#method.get_envs
[`Command::get_current_dir`]: https://doc.rust-lang.org/std/process/struct.Command.html#method.get_current_dir
[`CommandArgs`]: https://doc.rust-lang.org/std/process/struct.CommandArgs.html
[`CommandEnvs`]: https://doc.rust-lang.org/std/process/struct.CommandEnvs.html
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. merged-by-bors This PR was explicitly merged by bors. relnotes Marks issues that should be documented in the release notes of the next release. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.