Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fastest way to initialize a vector is not documented #54628

Closed
Shnatsel opened this issue Sep 28, 2018 · 12 comments
Closed

Fastest way to initialize a vector is not documented #54628

Shnatsel opened this issue Sep 28, 2018 · 12 comments
Labels
A-docs Area: Documentation for any part of the project, including the compiler, standard library, and tools

Comments

@Shnatsel
Copy link
Member

There are at least thee distinct ways to create a zero-filled vector with a certain capacity:

// resize
let mut vec1 = Vec::with_capacity(len);
vec1.resize(len, 0);
// extend
let mut vec2 = Vec::with_capacity(len);
vec2.extend(repeat(0).take(len))
// vec! macro
let mut vec3 = vec![0; len];

Despite the latter being the most concise one, other solutions also show up in real-world code. The performance characteristics of these solutions are not obvious, and not documented - at least on the Vec page in stdlib reference.

This has led to an actual security vulnerability in Claxon crate, now known as RUSTSEC-2018-0004. On some malformed inputs contents of uninitialized memory would show up in the output. See the original bug report or the security advisory for more details.

Like most binary format decoders, Claxon writes into a preallocated buffer. Memory unsafety that led to the vulnerability was introduced to speed up initialization of the vector. Initialization was originally performed like this: buffer.extend(repeat(0).take(new_len - len)), but was replaced with unsafe { buffer.set_len(new_len); } for performance (see relevant commit).

Anything that desugars into RawVec::with_capacity_zeroed() is dramatically more efficient than .extend(), at least on Linux. One public wrapper for this function is vec! macro. Replacing vec::with_capacity(new_len); unsafe { buffer.set_len(new_len); } with vec![0; new_len]; created no measurable performance difference in Claxon, eliminating the need to unsafe code.

I believe that clearly documenting that the vec! macro has special behavior when given 0 as argument and is dramatically more efficient than other means of initialization (at least on some platforms) would have prevented this vulnerability.

@Speedy37
Copy link

Shouldn't vec1.resize(len, 0) be as fast as vec! if vec1.len() is 0 in order to provide consistent performance ?
Probably by optimizing Vec::extend_with directly.

@Shnatsel
Copy link
Member Author

I do not believe that's possible, since vec![0; len]; requests zeroed memory from the OS at the allocation time. So once you've called Vec::with_capacity(len); you cannot go back to the fast path. I'd love to be proven wrong though.

@futile
Copy link
Contributor

futile commented Sep 28, 2018

with_capacity + set_len -> vec![0, len] also sounds like a possible lint for clippy.
Edit: actually all of the slower variants could be lints (in addition to being better documented)

@Havvy Havvy added the A-docs Area: Documentation for any part of the project, including the compiler, standard library, and tools label Sep 28, 2018
@ishitatsuyuki
Copy link
Contributor

Clippy thread rust-lang/rust-clippy#3237 opened.

@Speedy37
Copy link

Vec::new() then vec.resize(len, 0) can be fast though.

@mandeep
Copy link
Contributor

mandeep commented Oct 4, 2018

I came across a similar issue today about the most efficient way to initialize a vector so I'm happy I ran into this. I'd be happy to submit a PR with a couple of doc comments in vec.rs that describes the situation if that's okay.

pietroalbini added a commit to pietroalbini/rust that referenced this issue Oct 9, 2018
Add doc comments about safest way to initialize a vector of zeros

This adds more information about the vec! macro as discussed in rust-lang#54628. I think this is a good starting point, but I think additional detail is needed so that we can explain why vec! is safer than the alternatives.
pietroalbini added a commit to pietroalbini/rust that referenced this issue Oct 10, 2018
Add doc comments about safest way to initialize a vector of zeros

This adds more information about the vec! macro as discussed in rust-lang#54628. I think this is a good starting point, but I think additional detail is needed so that we can explain why vec! is safer than the alternatives.
Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this issue Oct 11, 2018
Add doc comments about safest way to initialize a vector of zeros

This adds more information about the vec! macro as discussed in rust-lang#54628. I think this is a good starting point, but I think additional detail is needed so that we can explain why vec! is safer than the alternatives.
kennytm added a commit to kennytm/rust that referenced this issue Oct 12, 2018
Add doc comments about safest way to initialize a vector of zeros

This adds more information about the vec! macro as discussed in rust-lang#54628. I think this is a good starting point, but I think additional detail is needed so that we can explain why vec! is safer than the alternatives.
@Shnatsel
Copy link
Member Author

The linked PR has been merged. Should I close this issue?

@Pzixel
Copy link

Pzixel commented May 20, 2019

@Shnatsel How do I init vector with some zeroed newtype? For example, std::num::Wrapping(u32)?

@steveklabnik
Copy link
Member

@Pzixel please ask questions on users.rust-lang.org, rather than on closed bugs. Thanks!

@Pzixel
Copy link

Pzixel commented May 20, 2019

I meant this one introduces special case or 0 but it doesn't work if zero is wrapped in a newtype, even standard one. I didn't want to ask suggestion or something, just wanted to clarify that this issue wasn't completely resolved.

But now I see you think it's an inappropriate place to suggest this so I'm going to the urlo. Thanks for clarification.

@ishitatsuyuki
Copy link
Contributor

IMO asking for clarification in this case is fine, although you probably want to avoid mentioning someone particularly.

The special case for zero is implemented with specialization of IsZero private trait, which is implemented for a lot of numeric types but unfortunately not the wrapping types.

impl<T: Clone + IsZero> SpecFromElem for T {

@Pzixel
Copy link

Pzixel commented May 20, 2019

@ishitatsuyuki great! Gonna create an issue (or even PR) then. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-docs Area: Documentation for any part of the project, including the compiler, standard library, and tools
Projects
None yet
Development

No branches or pull requests

9 participants