Speed up read_to_end helper function using buffer capacity for read size #35844

Conversation

@Mark-Simulacrum (Member) commented Aug 20, 2016

The initial read size is the buffer's capacity or 16 bytes, whichever is larger, and it doubles with each subsequent read.

Fixes #35823.

@rust-highfive (Collaborator)

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@Mark-Simulacrum changed the title from "Call Read's read_to_end method instead of the helper function." to "Speed up read_to_end helper function using buffer capacity for read size" on Aug 20, 2016
@alexcrichton (Member)

Thanks for the PR! This is a slightly different fix than what I had in mind, though. I think that calling self.read_to_end here will be good to do in any case, and then this would just be an optimization on top of that?

In the specific case of file.read_to_string, I believe this change won't be necessary if we call self.read_to_end in that location.

@Mark-Simulacrum (Member, Author)

@alexcrichton The comment just above that line says that calling self.read_to_end won't be safe, since an arbitrary Read implementation could alter more than just the end of the buffer and thereby introduce non-UTF-8 bytes into the String.

@alexcrichton (Member)

Bah, right! I believe, though, that we could implement read_to_string on types like File as well, since we know the implementation won't frob the existing contents of the buffer.

I'm somewhat hesitant to change the behavior of read_to_end as it's been pretty carefully optimized over time, so I'm basically just wondering if we can get the same benefits through some methods on File directly for now.

@Mark-Simulacrum (Member, Author) commented Aug 21, 2016

With my changes, these are the results for a file of 20971520 bytes. The alloc variant calls String::new(); the other uses String::with_capacity(20971520).

test read_large_file       ... bench:  48,656,818 ns/iter (+/- 132,591)
test read_large_file_alloc ... bench:  32,471,488 ns/iter (+/- 47,191)

With the old code:

test read_large_file       ... bench:  17,332,842 ns/iter (+/- 20,665)
test read_large_file_alloc ... bench:  17,593,405 ns/iter (+/- 28,095)

@alexcrichton (Member)

@Mark-Simulacrum if I'm reading that right, it looks like the changes here are 2x slower?

@Mark-Simulacrum (Member, Author)

Nearly 3x, but yes. I'm thinking we should close this, since I can't see a clear path forward, though I'd be happy to try implementing something different if it's suggested. Perhaps the idea of special-casing File and a few other structs' Read::read_to_string methods is worth looking into.

@alexcrichton (Member)

Ok, sounds good to me. And yeah, if you want to explore that route, it seems good to me!
