Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

read_to_end rezeroes the internal buffer many times #2544

Closed
talchas opened this issue May 18, 2020 · 9 comments · Fixed by #2560
Closed

read_to_end rezeroes the internal buffer many times #2544

talchas opened this issue May 18, 2020 · 9 comments · Fixed by #2560
Labels
A-tokio Area: The main tokio crate C-bug Category: This is a bug. I-slow Problems and improvements related to performance. M-io Module: tokio/io T-performance Topic: performance and benchmarks

Comments

@talchas
Copy link

talchas commented May 18, 2020

Version

tokio master f480659
tokio-macros 0.2.5

Platform

Linux 4.19.82-gentoo x86_64

Description

read_to_end / ReadToEnd::poll / read_to_end_internal don't keep track of what part of the unused Vec has been initialized across poll() calls, leading to tons of unnecessary zeroing. (The code adapted from std's read_to_end keeps track of this, but only within a single poll() call)

The original repro from a discord question for linux (create a ~200 mb file in /dev/shm/input.txt):

use futures::stream::{StreamExt, FuturesUnordered};
use tokio::process::{Command};
use std::fs::File;

async fn cat(file_name: &str) -> Result<std::process::Output, std::io::Error> {
    let file = File::open(file_name)?;
    let handle = Command::new("cat")
        .arg("-")
        .kill_on_drop(true)
        .stdin(file)
        .output();

    return handle.await;
}

const N: usize = 4;

#[tokio::main]
async fn main() -> Result<(), std::io::Error> {
    let mut children = FuturesUnordered::new();

    for i in 0..N {
        println!("Spawning child number {}", i);
        children.push(cat("/dev/shm/input.txt"));
    }

    while let Some(ret) = children.next().await {
        match ret {
            Err(e) => println!("{:?}", e),
            Ok(child) => println!("Child exited with status {}", child.status),
        }
    }

    Ok(())
}

A hacked up patch to keep an extra initialized_len in ReadToEnd and keep it and g.buf.len() in sync while inside of read_to_end_internal fixes this and cuts the runtime from 20s to 0.5s for me.

(A side issue from the original complaint is that ChildStdout/ChildStderr could use uninitialized buffers, which looks to be being fixed by #2525)

@Darksonn Darksonn added A-tokio Area: The main tokio crate C-bug Category: This is a bug. I-slow Problems and improvements related to performance. M-io Module: tokio/io T-performance Topic: performance and benchmarks labels May 18, 2020
@Soveu
Copy link

Soveu commented May 18, 2020

Changing tokio version from 0.2.21 to branch from PR #2525 can drastically improve performance.
In case where /dev/shm/input.txt has 2GB of data, 0.2.21 does not finish in 16 minutes, while version from this branch finishes in 4 seconds.
Edit: Platform: Fedora 32 (linux 5.6.12) x86_64

@talchas
Copy link
Author

talchas commented May 18, 2020

Yeah, that makes ChildStdout/err not initialize the buffer at all, which avoids the issue in read_to_end entirely, for this particular repro.

@taiki-e
Copy link
Member

taiki-e commented May 18, 2020

A hacked up patch to keep an extra initialized_len in ReadToEnd and keep it and g.buf.len() in sync while inside of read_to_end_internal fixes this and cuts the runtime from 20s to 0.5s for me.

Does this patch take into consideration for Guard to update the length of g.buf each time Pending is returned? (IIUC, otherwise, uninitialized memory may be mistakenly assumed as being initialized)

impl Drop for Guard<'_> {
fn drop(&mut self) {
unsafe {
self.buf.set_len(self.len);
}
}
}

@talchas
Copy link
Author

talchas commented May 18, 2020

It's just

diff --git a/tokio/src/io/util/read_to_end.rs b/tokio/src/io/util/read_to_end.rs
index a2cd99be..477b76e5 100644
--- a/tokio/src/io/util/read_to_end.rs
+++ b/tokio/src/io/util/read_to_end.rs
@@ -54,12 +56,15 @@ pub(super) fn read_to_end_internal<R: AsyncRead + ?Sized>(
     cx: &mut Context<'_>,
     buf: &mut Vec<u8>,
     start_len: usize,
+    init_len: &mut usize,
 ) -> Poll<io::Result<usize>> {
     let mut g = Guard {
         len: buf.len(),
         buf,
     };
     let ret;
+
+    unsafe { g.buf.set_len(*init_len); }
     loop {
         if g.len == g.buf.len() {
             unsafe {
@@ -70,6 +75,7 @@ pub(super) fn read_to_end_internal<R: AsyncRead + ?Sized>(
                 let b = &mut *(&mut g.buf[g.len..] as *mut [u8] as *mut [MaybeUninit<u8>]);
 
                 rd.prepare_uninitialized_buffer(b);
+                *init_len = g.buf.len();
             }
         }

ie saving/restoring the buf.len() state as well, but only within the Guard region. Another way would be to assume + ensure that len()..capacity() has always been prepare_uninitialize_buffer()ed by also putting something in read_to_end() and read_to_string(), in which case read_to_end_internal could just assume that init_len was capacity().

@taiki-e
Copy link
Member

taiki-e commented May 19, 2020

@talchas Thanks. Looks like it's working correctly for the issue I was concerned about.

Also, it seems that actually using in tokio will require some minor fixes.

EDIT: The second thing is only needed for ReadToString that uses read_to_end_internal and replaces buffer with empty Vec after completion.

@talchas
Copy link
Author

talchas commented May 19, 2020

For the first, yeah; for the second I think you're fine because ReadToEnd holds a borrow on the input buffer, and having returned success once shouldn't change anything in terms of what's initialized.

@taiki-e
Copy link
Member

taiki-e commented May 19, 2020

for the second I think you're fine because ReadToEnd holds a borrow on the input buffer, and having returned success once shouldn't change anything in terms of what's initialized.

Oops, sorry. The second thing is only needed for ReadToString that uses read_to_end_internal and replaces buffer with empty Vec after completion.

@talchas
Copy link
Author

talchas commented May 19, 2020

Ah yes, ReadToString would have to be careful there.

Darksonn added a commit to Darksonn/tokio that referenced this issue May 21, 2020
The new implementation changes the behaviour such that set_len is called
after poll_read. The motivation of this change is that it makes it much
more obvious that a rouge panic wont give the caller access to a vector
containing exposed uninitialized memory. The new implementation also
makes sure to not zero memory twice.

Additionally it makes the various implementations more consistent with
each other regarding naming of variables, and whether we store how many
bytes we have read, or how many were in the container originally.

Fixes: tokio-rs#2544
@Darksonn
Copy link
Contributor

This will be fixed by #2541.

Darksonn added a commit to Darksonn/tokio that referenced this issue May 25, 2020
The new implementation changes the behaviour such that set_len is called
after poll_read. The motivation of this change is that it makes it much
more obvious that a rouge panic wont give the caller access to a vector
containing exposed uninitialized memory. The new implementation also
makes sure to not zero memory twice.

Additionally it makes the various implementations more consistent with
each other regarding naming of variables, and whether we store how many
bytes we have read, or how many were in the container originally.

Fixes: tokio-rs#2544
carllerche pushed a commit that referenced this issue Jul 28, 2020
The new implementation changes the behavior such that set_len is called
after poll_read. The motivation of this change is that it makes it much
more obvious that a rouge panic won't give the caller access to a vector
containing exposed uninitialized memory. The new implementation also
makes sure to not zero memory twice.

Additionally, it makes the various implementations more consistent with
each other regarding the naming of variables, and whether we store how many
bytes we have read, or how many were in the container originally.

Fixes: #2544
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate C-bug Category: This is a bug. I-slow Problems and improvements related to performance. M-io Module: tokio/io T-performance Topic: performance and benchmarks
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants