-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
std::io: retry write operation on ErrorKind::WouldBlock #100594
Conversation
Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any Examples of
|
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Mark-Simulacrum (or someone else) soon. Please see the contribution instructions for more information. |
If writing to tty when O_NONBLOCK bit is set and the lock to tty is held (e.g. multiple threads writing to console), the Linux kernel returns -EAGAIN. (https://elixir.bootlin.com/linux/v5.19/source/drivers/tty/tty_io.c#L952) Not accounting for -EAGAIN (i.e. ErrorKind::WouldBlock), results in a panic. Its better to retry the write as the other thread could have release the lock by then. Signed-off-by: Usama Arif <usama.arif@bytedance.com> Reviewed-by: Fam Zheng <fam.zheng@bytedance.com>
601ca6d
to
eb44672
Compare
Retry on write for EAGAIN is done in other repositories as well, like https://github.com/qemu/qemu/blob/d102b8162a1e5fe8288d4d5c01801ce6536ac2d1/chardev/char.c#L122 |
This is a bad policy in generic code that can talk to any kind of file descriptor because the file (which may also be a socket or pipe) may have been opened with O_NONBLOCK which would then result in a busy loop burning CPU cycles. The correct solution for non-blocking file descriptors is to use polling (e.g. with mio's Poll) or to switch them back into blocking mode. std::io only supports blocking IO. The qemu code seems to be about special files so it can probably make additional assumptions that generic code can't. |
Hi, Thanks for the review. Agreed, it makes sense not to make this change in generic code as it will be used by other files as well and not just stdout/stderr. I tried a simple test code in C++ and rust, to just change the stdout to O_NONBLOCK and then create 10 threads to print to stdout, C++ is able to print and finish to completion, but rust panics and crashes, so I think glibc probably has similar support for O_NONBLOCK for stdout. Just for context, in terms of a real life usecase, qemu with nested virtualization sets stdout to NONBLOCK and if there is another thread with rust running in background that prints it causes the thread to crash if the lock to tty was already held (C++ program wont crash). The commit in this PR is modifying generic code so it isnt correct, but maybe there should be something done specifically for stdout/stderr? |
I think the solution would be to set up polling as fallback in the platformspecific file-descriptor wrapping types when a wouldblock is encountered. But I think it requires a bit of discussion whether we want to support this at all. Can you open an issue describing your problem? |
Thanks, I have opened the issue #100673 |
The fact that println! can be working with any kind of fd, including O_NONBLOCK stands too. It may be passed from parent process with a O_NONBLOCK mode, or, in @uarif1 's case, a child process inherited our stdout fd and set it to O_NONBLOCK any moment later. So fixing the mode doesn't solve the problem, as it's out of control; or it could be that O_NONBLOCK is intended by the developer. For example if the main project is designed to set and use stdout in NON_BLOCKING for reasons, any linked library crate that calls To avoid unnecessary busy looping, maybe we can embed a small ppoll() in write_all() if O_NONBLOCK is set (can be tested with fcntl())? That way this function can block until the fd is ready. |
I'm going to go ahead and close this PR per #100673 (comment) - we're happy to add some documentation about not supporting this, but that's probably best in a separate PR. I don't think it makes sense for std to have ppoll or similar in write_all -- we're still papering over a fundamental difference, and having nested event loops (e.g., if you're setting up nonblocking I/O so that you can use stdin/stdout in a context with a more general event loop on poll) is also a bad idea. |
If writing to tty when O_NONBLOCK bit is set and the lock to
tty is held (e.g. multiple threads writing to console), the Linux kernel
returns -EAGAIN.
(https://elixir.bootlin.com/linux/v5.19/source/drivers/tty/tty_io.c#L952)
Not accounting for -EAGAIN (i.e. ErrorKind::WouldBlock), results in
a panic. Its better to retry the write as the other thread could have
release the lock by then.
Signed-off-by: Usama Arif usama.arif@bytedance.com