-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add more explicit I/O safety documentation #114780
Conversation
(rustbot has picked a reviewer for you, use r? to override) |
One interesting point is that the original RFC does not seem to have gone through T-lang approval, even though this is clearly a T-lang question (since it affects all Rust code that links against the standard library, not just the standard library API surface):
So... maybe let's use this PR to have FCP for both teams? |
See also: |
Ah that is where this was hidden... d'oh. I was looking for I/O safety docs in the standard library and could not find any. |
I added some cross-referencing. It still seems worth documenting the notion of I/O safety under its name somewhere outside |
I'm going to nominate this for the libs-api meeting as there has been much discussion (and some confusion) around I/O safety and its impact on the Rust ecosystem. Also it may be worth discussing Ralf's assertion that the original RFC should have had T-lang approval in addition to T-libs-api and how that reasoning might apply to future RFCs. |
This assertion is based on the fact that I/O safety rules out code that doesn't use the standard library API at all, but just uses fn not_okay(i: i32) {
// SAFETY: `close` is never UB, worst-case the FD does not exist and we get EBADF
unsafe { close(i); }
} Such functions actually exist in the wild, and (in my understanding) it was the intention of the RFC to say that as-is these are unsound. That definitely sounds like a T-lang question to me. Also see rust-lang/unsafe-code-guidelines#434, where part of the question was "can the standard library just decide some rules that apply to code that doesn't use (those parts of) the standard library" -- to which IMO the answer is "no"; those are ecosystem-wide questions of the kind typically considered by T-lang. |
library/std/src/io/mod.rs
Outdated
//! | ||
//! To uphold I/O safety, it is crucial that no code closes file descriptors it does not own. In | ||
//! other words, a safe function that takes a regular integer, treats it as a file descriptor, and | ||
//! closes it, is *unsound*. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My intent with I/O safety is a stronger rule than just "only the owner is allowed to close". I also intend "and no one can do any operation, including read, write, or anything else, with a file descriptor that is closed".
Without this stronger rule, it would be impossible to ever encapsulate a file description, because things like dup(rand())
would become permitted, and that would preclude some useful use cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this stronger rule, it would be impossible to ever encapsulate a file description, because things like dup(rand()) would become permitted, and that would preclude some useful use cases.
Given that I can dup
a file descriptor that I have merely borrowed, most of these usecases will already not work -- at least they won't work with OwnedFd
. I was hoping we'd have that "you cannot write an FD that you do not own", but we do not have that guarantee with OwnedFd
and it is unclear if there is any way to protect an FD against being written-to by other parts of the code.
The OS will reliably return EBADFD on any operation if the FD is closed, so there's not really any point in disallowing this that I can see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
File description encapsulation requires encapsulating the OwnedFd
and not implementing AsFd
or handing out a BorrowedFd
. So it takes some deliberate doing, but it is doable, as long as we disallow things like dup(rand())
and read(rand(), ...)
and so on.
The OS is documented to return EBADF
, however any time you get that error on a closed fd, it only means you got lucky that no code, perhaps on another thread racing with you, did eg. File::open
and reused the fd number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(let's merge this with the other thread about file object encapsulation)
library/std/src/os/unix/io/mod.rs
Outdated
@@ -6,7 +6,7 @@ | |||
//! | |||
//! This module provides three types for representing file descriptors, | |||
//! with different ownership properties: raw, borrowed, and owned, which are | |||
//! analogous to types used for representing pointers: | |||
//! analogous to types used for representing pointers. These types realize the Unix version of [I/O safety]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"realize" might suggest that these types are the extent of I/O safety. However, the intent is that the rules be respected even for code that calls FFI functions using file descriptors, without using these types at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can make it say "reflect" instead, which is also the verb I used in the io
docs ("The platform-specific parts of the Rust standard library expose types that reflect these concepts")
This comment was marked as off-topic.
This comment was marked as off-topic.
//! protected against reads or writes from other parts of the program. Whether that is sound is | ||
//! [currently unclear](https://github.com/rust-lang/rust/issues/114167). Certainly, `OwnedFd` as a | ||
//! type does not provide any promise that the underlying file descriptor has not been cloned. | ||
//! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought: can we also list some of the risks of violating IO safety?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ideally, exhaustively if possible
roughly: if you are accessing an fd after it has been closed, it might have been reallocated to belong to someone else, which might be the allocator, or memory mapping libraries, or something else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a paragraph. However, without full file description encapsulation, I am not sure if there even is an example where this can truly lead to UB?
Thanks, this overall looks much better! |
This comment was marked as off-topic.
This comment was marked as off-topic.
//! like `Arc` and [`BorrowedFd<'a>`] is like `&'a Arc` (and similar for the Windows types). There | ||
//! is no equivalent to `Box` for file descriptors in the standard library (that would be a type | ||
//! that guarantees that the reference count is `1`). | ||
//! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
//! | |
//! | |
//! Note that functions like `pidfd_getfd` or debugging interfaces are out of scope. | |
//! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably say a bit about what that means, too, similar to https://doc.rust-lang.org/nightly/std/os/unix/io/index.html#procselfmem-and-similar-os-features? Basically we are saying Rust programs must not abuse these APIs, so "out of scope" is not a free pass to use them to circumvent everything. They can be used for debugging/tracing only.
Co-authored-by: Dan Gohman <dev@sunfishcode.online>
library/std/src/io/mod.rs
Outdated
//! clone and then you can no longer know what the reference count is! In that sense, [`OwnedFd`] is | ||
//! like `Arc` and [`BorrowedFd<'a>`] is like `&'a Arc` (and similar for the Windows types). There | ||
//! is no equivalent to `Box` for file descriptors in the standard library (that would be a type | ||
//! that guarantees that the reference count is `1`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding to the Arc
analogy, a sentence could be added to justify why it is unsound to use dup2
to replace a BorrowedFd
with a different file descriptor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talking about dup2
here is hard since that is so platform-specific. But really the bad thing dup2 does is close the other FD, so I think this would be an example explaining why you cannot close a BorrowedFd. I could add that (basically: given an &Arc
, you also don't get to decrement the refcount).
@Amanieu I see you removed the libs-api-nominated label... so what's the next step for this PR? |
@bors r+ |
☀️ Test successful - checks-actions |
Finished benchmarking commit (5a4e47e): comparison URL. Overall result: ❌ regressions - no action needed@rustbot label: -perf-regression Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesThis benchmark run did not return any relevant results for this metric. Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 634.614s -> 633.83s (-0.12%) |
Fixes rust-lang/unsafe-code-guidelines#434
Cc #114167
Cc @Manishearth @sunfishcode @joshtriplett