Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Does I/O safety forbid duping arbitrary file descriptors? #114167

Closed
RalfJung opened this issue Jul 28, 2023 · 21 comments
Closed

Does I/O safety forbid duping arbitrary file descriptors? #114167

RalfJung opened this issue Jul 28, 2023 · 21 comments
Labels
A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@RalfJung
Copy link
Member

RalfJung commented Jul 28, 2023

I/O safety is summarized as

a guarantee that if one part of a program holds a raw handle privately, other parts cannot access it

I would argue then that if a function is written foo(fd: OwnedFd), then it should be guaranteed that it is the only one with read/write access to the underlying resource managed by this FD. (Of course if it is a file, others might be able to open the file again, but not all FDs are for files that have a name in the file system.) And this guarantee almost holds, except for OwnedFd::try_clone and BorrowedFd::try_clone_to_owned. These will take a mere reference and return a fully owned OwnedFd (via dup), so our foo function has been foo'led and it is not actually exclusively reading/writing the underlying resource.

I am somewhat surprised to discover this now; these functions were not mentioned in the original RFC. They fundamentally change what it means to hold an OwnedFd, and I think not for the better. They are incompatible with the mental model I used when helping design and defend I/O safety.

So, what shall we do about this?

  • Document the actually remaining guarantees for OwnedFd carefully. They are very weak (basically just "nobody will close this FD while you have it").
  • Attempt to change the past and deprecate the safety of these methods.

Also, we should clarify what this means for I/O safety overall. Would it be sound to do dup(rand()) and return the result as an OwnedFd? I think the answer should be "absolutely not"; if I create an FD and don't share it with anyone I should be guaranteed that no duplicates exist. This would at least make it possible for a user crate to define NonDupOwnedFd/NonDupBorrowedFd types that guarantee an absence of duplicates.

Cc @sunfishcode

@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Jul 28, 2023
@RalfJung
Copy link
Member Author

This is @rust-lang/libs-api I guess (certainly the part about potentially deprecating safe try_clone), but note that some of this is an ecosystem-wide decision, so I'll also add the T-lang label.

@RalfJung RalfJung added T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Jul 28, 2023
@sunfishcode
Copy link
Member

Even without try_clone in Rust's own API, this kind of exclusivity is undermined by the possibility that someone could call OwnedFd::from_raw_fd on a raw fd that has already been duped.

This might suggest adding a rule to from_raw_fd requiring that there exist no dups, however that's sometimes difficult to know. For example, if a process receives a fd from its parent when it is spawned, the parent could still be holding onto what is effectively a dup of the fd, and there's no way the child can tell.

Further, this kind of exclusivity is undermined by the fact that explicit dup isn't the only API in play. mmap performs what is effectively a dup, because it's guaranteed that you can mmap a fd and immediately close the fd, and the mapped memory can continue to access the data referenced by the fd. fork performs what is effectively a dup on all non-O_CLOEXEC fds into the child. sendmsg sending a fd to another process over a Unix-domain socket performs what is effectively a dup into the destination process.

And, on some platforms, including macOS, it's not even possible to set O_CLOEXEC atomically on all fds, meaning an ill-timed fork on another thread could implicitly dup fds before O_CLOEXEC can be set, and there's nothing that can be done to prevent this.

This is all in addition to the fact that files can be opened multiple times, potentially even by different names, from the same program or different programs.

There are so many ways to undermine this kind of exclusivity, that it would be tricky and limiting to maintain, even without try_clone in Rust's own API.

From another angle, it also happens that this kind of exclusivity is much less useful for fds than it is for memory. In Rust we're all accustomed to thinking about exclusive ownership. However, when working with fds, this kind of exclusivity is almost never what matters for avoiding UB. Among other things, OS's implicitly perform thread synchronization under the covers, so there's never a data race on a fd.

There are a few situations where this kind of exclusivity would matter. For example, if one does memfd_create and then mmap and then slice::from_raw_parts, one may be relying in keeping the fd encapsulated and not allowing anyone to write to it. However, in those cases, one usually shouldn't implement AsFd at all, since BorrowedFd as a type doesn't say that you can't write to it. And if a fd is held in a type that doesn't implement AsFd (or AsRawFd), I/O safety says that, with care, it can be fully encapsulated, so that no one can dup it without the type's knowledge.

Consequently, this kind of exclusivity guarantee at the OwnedFd level isn't that useful in practice.

Also, we should clarify what this means for I/O safety overall. Would it be sound to do dup(rand()) and return the result as an OwnedFd? I think the answer should be "absolutely not"; if I create an FD and don't share it with anyone I should be guaranteed that no duplicates exist. This would at least make it possible for a user crate to define NonDupOwnedFd/NonDupBorrowedFd types that guarantee an absence of duplicates.

I agree. dup under I/O safety conceptually has an argument with type BorrowedFd. If you're calling it in libc, where it's just a c_int, then when you type the unsafe { } needed to call it, your safety obligation is to provide a fd that could be passed to BorrowedFd::borrow_raw at that point. Consequently, dup(rand()) would violate I/O safety, because a random integer value isn't guaranteed to be the value of an open fd.

And yes, you can create NonDupOwnedFd/NonDupBorrowedFd if you encapsulate your fds and take appropriate care. To prevent them from being dupd, those types should not implement AsFd or AsRawFd.

@bjorn3
Copy link
Member

bjorn3 commented Jul 28, 2023

I think exclusivity over the FD itself is important so nobody else can close it and dup2 something else over it, but exclusivity over the underlying file description pointed to by the FD (which stays the same in case of dup) is not as important.

@sunfishcode
Copy link
Member

dup2's second argument is special and we have documentation about that very issue.

@RalfJung
Copy link
Member Author

Even without try_clone in Rust's own API, this kind of exclusivity is undermined by the possibility that someone could call OwnedFd::from_raw_fd on a raw fd that has already been duped.

Yes I would have expected that to be unsound.

Further, this kind of exclusivity is undermined by the fact that explicit dup isn't the only API in play. mmap performs what is effectively a dup, because it's guaranteed that you can mmap a fd and immediately close the fd, and the mapped memory can continue to access the data referenced by the fd. fork performs what is effectively a dup on all non-O_CLOEXEC fds into the child. sendmsg sending a fd to another process over a Unix-domain socket performs what is effectively a dup into the destination process.

Fork is terribly unsafe already anyway.

But the others are valid point... I guess you would want to still get "close-on-drop" behavior for them.

And yes, you can create NonDupOwnedFd/NonDupBorrowedFd if you encapsulate your fds and take appropriate care. To prevent them from being dupd, those types should not implement AsFd or AsRawFd.

Okay, so "exclusively owning a non-duped FD" is a thing we accept one can express, but we don't provide a type for that. I think I can live with that. It needs documentation though.

@joshtriplett
Copy link
Member

Confirming that @sunfishcode's description is what I'd always expected from I/O safety as well, since it was first proposed. The safety there is about not using closed FDs or random numbers as FDs, and about preventing double-close errors, not about preventing you from calling dup.

@workingjubilee workingjubilee added the A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` label Jul 29, 2023
@WaffleLapkin WaffleLapkin removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Jul 30, 2023
@RalfJung RalfJung changed the title I/O safety is partially undermined by OwnedFd::try_clone Does I/O safety forbid duping arbitrary file descriptors? Aug 13, 2023
@RalfJung
Copy link
Member Author

RalfJung commented Aug 14, 2023

I renamed the issue to reflect that indeed OwnedFd does not mean "this FD has no duplicates" -- but the question is whether it is possible for a user to define a type which means "this FD has no duplicates". The standard library never had a type like that and the RFC was rather vague on the topic so I don't think this is clear.

I'd very much like us to say that yes this is possible. However, I see two problems with this. The first is a spec/doc problem: we cannot just say "dup(rand()) is invalid", this needs to follow from some underlying principle which ensures that we coherently make all the right things invalid. After all, what we want is to give people proper "file description" encapsulation. Just saying "dup(rand()) is invalid" does not provide file description encapsulation, there are tons of things code could do that would violate encapsulation and we'd never be sure to actually list all of them. I think we need to introduce a notion of "file description ownership" as I sketched here.

The second problem is a practical one: cc::Build::new() (just to pick one example) writes to file descriptors that it obtains from an environment variable. If we have a program that relies on file description encapsulation for soundness, this means that I can cause undefined behavior by calling that program with an environment variable that just picks a random integer for the FD, if I later get unlucky and memfd_create picks exactly that number. Are we okay with that? @sunfishcode wrote about these APIs

Yes, this is a problem, but I'm working on a PR to fix it, by adding some wording to FromRawFd::from_raw_fd's safety condition that says that if you do eg. OwnedFd::from_raw_fd(some_fd_from_the_env_or_whatever), it's up to you to do global reasoning.

I don't see how that solves anything. How am I supposed to do that global reasoning? It's not possible for cc::Build::new() to actually ensure this is the case. I think if we want file description encapsulation we need to choose between these two options:

  • cc::Build::new() is unsound and should be marked unsafe, or
  • we are okay with saying that Rust code may have Undefined Behavior when an environment variable has the wrong value (basically, cc::Build::new discharges its safety obligation by delegating it to the environment that the program is started in, and also assumes that the environment is immutable or at least everyone calling set_var is required [in a it's-UB-if-you-don't way] to uphold all invariant associated with that variable)

The former is impractical, since it's not like the caller is in any better of a position than cc::Build::new itself to ensure that this FD is not used for encapsulation. And the latter is rather unsettling. @sunfishcode I wonder which of these you had in mind, or if you have some idea that could avoid this conundrum?

@the8472
Copy link
Member

the8472 commented Aug 14, 2023

How am I supposed to do that global reasoning? It's not possible for cc::Build::new() to actually ensure this is the case.

It's like building a safe API around something like ptr::from_exposed_addr(0x3FFC_0000) on an embedded system. Something tries to claim ownership of a wild resource and relies on nothing else in the program doing the same. That can only be achieved by either trusting your dependencies to be well-behaved or auditing everything.

I don't think the environment telling you the address fd number changes much, in the end you still have to assume that it was previously unclaimed and you can take ownership.

@RalfJung
Copy link
Member Author

Yes, that is a good analogy.

The main difference is that for the FD case, we don't have to accept that this may lead to UB. We have the alternative of saying that no code may rely on its file descriptor remaining encapsulated.

@sunfishcode
Copy link
Member

The former is impractical,

Agreed.

And the latter is rather unsettling.

Yes it is, but we already have a problem here, whether we understand it in these terms or not.

Even before we bring I/O safety into the picture, can we rigorously charactarize the behavior of cc::Build::new in the presence of hostile environment variables, using only things that cc::Build::new can be publicly assumed to do or not do? I don't think it's possible.

So I propose that we do accept that this may lead to UB.

@RalfJung
Copy link
Member Author

Even before we bring I/O safety into the picture, can we rigorously charactarize the behavior of cc::Build::new in the presence of hostile environment variables, using only things that cc::Build::new can be publicly assumed to do or not do? I don't think it's possible.

If we say that no code must rely on FD encapsulation, then we can at least rigorously characterize this case as "not UB". That's worth a lot -- it is all that the Rust type system ever guarantees.

Put differently, in such a world we could offer a safe function that turns any integer into BorrowedFd<'static>, and then cc could be written using that function.

@the8472
Copy link
Member

the8472 commented Aug 14, 2023

we don't have to accept that this may lead to UB

Well, that'd just be shifting the burden to any code that trusts the environment to not corrupt data that is temporarily moved from memory into IO. It doesn't even require mmap. Just writing a repr(C) struct into a pipe and assuming it'll arrive on the other side intact already requires this kind of exclusive ownership. A library that trusts the environment might check a header to make sure it's talking to the expected partner and after that ingest the data without further validation.

If we assume that any IO can get corrupted then all IO code must be written as if we'd be talking to an untrusted TCP stream. The necessary validation can come with significant overhead because it can require looking every single byte (e.g. validating all utf8-sequences) or checking more complex invariants.

@RalfJung
Copy link
Member Author

RalfJung commented Aug 14, 2023

Parsing in Rust usually happens with crates that guarantee non-UB even for mal-formed inputs. If you trust your inputs that far that you are okay with causing UB, then clearly you should be using unsafe and this should be documented prominently. Without references I am not going to believe the claim that this is common practice in Rust codebases.

But someone using a library such as cc (which is widely used in build scripts) would never use unsafe and hence be probably completely unaware to what extent they are trusting their environment (and everyone who has access to set_var).

@the8472
Copy link
Member

the8472 commented Aug 14, 2023

Without references I am not going to believe the claim that this is common practice in Rust codebases.

Here we do some sanity check but don't revalidate all the bytes:

#[inline]
fn read_str(&mut self) -> &str {
let len = self.read_usize();
let bytes = self.read_raw_bytes(len + 1);
assert!(bytes[len] == STR_SENTINEL);
unsafe { std::str::from_utf8_unchecked(&bytes[..len]) }
}

ser_raw seems even less defensive. I've only glanced over the docs and looked at the types that implement it. It supports char, so there's validity that it just assumes to be upheld..

The general pattern here is that a process is trusting another instance of itself. It may do a version check to guard against format changes but after that the data isn't treated like coming from an external entity that could be malicious.

And then there's mmap. E.g. io_uring can communicate pointers through its shared memory map. If something can steal the uring file descriptor and establish another map then it can of course mess with the pointers you get back from the kernel. But normally you'd trust those values because you put them yourself in the queue in the first place.
And this isn't an aliasing issue. Even if you copied out the data from the memory map with atomics you'd still be getting bogus values.

@RalfJung
Copy link
Member Author

Here we do some sanity check but don't revalidate all the bytes:

Fair, so we even do this in rustc. I wasn't aware. Thanks for pointing this out.

And then there's mmap. E.g. io_uring can communicate pointers through its shared memory map. If something can steal the uring file descriptor and establish another map then it can of course mess with the pointers you get back from the kernel. But normally you'd trust those values because you put them yourself in the queue in the first place.
And this isn't an aliasing issue. Even if you copied out the data from the memory map with atomics you'd still be getting bogus values.

So I guess that's another way in which a program with a bad jobserver env var can go wrong, if cc then accidentally accesses a uring FD. (Though that seems unlikely to cause valid uring requests.)

What can I say, I am torn. I want to have my cake and eat it, too, but I cannot. ;) Deciding between "a bad jobserver env var can cause UB" and "code can rely on FD encapsulation for soundness" is not trivial.

@ChrisDenton
Copy link
Member

A jobserver seems to be a special case tho. An unfortunate one to be sure but if a choice has to be made then I think it's better to document OS specific weirdness than to make fds nearly impossible to reason about.

@the8472
Copy link
Member

the8472 commented Aug 14, 2023

Jobserver isn't that special. After all std claims FDs 0/1/2 by convention (instead of configuration). And systemd can pass additional descriptors in its socket activation protocol which is also configured through environment variables.

Though there's a safer way to use the jobserver protocol, I don't know why we're not using that by default.

@Manishearth
Copy link
Member

What can I say, I am torn. I want to have my cake and eat it, too, but I cannot

I still think that the right path forward that is invariant over cake fate is to declare these things outside of Rust's model: we attempt to handle file descriptor ownership (who can close an fd) but in general doing weird things to arbitrary fds is an OS problem and still a potential source of unsoundness, but not one Rust attempts to reason about.

@bjorn3
Copy link
Member

bjorn3 commented Aug 14, 2023

After all std claims FDs 0/1/2 by convention (instead of configuration).

The libstd initialization code ensures that 0/1/2 exist and if not opens them as /dev/null to ensure opening a file doesn't accidentally cause 0, 1 or 2 to be filled with something that expected exclusive ownership. In addition it doesn't allow configuration like jobserver does. It is always 0/1/2.

@the8472
Copy link
Member

the8472 commented Aug 14, 2023

I was just showing that inheriting and claiming resources from the environment is an existing pattern just like creating and maintaining exclusive access to resources is also a pattern and we'll have to deal with both of those uses.

bors added a commit to rust-lang-ci/rust that referenced this issue Sep 22, 2023
add more explicit I/O safety documentation

Fixes rust-lang/unsafe-code-guidelines#434
Cc rust-lang#114167
Cc `@Manishearth` `@sunfishcode` `@joshtriplett`
@RalfJung
Copy link
Member Author

#114780 has landed, so the docs now state fairly clearly that it's not okay to dup (or write to, or do anything) with arbitrary file descriptors. This technically puts the jobserver crate in violation of I/O safety. But that should probably have a separate issue, so I'm closing this one in favor of a new issue with a more pointed description: #116059.

github-actions bot pushed a commit to rust-lang/miri that referenced this issue Sep 23, 2023
add more explicit I/O safety documentation

Fixes rust-lang/unsafe-code-guidelines#434
Cc rust-lang/rust#114167
Cc `@Manishearth` `@sunfishcode` `@joshtriplett`
lnicola pushed a commit to lnicola/rust-analyzer that referenced this issue Apr 7, 2024
add more explicit I/O safety documentation

Fixes rust-lang/unsafe-code-guidelines#434
Cc rust-lang/rust#114167
Cc `@Manishearth` `@sunfishcode` `@joshtriplett`
RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this issue Apr 27, 2024
add more explicit I/O safety documentation

Fixes rust-lang/unsafe-code-guidelines#434
Cc rust-lang/rust#114167
Cc `@Manishearth` `@sunfishcode` `@joshtriplett`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

10 participants