Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop probing for statx unless necessary #106661

Merged
merged 2 commits into from
Jan 15, 2023
Merged

Conversation

mjguzik
Copy link
Contributor

@mjguzik mjguzik commented Jan 10, 2023

As is the current toy program:
fn main() -> std::io::Result<()> {
use std::fs;

let metadata = fs::metadata("foo.txt")?;

assert!(!metadata.is_dir());
Ok(())

}

... observed under strace will issue:
[snip]
statx(0, NULL, AT_STATX_SYNC_AS_STAT, STATX_ALL, NULL) = -1 EFAULT (Bad address) statx(AT_FDCWD, "foo.txt", AT_STATX_SYNC_AS_STAT, STATX_ALL, {stx_mask=STATX_ALL|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=0, ...}) = 0

While statx is not necessarily always present, checking for it can be delayed to the first error condition. Said condition may very well never happen, in which case the check got avoided altogether.

Note this is still suboptimal as there still will be programs issuing it, but bulk of the problem is removed.

Tested by forbidding the syscall for the binary and observing it correctly falls back to newfstatat.

While here tidy up the commentary, in particular by denoting some problems with the current approach.

@rustbot
Copy link
Collaborator

rustbot commented Jan 10, 2023

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Mark-Simulacrum (or someone else) soon.

Please see the contribution instructions for more information.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jan 10, 2023
@rustbot
Copy link
Collaborator

rustbot commented Jan 10, 2023

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@rust-log-analyzer

This comment has been minimized.

@joboet
Copy link
Contributor

joboet commented Jan 11, 2023

This is failing CI because this miri test

// test `stat`
assert_eq!(fs::metadata("foo.txt").unwrap_err().kind(), ErrorKind::PermissionDenied);
// check that it is the right kind of `PermissionDenied`
assert_eq!(Error::last_os_error().raw_os_error(), Some(libc::EACCES));
}

checks the last OS error instead of the actual error returned by the metadata call. I'm not a team member, but that test can probably be changed in this PR.

@mjguzik
Copy link
Contributor Author

mjguzik commented Jan 11, 2023

This is failing CI because this miri test

// test `stat`
assert_eq!(fs::metadata("foo.txt").unwrap_err().kind(), ErrorKind::PermissionDenied);
// check that it is the right kind of `PermissionDenied`
assert_eq!(Error::last_os_error().raw_os_error(), Some(libc::EACCES));
}

checks the last OS error instead of the actual error returned by the metadata call. I'm not a team member, but that test can probably be changed in this PR.

thanks, I did not get an opportunity to look at the failing test yet.

Looking at how the test runs into it, I suspect the safest way forward is to restore errno from the "real" call though. Any thoughts on this? From grepping around looks like sys::os::set_errno will do the trick.

If patching the test is preferred, I have to say I don't know how to do make it test against EACCESS -- is the original errno recoverable from the value returned by fs::metadata?

I'm very new to rust :)

@joboet
Copy link
Contributor

joboet commented Jan 11, 2023

Looking at how the test runs into it, I suspect the safest way forward is to restore errno from the "real" call though. Any thoughts on this? From grepping around looks like sys::os::set_errno will do the trick.

As per the documentation, errno values are not stable, so we do not need to restore it here, I think.

If patching the test is preferred, I have to say I don't know how to do make it test against EACCESS -- is the original errno recoverable from the value returned by fs::metadata?

The error is recoverable through raw_os_error, but that should be called on the error returned by fs::metadata instead of last_os_error. So the test should look like:

// test `stat`
let err = fs::metadata("foo.txt").unwrap_err();
assert_eq!(err.kind(), ErrorKind::PermissionDenied); 
// check that it is the right kind of `PermissionDenied`
assert_eq!(err.raw_os_error(), Some(libc::EACCES)); 

I'm very new to rust :)

Welcome and thank you for your contribution 😄!

@mjguzik
Copy link
Contributor Author

mjguzik commented Jan 11, 2023

Looking at how the test runs into it, I suspect the safest way forward is to restore errno from the "real" call though. Any thoughts on this? From grepping around looks like sys::os::set_errno will do the trick.

As per the documentation, errno values are not stable, so we do not need to restore it here, I think.

If patching the test is preferred, I have to say I don't know how to do make it test against EACCESS -- is the original errno recoverable from the value returned by fs::metadata?

The error is recoverable through raw_os_error, but that should be called on the error returned by fs::metadata instead of last_os_error. So the test should look like:

// test `stat`
let err = fs::metadata("foo.txt").unwrap_err();
assert_eq!(err.kind(), ErrorKind::PermissionDenied); 
// check that it is the right kind of `PermissionDenied`
assert_eq!(err.raw_os_error(), Some(libc::EACCES)); 

Ok.

Since you effectively wrote the patch, can you submit it as a separate PR or send it somewhere so that I can apply it on my branch? Then I would I push both commits here while preserving your authorship of the change.

Alternatively if you don't care I can steal it and give you credit in the commit message.

I'm just trying to sort it out without stepping on any toes here.

@joboet
Copy link
Contributor

joboet commented Jan 11, 2023

Ok.

Since you effectively wrote the patch, can you submit it as a separate PR or send it somewhere so that I can apply it on my branch? Then I would I push both commits here while preserving your authorship of the change.

Alternatively if you don't care I can steal it and give you credit in the commit message.

I'm just trying to sort it out without stepping on any toes here.

Stealing it is fine, but thanks for asking 😉. Otherwise, you can use co-authored-by for attribution by adding this line to the commit message:

Co-authored-by: joboet <REDACTED>

The test relied on Error::last_os_error() coming from the stat call on
the passed file, but there is no guarantee this will be the case.

Instead extract errno from the error returned by the routine.

Patch de facto written by joboet.

Co-authored-by:	joboet <jonasboettiger@icloud.com>
As is the current toy program:
fn main() -> std::io::Result<()> {
    use std::fs;

    let metadata = fs::metadata("foo.txt")?;

    assert!(!metadata.is_dir());
    Ok(())
}

... observed under strace will issue:
[snip]
statx(0, NULL, AT_STATX_SYNC_AS_STAT, STATX_ALL, NULL) = -1 EFAULT (Bad address)
statx(AT_FDCWD, "foo.txt", AT_STATX_SYNC_AS_STAT, STATX_ALL, {stx_mask=STATX_ALL|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=0, ...}) = 0

While statx is not necessarily always present, checking for it can be
delayed to the first error condition. Said condition may very well never
happen, in which case the check got avoided altogether.

Note this is still suboptimal as there still will be programs issuing
it, but bulk of the problem is removed.

Tested by forbidding the syscall for the binary and observing it
correctly falls back to newfstatat.

While here tidy up the commentary, in particular by denoting some
problems with the current approach.
@rustbot
Copy link
Collaborator

rustbot commented Jan 11, 2023

The Miri subtree was changed

cc @rust-lang/miri

// check that it is the right kind of `PermissionDenied`
assert_eq!(Error::last_os_error().raw_os_error(), Some(libc::EACCES));
assert_eq!(err.raw_os_error(), Some(libc::EACCES));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this change. However, the raw_os_error docs do indicate that this does return the error from the previous stdlib call, so isn't it strange that this changed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

errno is changed to EFAULT because the statx state is checked after the initial call returned an error. Since EFAULT indicates that statx is available, the error returned is the one by the first statx call, which has a different error code (like EACCES).

@RalfJung
Copy link
Member

While statx is not necessarily always present, checking for it can be delayed to the first error condition. Said condition may very well never happen, in which case the check got avoided altogether.

This check happens once during the entire runtime of the program. Why is that so bad?

(I'm not fundamentally opposed to this PR, but the description does not give sufficient motivation IMO. It explains why we can avoid the check for some cases, but does not even attempt to explain why we should.)

@mjguzik
Copy link
Contributor Author

mjguzik commented Jan 12, 2023

While statx is not necessarily always present, checking for it can be delayed to the first error condition. Said condition may very well never happen, in which case the check got avoided altogether.

This check happens once during the entire runtime of the program. Why is that so bad?

(I'm not fundamentally opposed to this PR, but the description does not give sufficient motivation IMO. It explains why we can avoid the check for some cases, but does not even attempt to explain why we should.)

Quite frankly I thought this change would be a no-brainer -- for one syscalls are not exactly cheap these days on, most notably on amd64.

What perhaps I should have mentioned is that I intend to post another PR in the area, this time making the change to only do the probe if EPERM got received.

As indicated by comments I added in the code, stock routine has a bug which I temporarily kept and it boils down to considering the syscall unusable even when it in fact works. For example statx may transiently return ENOMEM due to memory pressure (statx -> getname_flags -> __getname ; see https://elixir.bootlin.com/linux/latest/source/fs/namei.c#L129) and with the current detection method (!= EFAULT) it will be incorrrectly considered not present.

I consider going from slurping all errors to just EPERM a separate logical step, if only for bisectability. I can stack it as another change in this PR if preferred.

@RalfJung
Copy link
Member

As indicated by comments I added in the code, stock routine has a bug

Yeah that's definitely a good catch!

for one syscalls are not exactly cheap these days on, most notably on amd64.

We're talking about a single syscall per program execution here, it is hard for me to imagine how that could be perf-relevant. OTOH the cost in extra code here is also small. So I was just wondering if there is something I missed here. Thanks for clarifying! Anyway I'm not a libs expert so we will see what the libs reviewer says.

@Mark-Simulacrum
Copy link
Member

@bors r+

I don't think the "performance" benefit is worthwhile enough by itself to motivate this, but adding the FIXMEs and enum seems like a good set of changes to me. Thanks!

@bors
Copy link
Contributor

bors commented Jan 14, 2023

📌 Commit b49aa8d has been approved by Mark-Simulacrum

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 14, 2023
@raggi
Copy link
Contributor

raggi commented Jan 14, 2023

I don't think the "performance" benefit is worthwhile enough by itself to motivate this, but adding the FIXMEs and enum seems like a good set of changes to me. Thanks!

Consider perhaps if someone was writing stat(1), as in this group of people, who are writing stat(1): https://github.com/uutils/coreutils#utilities. In such a program (and many others there), the change may in fact be material.

@RalfJung
Copy link
Member

RalfJung commented Jan 14, 2023 via email

bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 14, 2023
…iaskrgr

Rollup of 8 pull requests

Successful merges:

 - rust-lang#105526 (libcore: make result of iter::from_generator Clone)
 - rust-lang#106563 (Fix `unused_braces` on generic const expr macro call)
 - rust-lang#106661 (Stop probing for statx unless necessary)
 - rust-lang#106820 (Deprioritize fulfillment errors that come from expansions.)
 - rust-lang#106828 (rustdoc: remove `docblock` class from notable trait popover)
 - rust-lang#106849 (Allocate one less vec while parsing arrays)
 - rust-lang#106855 (rustdoc: few small cleanups)
 - rust-lang#106860 (Remove various double spaces in the libraries.)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 4313471 into rust-lang:master Jan 15, 2023
@rustbot rustbot added this to the 1.68.0 milestone Jan 15, 2023
Comment on lines +175 to +177
if STATX_SAVED_STATE.load(Ordering::Relaxed) == STATX_STATE::Present as u8 {
return Some(Err(err));
}
Copy link
Member

@the8472 the8472 Jan 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not quite correct to rely on this because syscall filters can be installed during program runtime. So the behavior can change. At that point ENOSYS could bubble up to user code because the flag now says all errors should be bubbled up.

If ENOSYS and EPERM can be reliably distinguished from other errors then updating the atomic every time such an error occurs is better, as we do for splice and sendfile
If EPERM can also convey a real error then the pre-probing behavior that existed before this PR is correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there is a real issue here.

Detection only showed up because some deployments got rust utilizing statx running on kernels which did not have it yet or otherwise did not have configuration updated to cope with it, resulting in EPERM from seccomp.

To my understanding statx is the way to stat going forward and as such disabling it on purpose is a misconfiguration of the system, on par with disabling a syscall like open.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any sane seccomp filters are deny-by-default, which means any filter list that doesn't explicitly opt into statx can end up blocking the call.
Combine that with the fact that people run outdated software (which can contain equally outdated filter lists) it is possible that you end up with a situation where stat is allowed but statx isn't.

And what happens in the wild is not really relevant for correctness. For correctness it only matters that syscall availability can change at runtime.

People do all kinds of crazy stuff and we need to be defensive.

Copy link
Contributor Author

@mjguzik mjguzik Jan 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just in case I have to stress that with and without the patch the rust binary will ultimately detect statx (un)availabilty the same way, whether the kernel does not support statx to begin with or seccomp is configured to disable it, as long as the latter happened prior to running detection code.

I have to concede it is possible someone has outdated seccomp filters, which happen to be disabled when a rust binary starts execing and only get enabled later. In such a case errors from seccomp will indeed be returned. But this was already the case with the code prior to me patching it (read: no regression on that front).

I also feel inclined to note the sendfile example does not "update the atomic every time". The code speculatively executes the syscall, presumably to avoid paying specifically for detection, and the atomic store is a one time ordeal per exec -- any new calls past that fail to get there. Well, one may nitpick a multithreaded program can have several stores, but the point stands.

If anything my patch moved statx handling closer to that spirit, but in contrast it still does not handle such a failure at an arbitrary moment.

Now, legally encountered permission problems return EACCESS. The kernel has some hairy code to support all of this and I definitely would not stack my neck on EPERM being only returned by seccomp.

All that said, as mentioned in another comment, I plan to post a followup to fix the pre-existing bugs I mentioned in FIXME comments. If it does not get hairy, I may as well throw an extra check for seccomp if EPERM is seen.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, totally agreed, it's an edge-case. But seccomp filters getting installed during the runtime of a program after doing some setup work is definitely a thing that happens in daemons that lock themselves down.

All that said, as mentioned in another comment, I plan to post a followup to fix the pre-existing bugs

👍

@RalfJung
Copy link
Member

RalfJung commented Jan 15, 2023 via email

@the8472
Copy link
Member

the8472 commented Jan 15, 2023

It's ok to precheck as a performance optimization to avoid doing an unnecessary call. But it's still necessary to handle the error and a fallback on the we-think-it's-available path when statx becomes unavailable

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants