Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sanitize file/directory symlinks in path_symlinks on Windows #1359

Merged
merged 2 commits into from
Apr 1, 2020

Conversation

marmistrz
Copy link
Contributor

Based from my experience, symlink_file will not return an error if we try to create a file symlink to a directory. I used the following Rust program to verify it:

use std::env;
use std::io::Result;
use std::os::windows::fs::symlink_file;

pub(crate) fn path_symlink(src: &str, dst: &str) -> Result<()> {
    // try creating a file symlink
    symlink_file(src, dst).map_err(|e| {
        println!("ERROR: {}", e);
        e
    })
}

fn main() {
    let args: Vec<_> = env::args().collect();
    path_symlink(&args[1], &args[2]).expect("path_symlink failed");
}

If dir is an existing directory, and the binary as executed with cargo run dir target, the program will succeed and create a file symlink.

For details on file vs. directory symlinks on Windows see #1358.

@kubkon: I'm also unsure about the correctness of the following error mappings:
https://github.com/marmistrz/wasmtime/blob/1c61210025c00f0d397be86d752717ab20642f53/crates/wasi-common/src/sys/windows/hostcalls_impl/fs.rs#L527-L536
In which cases are they expected to be encountered? What's the scenario they're there for?

@github-actions github-actions bot added the wasi Issues pertaining to WASI label Mar 18, 2020
@marmistrz
Copy link
Contributor Author

marmistrz commented Mar 19, 2020

Do you have any idea if we can add any regression test here? We'd probably need to invoke some low-level Windows API, I think.

@marmistrz
Copy link
Contributor Author

@sunfishcode @kubkon ping, can you please review the change?

Comment on lines +431 to +435
let res = if use_dir_symlink {
symlink_dir(&old_path, &new_path)
} else {
symlink_file(&old_path, &new_path)
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'm somewhat confused here. Why do we need this check? Or put it another way, if we're doing this check already here, is creating a dir symlink on error ERROR_NOT_A_REPARSE_POINT still needed? Is the current approach of try and fail and try something else not working?

Some context on why we've originally had symlink_dir call after a failing symlink_file call. As @sunfishcode explained a long time ago, from atomicity point of view, it seems virtually always better to try and fail and then correct, rather than invoke another syscall (in this case via fs::metadata) which would check what the resource (in this case the symlink) is since in between the check and actual call to symlink_file etc. the path might change, or whatnot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was unable to encounter ERROR_NOT_A_REPARSE_POINT while using symlink_file with the symlink target being a directory. As far as I have tested, the current approach of try and fail doesn't work and a file symlink will always be created.

I agree that try and fail would be preferable, but it appears that CreateSymbolicLinkA doesn't check if the symlink target is a file or directory. I don't know if ERROR_NOT_A_REPARSE_POINT may be returned by CreateSymbolicLinkA at all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, sounds like a bug then. I thought that ERROR_NOT_A_REPARSE_POINT played some role in this, but alas I cannot remember the exact test conditions now. Anyhow, I thought it was covered by our test programs, but from what I understand, you've proved that they were not properly testing this behaviour on Windows? Would the test work fine if we got rid ERROR_NOT_A_REPARSE_POINT match altogether? And would they still work without both the match and your upfront check?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, and here's another thought. What would happen if we reversed the order of ops, i.e., started with symlink_dir, and if failed, tried symlink_file. Would that make it possible to apply the "try-catch" approach rather than an upfront check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to create a symlink to a file using symlink_dir succeeds too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, but then the symlinks are obviously invalid if the user would try to act on them in say Windows Explorer or whatnot?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Windows Explorer will show an error, claiming that the directory is invalid.

@sunfishcode
Copy link
Member

The patch looks reasonable within the current WASI API.

What should we do in WASI itself here? One option I'm thinking about is to replace WASI's symlink function with symlink_file and symlink_directory, and moving the logic for deciding which one to use into WASI libc. That way, POSIX applications would get the same behavior, but it would allow other users to avoid "automatically pick which one or guess" if they wanted to. Thoughts?

@kubkon
Copy link
Member

kubkon commented Mar 25, 2020

The patch looks reasonable within the current WASI API.

What should we do in WASI itself here? One option I'm thinking about is to replace WASI's symlink function with symlink_file and symlink_directory, and moving the logic for deciding which one to use into WASI libc. That way, POSIX applications would get the same behavior, but it would allow other users to avoid "automatically pick which one or guess" if they wanted to. Thoughts?

FWIW, I think that this is the best way to go 👍

@github-actions
Copy link

Subscribe to Label Action

This issue or pull request has been labeled: "wasi"

Users Subscribed to "wasi"

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@marmistrz
Copy link
Contributor Author

I'm not quite sure if it's a good idea.

Such API change will make it non-trivial to compile POSIX programs to WASM and run them on Windows. Suppose with have a program written for Linux using symlink(2). Should such call be translated to symlink_dir or symlink_file? I think that a WASM counterpart of this shell script should continue to work under WASM on Windows:

touch foo
mkdir bar
ln -s bar baz # directory symlink should be inferred
ln -s foo fuz # file symlink should be inferred

Instead, we could allow an optional hint argument to path_symlink, for instance a: 0x0 - no hint, 0x1 - file, 0x2 - directory.
It's not clear what we should do if a dangling symlink without a hint is requested. We could either (a) fall back to file symlinks (b) check if the target exists and return EINVAL if it doesn't.

@kubkon
Copy link
Member

kubkon commented Mar 25, 2020

I'm not quite sure if it's a good idea.

Such API change will make it non-trivial to compile POSIX programs to WASM and run them on Windows. Suppose with have a program written for Linux using symlink(2). Should such call be translated to symlink_dir or symlink_file? I think that a WASM counterpart of this shell script should continue to work under WASM on Windows:

touch foo
mkdir bar
ln -s bar baz # directory symlink should be inferred
ln -s foo fuz # file symlink should be inferred

Hmm, I don't think that's what @sunfishcode meant. While WASI embedder would have to provide two syscalls namely symlink_file and symlink_dir, WASI libc would offer symlink(2) which would at the libc level work out which syscall to use.

@marmistrz
Copy link
Contributor Author

Hmm, I don't think that's what @sunfishcode meant. While WASI embedder would have to provide two syscalls namely symlink_file and symlink_dir, WASI libc would offer symlink(2) which would at the libc level work out which syscall to use.

Isn't wasi-libc eventually compiled to WebAssembly and doesn't it eventually become a part of the resulting wasm binary? With this solution we'd need different syscall selection logic depending on the OS used by the host executing the wasm binary.

@kubkon
Copy link
Member

kubkon commented Mar 25, 2020

Hmm, I don't think that's what @sunfishcode meant. While WASI embedder would have to provide two syscalls namely symlink_file and symlink_dir, WASI libc would offer symlink(2) which would at the libc level work out which syscall to use.

Isn't wasi-libc eventually compiled to WebAssembly and doesn't it eventually become a part of the resulting wasm binary? With this solution we'd need different syscall selection logic depending on the OS used by the host executing the wasm binary.

It is, but why would you need different logic at the WASI libc level? I assumed that the libc symlink(2) would first try delegating to symlink_file syscall implemented by the embedder, and if that fails, try symlink_dir, and if that fails, error out. Then, the host-dependent logic would be entirely encapsulated within the syscalls themselves. libc would then only rely on the return values to work out whether the call succeeded or not.

@marmistrz
Copy link
Contributor Author

It is, but why would you need different logic at the WASI libc level? I assumed that the libc symlink(2) would first try delegating to symlink_file syscall implemented by the embedder, and if that fails, try symlink_dir, and if that fails, error out.

Why should symlink_file fail if the target is a directory? I have never managed to reproduce such behavior on Windows. I assume that the libc implementation will need to check the file type on Windows, as implemented in this patch.

@kubkon
Copy link
Member

kubkon commented Mar 25, 2020

It is, but why would you need different logic at the WASI libc level? I assumed that the libc symlink(2) would first try delegating to symlink_file syscall implemented by the embedder, and if that fails, try symlink_dir, and if that fails, error out.

Why should symlink_file fail if the target is a directory? I have never managed to reproduce such behavior on Windows. I assume that the libc implementation will need to check the file type on Windows, as implemented in this patch.

Oh sorry, what I meant is that both symlink_file and symlink_dir would perform whatever checks are required to determine if creating this type of symlink on the host is valid or not for the given entities. In essence, the logic of path_symlink in this PR would be preserved but split into two separate calls. This way, libc's symlink would simpy check the return values to work out if it succeeded or not, and would be host-independent.

@marmistrz
Copy link
Contributor Author

Hmmm, but then, shouldn't we require the implementation of symlink_file on Linux to check if we actually have a file, so that the use of symlink_file is what intuition suggests?

@marmistrz
Copy link
Contributor Author

My suggestion is that we:

  • remove the possibly unnecessary checks (as done in 0d3a173), since we don't see any cases which they account for
  • merge the change as-is and create a separate issue in WebAssembly/WASI to discuss possible API changes (unless there are some changes in the code to be addressed)

@kubkon
Copy link
Member

kubkon commented Apr 1, 2020

My suggestion is that we:

* remove the possibly unnecessary checks (as done in [0d3a173](https://github.com/bytecodealliance/wasmtime/commit/0d3a17331f8b98c026805003a4973ccbf3b89696)), since we don't see any cases which they account for

* merge the change as-is and create a separate issue in WebAssembly/WASI to discuss possible API changes (unless there are some changes in the code to be addressed)

Sounds good!

@kubkon kubkon merged commit 20e7185 into bytecodealliance:master Apr 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wasi Issues pertaining to WASI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants