Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Questions about WasiDir #5025

Open
npmccallum opened this issue Oct 6, 2022 · 11 comments
Open

Questions about WasiDir #5025

npmccallum opened this issue Oct 6, 2022 · 11 comments

Comments

@npmccallum
Copy link
Member

There are a number of aspects to WasiDir that are unclear to me. For completeness, I'll paste the full definition here and put questions in comments.

#[wiggle::async_trait]
pub trait WasiDir: Send + Sync {
    /// Does path contain segments? Should we follow symlinks? Should we create intermediate dirs?
    async fn create_dir(&self, path: &str) -> Result<(), Error>;

    /// Does path contain segments? Should we follow symlinks?
    async fn symlink(&self, old_path: &str, new_path: &str) -> Result<(), Error>;

    /// Does path contain segments? Should we follow symlinks?
    async fn remove_dir(&self, path: &str) -> Result<(), Error>;
    
    /// Does path contain segments? Should we follow symlinks?
    async fn unlink_file(&self, path: &str) -> Result<(), Error>;
    
    /// Does path contain segments? Should we follow symlinks?
    async fn read_link(&self, path: &str) -> Result<PathBuf, Error>;

    /// Does path/dest_path contain segments? Should we follow symlinks?
    async fn rename(
        &self,
        path: &str,
        dest_dir: &dyn WasiDir,
        dest_path: &str,
    ) -> Result<(), Error>;

    /// Does path/target_path contain segments? Should we follow symlinks?
    async fn hard_link(
        &self,
        path: &str,
        target_dir: &dyn WasiDir,
        target_path: &str,
    ) -> Result<(), Error>;

    /// There does not appear to be any way to set the times on the directory itself. Is this by design?
    async fn set_times(
        &self,
        path: &str,
        atime: Option<SystemTimeSpec>,
        mtime: Option<SystemTimeSpec>,
        follow_symlinks: bool,
    ) -> Result<(), Error>;
}
@sunfishcode
Copy link
Member

All the path arguments here can contain multiple path components. Symlinks are are always followed in all path components other than the last component.

create_dir, remove_dir, symlink, unlink_file, read_link, and rename do not follow symlinks in the last component .

hard_link also does not follow symlinks. Its POSIX counterpart linkat does accept an AT_FOLLOW_SYMLINK flag, but that flag isn't supported on all popular platforms, so it isn't exposed here.

create_dir does not create intermediate directories.

One can set the times on the directory itself by calling set_times with a path of .. There isn't any fundamental reason why there isn't an equivalent of a futimens which works without a path in WasiDir; it just hasn't come up so far.

@npmccallum
Copy link
Member Author

@sunfishcode When are we expected to handle . and ..? This is very important to document because of the security implications.

@sunfishcode
Copy link
Member

WasiDir is just derived from wasi-filesystem semantics. . and .. are supported in all these functions, with the limitation that .. shouldn't be able to resolve to a directory outside the self directory. So foo/.. is ok (assuming foo is not a symlink to ..), but ../foo is not.

That said, wasi-filesystem itself doesn't document many of these details yet. I intend to add documentation, but haven't gotten to it yet. For now, I can answer questions, and if anyone would like to help with the documentation I can mentor.

@npmccallum
Copy link
Member Author

npmccallum commented Oct 6, 2022

We plan to submit a documentation PR.

That limitation makes sense. However, it should be noted that there is a difference between foo/.. and .; the first requires permissions on foo to evaluate and the second does not. Again, there are security implications here.

Consider the situation where we pass in two file descriptors to the wasi instance:

  • /
  • /foo/bar

Additionally, let's presume that / is empty.

What happens in the following scenarios from the perspective of the wasm application?

  1. open("/") - expectation: an fd pointing to '/'
  2. open("/..") - expectation: an error
  3. open("/foo/bar") - expectation: an fd pointing to '/foo/bar'
  4. open("/foo/bar/..") - expectation: an error

If my expectations are correct, then cases 2, 3 and 4 differ in surprising ways from all existing operating systems.

@bjorn3
Copy link
Contributor

bjorn3 commented Oct 6, 2022

open("/") - expectation: an fd pointing to '/'

That should be an error as opening / on any dir fd would open the root directory, rather than the dir that was passed in. Note that wasi doesn't have an open equivalent. Only an openat equivalent. So it would be openat(some_fd, "/") on your example.

@npmccallum
Copy link
Member Author

@bjorn3 wasi-libc synthesizes open() in the way I've described above.

@bjorn3
Copy link
Contributor

bjorn3 commented Oct 6, 2022

Right, I thought you were referring to the syscall level.

@bjorn3
Copy link
Contributor

bjorn3 commented Oct 6, 2022

  1. open("/foo/bar") - expectation: an fd pointing to '/foo/bar'

This matches unix systems. By having a pre-opened fd pointing to /foo/bar, you have effectively done a bind mount of /foo/bar to the directory of this fd. As for 2 and 4 I agree that it is surprising. 4 should be solvable by trying to resolve .. on the path string before doing any wasi call if wasi-libc doesn't already do this.

@npmccallum
Copy link
Member Author

@bjorn3 It doesn't match unix systems.

Passing a file descriptor, under the current implementation with wasi-libc roughly approximates the mount facility combined with filesystem namespaces. The difference is that on a unix system you can only mount a file system on a directory that already exists. Under wasi-libc, however, you can mount a directory anywhere in a theoretical hierarchy that doesn't actually exist.

Under wasi-libc if you did ls /foo you'd get an error but ls /foo/bar would succeed. This doesn't match unix behavior.

@sunfishcode
Copy link
Member

If my expectations are correct, then cases 2, 3 and 4 differ in surprising ways from all existing operating systems.

All of those expectations are currently met by wasi-libc's preopen system. There is some design tension between the goals of compatibility, the goals of leveraging WebAssembly's unique strengths, especially in light of the finite development resources for wasi-libc.

wasi-libc's underlying observation here is that even outside of WASI, applications generally can't assume that they know the relative positions in the filesystem namespace of the various paths you provide to them. Even if two paths have the same prefix, it's not safe for applications to notice this and switch to using .. paths to get from one to the other, because that might not follow the same symlinks as the original path provided from the user. Consequently, wasi libc can use a very simple system which takes a path, decides which preopened resource it belongs to, and then assume that the remainder of the path resolves within that resource.

If you have specific use cases which need to do this kind of hopping between different resources using .., or which need /.. to behave like it does in Unix, it'd be interesting to talk about them. If you're interested in writing libc code which implements a full mount tree, that'd be interesting to talk about too.

@npmccallum
Copy link
Member Author

npmccallum commented Oct 6, 2022

To be clear, I well understand that absolute compatibility with POSIX is not a design goal. I am, however, trying to understand where the surprises happen and why.

We do have the need to synthesize a wide variety of filesystem-like things. Keep in mind that our implementation cannot trust the host and therefore merely forwarding requests to the host isn't exactly an option. This means we have to implement things like filesystem encryption as a vfs.

We're trying to design that VFS layer now. And once we have something to show, I'd like to find ways to contribute all or parts of it to wasmtime since I think it is generally useful. More details to come.

I think our first step is gathering all this information and documenting the WasiDir and WasiFile interfaces.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants