Sort out host environment assumptions for wasm-unknown-unknown #16
Comments
cc @lukewagner, I bet you have thoughts. |
In a nutshell, the PR proposes to add a general "syscall" function which is assumed to always be imported by the wasm module. That's then leveraged to provide functionality like printing and getting the current time. |
I posted a summary comment which is probably a good way to jump in. |
@aturon Great job summarising that mega thread. Here are a few more things related specifically to this issue: In WebAssembly, all imports must be fulfilled by the host. It's not possible to have optional imports. If we assume that at some point in the future we will care about backwards compatibility, then we're only left with four choices:
For 1) some people complained about using numerical identifiers for system calls. I don't personally think this is an issue, but it's worth noting that you could store and export a mapping of indexes to names as part of the compiled wasm file if you really wanted.
For 2) the downsides are self-evident.
For 3) it places the burden of backwards compatibility on users: we will never be able to provide this generator ourselves, because this target is intended to work anywhere in any host language, and so we must carefully specify exactly what algorithm should be used to generate imports from a list of names. This generator will be complex to implement as we must use name mangling to encode the type signatures of functions that we import. Furthermore, dynamically generating imports may not be easy or even possible for all hosts.
For 4) from the point of view of libstd itself, this is technically the same solution as 1). The benefit is that it provides a more normal interface to the host language, at the cost of increased work to maintain this additional "imports" crate. |
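As a rough illustration of option 1) and the index-to-name mapping mentioned above, here is a minimal sketch in Rust. The syscall numbers, the names `SYS_WRITE` and `SYS_TIME_NOW`, the `ENOSYS` and `EINVAL` codes, and the `old_host_syscall` function are all hypothetical and not taken from the PR; the host side is written in Rust only to keep the example self-contained.

```rust
/// Hypothetical syscall indexes; the real PR may number them differently.
pub const SYS_WRITE: u32 = 0;
pub const SYS_TIME_NOW: u32 = 1;

/// Hypothetical error codes returned through the single entry point.
pub const ENOSYS: i64 = -1; // the host does not know this syscall
pub const EINVAL: i64 = -2; // the arguments were malformed

/// Index-to-name table that could be exported alongside the wasm file.
pub const SYSCALL_NAMES: &[(u32, &str)] = &[
    (SYS_WRITE, "write"),
    (SYS_TIME_NOW, "time_now"),
];

/// Look up the human-readable name for a syscall index, if one is recorded.
pub fn syscall_name(index: u32) -> Option<&'static str> {
    SYSCALL_NAMES
        .iter()
        .find(|(i, _)| *i == index)
        .map(|(_, name)| *name)
}

/// Stand-in for an older host that predates `SYS_TIME_NOW`.
pub fn old_host_syscall(index: u32, args: &[u64]) -> i64 {
    match index {
        // `args` is assumed to be [pointer, length]; pretend every byte was written.
        SYS_WRITE => match args {
            [_ptr, len] => *len as i64,
            _ => EINVAL,
        },
        // Anything this host has never heard of is reported at call time.
        _ => ENOSYS,
    }
}
```

With this shape, a module built against a newer syscall list would still instantiate on the older host, because the set of imports never changes; an unknown index only shows up as `ENOSYS` when it is actually called.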
Another thing to add is that imports may be fulfilled either via the wasm imports section, or by linking to libraries which export those functions. This means that even if we add imports to libstd, it's possible to generate a wasm file without imports by implementing them in rust. |
It is already very difficult to create small |
@fitzgen if you really want to go bare metal, wouldn't it make more sense to go |
The very idea of an It belongs in another target like |
@rpjohnst I don't think it makes sense on any target to have an implementation of Either we should say that if you decide to use Personally I don't see much value in splitting out the two targets, since one is never going to support |
That's fine, call it
I'm talking about the former. If you use std on an This is how everything else works. If you try to build a When you treat that incomplete binary as a successful build, you are placing requirements on the environment (just like, say, an installed libc) and must use a target that expresses that.
I don't think this is quite enough. If you want to use std without the standard syscall interface, you should be able to use the This all ignores the portability lint, too. We definitely want to disable parts of std when running on |
That doesn't make sense: how can using
You keep bringing up What this does conflate is a pluggable libstd with no libstd. The rationale being that you can choose between them with the In case it's not clear: the whole point of the system call interface is that we don't provide the implementation. The implementation is provided for the specific host, or could even be implemented from rust. |
In one sense, std can do plenty without any system calls whatsoever- remember it can run whole test suites as long as you don't care about printing the results. In the sense I meant, you could have your own host interface (e.g. for fitting into some existing Javascript library) and you get std running the same way you get I guess you could say I'm arguing for these syscalls to be exposed in Rust for crates to fill in if necessary, rather than forcing them to be wasm imports.
You misunderstand me. The web is just an example. I'm not arguing for The reason I brought up the portability lint is that some platforms (including the
How could it be provided from Rust if the syscalls are all wasm imports? |
Ah, this discussion makes a lot more sense now: there's no difference between wasm imports and C/Rust imports! When you generate a wasm file, any unresolved extern symbols get added as imports. That means you can satisfy any wasm import by statically linking a C or Rust file that exports the required symbol. |
That's insufficient. Perhaps another way to phrase this is that most platforms differentiate between static imports, which cause link errors if they're missing, and dynamic imports, which are resolved at runtime. WebAssembly only seems to have one kind of import, which out of necessity gets used as a dynamic import. When you ask So the scenario I'm describing is this: when you pick up a Rust program that uses std and build it for On the other hand, if you build that program for a new target like |
OK, let's assume that instead of requiring imports from the host, we require one or more symbols to be defined which libstd will import and use. Today, those two ideas have the same implementation: in the future (if your suggestion is implemented and we must be specific about wasm imports) then they may be different. Is your point simply that we should be explicit about them being C-style imports today rather than wasm imports? Whichever type of imports we define them to be, we still have to choose from the four options I mentioned above. |
Also, there's still no need to have a separate target: even if wasm imports become distinct from C-style imports and we use C-style imports in libstd, then anyone can publish a crate that re-exports the C-style imports as wasm imports. |
No, my point is about which situations generate those imports. The only way a binary should link successfully is when all its remaining imports are "dynamic"/wasm-style and either a) assumed to be provided by the environment because it's using a full target, or b) the crate has opted into it via an explicit declaration or dependency.
That re-exporting crate would be great to have, but it would need to be opt in. You couldn't just take any random binary crate (say, The reason for other targets is to let you run |
My inclination here (in both Rust and earlier in symmetric C++ discussions) is that practically nothing is elevated to a special builtin/syscall/runtime level: that anything that requires punching through to an embedding/Web API is expressed as pure Rust code that declares an import and calls it, and then how that import is satisfied is host-specific and happens outside of rustc. On the Web, the import would be satisfied with the export of an ES Module (with the 3 variants of where that ESM comes from enumerated in Lin's diagram). I could be wrong because of lack of Rust knowledge, but I think this matches what @rpjohnst is advocating as well? Considering a concrete example, printing to stdout, it seems like there would be two levels of crates here:
I don't know enough about Rust/crates to know how, but it seems like the user should be able to choose, for each top-level crate that gets built into wasm and packaged up into npm (again, diagram), which of the lower-level crates to use as the printing backend. Furthermore, I should be able to very easily write my own lower-level crate that sends stdout to who-knows-where in a few lines of Rust that call out to my custom JS and choose that one just as easily. I think this will be a pretty common thing to want to do on the Web for many of the areas of standard library functionality that, on other platforms, have a single obvious impl that we take for granted. Sorry if that was incoherent, happy to discuss more :) |
The implementation on |
I'm not sure what you're asking. Did you mean to write "the implementation on |
No, I mean |
On |
I feel like we're talking in circles: as far as libstd goes, that's no different from my original suggestion! The only change is to code generation, and is to make a distinction between wasm imports and C-style imports, which is an important consideration, but quite a separate concern from how libstd is implemented. You still have to solve the original problem with backwards compatibility that I asked in my first post: which of the four options are you going to go with:
|
I had a chat with @wycats today on this topic, which led to the following proposal as an alternative to today's syscall setup.
trait WasmHost: Send + 'static {
    fn write_stdout(data: &[u8]) -> io::Result<usize>;
    // etc.
}
static WASM_HOST: RefCell<Option<Box<WasmHost>>> = RefCell::new(None);
// call this in the `start` function
fn set_wasm_host<H: WasmHost>(host: H) {
    let old_host = mem::replace(WASM_HOST.borrow_mut(), Box::new(host));
    assert!(old_host.is_none())
}
// Now the `std` implementation can use `WASM_HOST` to dispatch its functionality,
// using `unimplemented!()` as a fallback on `None`
This would make it possible for external libraries to provide host bindings, with the caveat that the host must be manually initialized (probably within the While this isn't ideal in the long run, it would make it much easier to experiment out of tree, and would let us build up a more clear-cut picture of the interface between Personally, I'd prefer to go this route for now rather than, say, adding a separate target or otherwise try to nail down shared expectations around a built-in syscall interface. Notably, if you don't set the host, no JS imports are generated.
Once we have more experience, we can later revisit the question of standardizing some interface here. wdyt? |
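The proposal above is written as pseudocode. A compilable variant might look like the sketch below, which makes three assumptions that are not in the original comment: `Mutex` replaces `RefCell` (a `static` has to be `Sync`), `write_stdout` takes `&self` so the trait can be used as a trait object, and the `None` fallback returns an error instead of calling `unimplemented!()` so that it can be observed without a panic. `BufferHost` is a hypothetical host used only for illustration.

```rust
use std::io;
use std::sync::{Arc, Mutex};

pub trait WasmHost: Send + 'static {
    fn write_stdout(&self, data: &[u8]) -> io::Result<usize>;
    // etc.
}

// Assumption: `Mutex` instead of the `RefCell` in the proposal, so the static is `Sync`.
static WASM_HOST: Mutex<Option<Box<dyn WasmHost>>> = Mutex::new(None);

/// Install the host once, e.g. from the `start` function.
pub fn set_wasm_host<H: WasmHost>(host: H) {
    let mut slot = WASM_HOST.lock().unwrap();
    assert!(slot.is_none(), "a wasm host has already been set");
    *slot = Some(Box::new(host));
}

/// What a `std`-side dispatch could look like.
pub fn write_stdout(data: &[u8]) -> io::Result<usize> {
    let guard = WASM_HOST.lock().unwrap();
    match &*guard {
        Some(host) => host.write_stdout(data),
        // Assumption: report an error rather than `unimplemented!()`.
        None => Err(io::Error::new(io::ErrorKind::Other, "no wasm host set")),
    }
}

/// Hypothetical host that collects output in memory.
pub struct BufferHost {
    pub buf: Arc<Mutex<Vec<u8>>>,
}

impl WasmHost for BufferHost {
    fn write_stdout(&self, data: &[u8]) -> io::Result<usize> {
        self.buf.lock().unwrap().extend_from_slice(data);
        Ok(data.len())
    }
}
```

A library providing host bindings would call `set_wasm_host` once during startup; until then every dispatch takes the fallback path, which matches the caveat that the host must be initialized manually.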
That seems like a good starting point for something we can experiment with now, out of tree. Nittiest of nitpicks: |
@aturon that sounds like a good starting point, but it doesn't in itself solve the backwards compatibility issues (i.e. we will want to add or change methods on this trait, even after we stabilise it). I have a couple of ideas to solve this:
Another thing we should do, regardless of which of these we choose, is make a "no-op" implementation of these traits public, so that it can be deferred to for methods you don't want to implement. We can also use this no-op implementation instead of branching on an |
Indeed! That wasn't the goal so much as to:
I think stabilization is still a ways off; right now I'm just trying to make experimentation as easy as possible. We can't use a no-op implementation rather than |
I started writing a comment yesterday and threw it away once I considered the downsides, but I'll pull it back out again: It would be neat if instead of defining all syscalls on a trait, we made the trait return optional capabilities. Something like:
trait WasmHost: Send + 'static {
    fn stdout(&self) -> Option<&'static RefCell<Box<io::Write>>>;
    // etc.
}
This would allow us to add new capabilities with default implementations that return The big downside, and why I binned the comment, is that this would presumably require re-writing a ton of |
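To make the optional-capabilities idea concrete, here is a minimal sketch. It simplifies the signature above by returning an owned `Box<dyn Write + Send>` rather than a `&'static RefCell<...>`, and it assumes that an absent capability is reported as `None`; `SharedBuf`, `StdoutOnlyHost`, and `write_to` are hypothetical names introduced only for this example.

```rust
use std::io::{self, Write};
use std::sync::{Arc, Mutex};

pub trait WasmHost: Send + 'static {
    // Each capability has a default, so adding `stderr` later
    // does not break hosts written before it existed.
    fn stdout(&self) -> Option<Box<dyn Write + Send>> {
        None
    }
    fn stderr(&self) -> Option<Box<dyn Write + Send>> {
        None
    }
}

/// Hypothetical in-memory writer shared between the host and the caller.
#[derive(Clone, Default)]
pub struct SharedBuf(pub Arc<Mutex<Vec<u8>>>);

impl SharedBuf {
    pub fn contents(&self) -> Vec<u8> {
        self.0.lock().unwrap().clone()
    }
}

impl Write for SharedBuf {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        self.0.lock().unwrap().extend_from_slice(data);
        Ok(data.len())
    }
    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Hypothetical host that only knows about stdout.
pub struct StdoutOnlyHost {
    pub out: SharedBuf,
}

impl WasmHost for StdoutOnlyHost {
    fn stdout(&self) -> Option<Box<dyn Write + Send>> {
        Some(Box::new(self.out.clone()))
    }
}

/// Write through a capability, or report that the host does not provide it.
pub fn write_to(cap: Option<Box<dyn Write + Send>>, data: &[u8]) -> io::Result<()> {
    match cap {
        Some(mut writer) => writer.write_all(data),
        None => Err(io::Error::new(
            io::ErrorKind::Unsupported,
            "capability not provided by host",
        )),
    }
}
```

Here `StdoutOnlyHost` compiles without mentioning `stderr` at all, which is the backwards-compatibility property being argued for; the cost noted in the comment, adapting the code that would consume these capabilities, is not addressed by this sketch.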
I've mentioned it elsewhere (here and linked issues), but I'll note it on this issue too - xargo/cargo sysroots seem like a better solution to this problem than a constantly-changing struct with arguments (edit: meaning when people want to make a PR for syscall X on the trait but another group wants it implemented in another way) about what needs changing next. That said, I recognise that people are looking for a solution yesterday and sysroots are a way off. |
I believe we've done this over time by saying "the unknown-unknown target can make no assumptions", so I'm going to close this. |
At the moment, the wasm32-unknown-unknown target assumes nothing about its host environment. That means that std, as it stands, cannot even print to stdout.
There's some ongoing debate on a Rust PR about whether and how to approach this issue. I wanted to open an issue here just to get more visibility, but please comment on the linked PR.