Expose memory from generated JavaScript #2102
Comments
I think the current behavior is good, because the first time it's instantiated it will be the "main thread", whereas the other instantiations are the "threads". If you send the module + memory to the other workers, you can get a race condition where sometimes the worker will finish first (and thus it will be the "main thread"), and sometimes it'll finish after (and thus it will be a "thread"). So even if you were able to spawn everything in parallel, you'd still have to guarantee that the main thread is instantiated and run first (before instantiating any of the workers). And this does actually matter, because it affects the behavior of Incidentally, this behavior also matches real OS processes, where you run your main process and it then spawns sub-processes (or threads). |
Huh, wait, is main thread determined purely by order of instantiation?
Sure, but OS threads have different costs of instantiation. On the Web, being able to pre-instantiate a Wasm worker pool without actually running any useful code could be more beneficial, and it can still be made transparent to the user. |
Let's say that hypothetically it was determined by a flag or something, It's pretty much unavoidable that the main thread must be initialized first:
The only exception to this is if you're running the Wasm entirely in workers (no main thread at all), in which case it would be possible to initialize all the workers in parallel.
This can be done today, without any changes to wasm-bindgen. Note that when you use |
Which code are you referring to? AFAIK most of Wasm out there doesn't use As long as you're only instantiating modules in parallel, and not invoking that function yet, any code wouldn't execute yet and no UB would occur.
Unless I'm missing something new about the mechanism you're proposing, that's how existing examples (including the raytrace-parallel mentioned in the description) already work. What I'm proposing is building on top of that and providing further optimisation by ensuring that the JS (Wasm) engine can actually instantiate the module as soon as possible and not wait for the main thread to finish initialisation. |
Note that this also doesn't require any substantial changes to wasm-bindgen or how it operates; all I'm asking for is to |
The It has nothing to do with the
No, because the raytrace example only spawns the worker pool after the main thread is instantiated, whereas the strategy I posted allows you to spawn the worker pool in parallel with the instantiation.
As I explained, that's only possible if all of your Wasm is running in workers (no main thread).
It has nothing to do with how substantial the changes are, what you are asking for is just not compatible with how the main thread works. The only way your proposal could work is if we added a new |
Well, it's up to me as an app developer not to do that. I feel you're conflating module instantiation in JS API sense (literally just I want to parallelise the former, which wouldn't cause any of the mentioned issues (as long as The JavaScript, and, correspondingly, Rust code would still execute in the same order as it does now and thus such change would be completely unobservable, while improving startup time for additional threads. |
To be clear, the In addition, you seem to have ignored what I showed earlier:
If there is a race condition, the main thread will throw an exception, because the main thread cannot block. This happens even if you don't call any of the Rust functions. It is simply not possible to initialize the main thread and worker threads in parallel, because of these race conditions.
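The point that the main thread cannot block can be probed from JavaScript. This is a minimal sketch that assumes only the standard `Atomics.wait` API; `canBlock` is a hypothetical helper name and not part of wasm-bindgen. Browsers throw a `TypeError` when `Atomics.wait` is called on the main thread, while workers (and Node.js) permit it.

```javascript
// Hypothetical helper: reports whether the current thread is allowed to
// block in a futex-style wait on shared memory.
function canBlock() {
  const cell = new Int32Array(new SharedArrayBuffer(4));
  try {
    // The expected value (0) matches and the timeout is 0 ms, so on threads
    // that may wait this returns "timed-out" immediately.
    Atomics.wait(cell, 0, 0, 0);
    return true;
  } catch (e) {
    // Browser main threads reject the call instead of blocking.
    return false;
  }
}
```

A thread for which `canBlock()` is `false` is one that must never end up on the waiting side of a lock, which is the situation described for the main thread above.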
No, I have always been referring to WebAssembly instantiation. Even just calling |
Note that Overall, I'm somewhat surprised by this attitude and the responses so far, to be honest :/ I'm fully in control of my app and I know what I'm doing there. All I'm asking is to expose an already existing primitive that would help me and possibly other use-cases. Moreover, as I said, it doesn't require any changes to how wasm-bindgen works and requires explicit usage / opt-in by the developer (sort of like Rust's opt-in). I've described my motivation in the issue description just to provide one of the use-cases where it was handy, but instead we're now discussing a particular app architecture and WebAssembly in general at length. I'm always happy to do that sometime at a conference afterparty (if we still get those someday 😅), but it doesn't seem right for a GitHub issue. Note that if this primitive is not provided, it's not that the usages will stop, it's just that now developers will have to resort to guessing or hardcoding the limits when manually constructing In the future the situation might change when the js-types API is widely implemented (for now we have it only under a flag in V8), and users will be able to access any level of detailed information about module imports/exports directly, but meanwhile exposing it from wasm-bindgen would make the task much easier and more robust than any hardcoded numbers. |
Yes, there will only be one thread that initializes the memory, but that's not the problem.
And my point is that it's impossible to avoid race conditions and use it correctly. It doesn't matter how much control you have, or how skilled you are, it is impossible to avoid the race condition.
Of course I would have no problem with exposing the shared memory... if it actually worked. But it doesn't.
I don't know why you got that impression, everything I have said is about this specific proposal and why it doesn't work.
If somebody does that, they will have the exact same race condition, and they will get errors. You seem to be misunderstanding how this all works. So let's suppose we made the changes you want, here is what would happen:
You'll note that this situation will happen even if you don't call any of the exported functions, because the problem is caused by It is simply not safe to instantiate workers in parallel with the main thread, period. You will cause a race condition, and it will end up sporadically causing errors. No matter what you do, you cannot stop this. If you instantiate in parallel and haven't run into any errors yet, that's just pure luck, because this race condition always exists. The only solution is to instantiate the module on the main thread first, before instantiating it on the workers. Any other solution will require changes in LLVM, because Rust uses LLVM for its multi-threading support, and it's LLVM that defines the |
It still sounds more like discussing the specific usecase I provided, rather than the proposal in general (exposing memory). But okay, reasonable enough, thanks for bearing with me and for your explanations. I just didn't expect the discussion to go that deep :) Let's backtrack a bit and look at a simpler usecase / architecture that hopefully won't be controversial. I'll take yours as a base:
Let's say I want to perform step (3) here. How do I do that? Well, turns out, the only way is, again, to do this only from the WebAssembly side because it's the only one that has bindings to memory, even though actual memory lives on JS side. So if I want to This is quite a bit of unnecessary indirection to return JS value to JS code. Moreover, in this case we do have to execute not only |
Because wasm-bindgen doesn't expose the memory before instantiation, this forces you to instantiate the module before instantiating it on the workers. The only reason to expose the memory is in order to instantiate it in parallel with the workers, which is not safe. So I think it is relevant to the proposal in general.
I don't think it's correct to think of it as "JS side" or "WebAssembly side". The memory is created and owned by wasm-bindgen. Whether it's a part of the "JS side" or "WebAssembly side" is just an implementation detail which can change. And in the future it likely will change.
You can actually access the memory by accessing the

    const wasm = await init();
    const memory = wasm.__wbindgen_export_0;

That's an implementation detail though, so I really don't recommend doing that. Instead the best option is indeed to just create an export in Rust:

    use wasm_bindgen::prelude::*;

    // Returns a handle to this module's WebAssembly.Memory object.
    #[wasm_bindgen]
    pub fn shared_memory() -> JsValue {
        wasm_bindgen::memory()
    }

If you're suggesting that wasm-bindgen should make it easier to access the memory from an instance without needing a Rust export, I agree. However we'll need to be careful about name collisions. Right now wasm-bindgen is actually quite bad about name collisions (in general). I have some ideas about that, but it would be a breaking change.
I suppose, though it's really just a couple nanoseconds, the overhead of passing is extremely small.
I'm not sure what JS / Rust initialization you're referring to: the JS glue code just calls It really doesn't do anything (unless you use |
Yeah, that's what I'm suggesting. I think the same would apply to a few other things (e.g. module or some implicit imports), but these are out of scope of this issue.
Huh, I thought in case of imported memory it's not exported. This could help, but yeah, I'm not comfortable relying on it in any way.
I wonder if simply changing the export name to That's the export name LLVM already uses by default (any non-threaded outputs), so presumably there is very low risk of it conflicting with anything else. |
Okay, though that won't fix your original request.
It already is called So yeah, hardcoding it as |
Sorry I've only skimmed the discussion here a bit, but the original issue should be solved by importing the Other than that I think @Pauan already covered this but you probably don't want to instantiate modules in parallel across workers and the main thread because the main thread cannot block. In general though I don't think this is necessarily something that wasm-bindgen itself would handle (if you'd like to do this by instantiating in parallel across only workers). It's pretty easy for a wasm tool to switch the memory export to a memory import and then you'd know the limits to create in JS and then pass as imports everywhere. |
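Once the memory is an import rather than an export, JavaScript can at least discover that import before instantiation. The following is a sketch under stated assumptions: the bytes are a hand-assembled stand-in module (not wasm-bindgen output) whose only content is a shared memory import `env.memory` with limits of 1 to 2 pages, and `findMemoryImport` is a hypothetical helper. Note that `WebAssembly.Module.imports` reports only the module name, field name and kind, not the limits, which is why the limits still have to be known some other way.

```javascript
// Stand-in module (assumption): imports shared memory "env"."memory".
const importOnlyBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm", version 1
  0x02, 0x10, 0x01,                               // import section, 16 bytes, 1 import
  0x03, 0x65, 0x6e, 0x76,                         // module name "env"
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79,       // field name "memory"
  0x02, 0x03, 0x01, 0x02,                         // memory, flags shared+max, min 1, max 2
]);

// Hypothetical helper: returns the first memory import descriptor, if any.
function findMemoryImport(wasmModule) {
  return WebAssembly.Module.imports(wasmModule).find(
    (imp) => imp.kind === "memory"
  );
}

const wasmModule = new WebAssembly.Module(importOnlyBytes);
const memoryImport = findMemoryImport(wasmModule);
```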
This would be sufficient, but it seems like right now it's renamed to |
Ah ok makes sense! That sounds like a bug in one of the possible passes along the way, likely something that's de-exporting it and then re-exporting it later, forgetting the original export name along the way. |
@Pauan I didn't immediately realise what you meant by this. It gets me much closer, but yeah, doesn't resolve the original request. To recheck my understanding of the previous comments - are you saying that using memory in the way I want to would be fine in the following scenario:
? If so, then it's perfectly sensible for me (I'm already using a Worker for the "main" thread for other reasons), but yeah, it still requires exposing Wasm memory limits from JS code somehow. |
Yes, at least for now. In the future there will be other differences between the main thread and workers, but for now it should be safe to run only on workers[1]. That's why earlier I suggested a
|
Sounds good; do you think it's worth closing this issue in favour of two separate issues for |
That sounds good to me. |
@Pauan I've submitted an issue for the memory export above, but I wonder if |
Motivation
Currently, Wasm memory is accessible only from inside the Wasm module (Rust code) via bindings but not exposed to JavaScript.
This means that examples like raytrace-parallel have to wait for instantiation to finish and for corresponding code to be reached, before sending the module + memory pair to another Worker.
Module is the easy bit, as it can be constructed by the user via the manual WebAssembly.compileStreaming API, but constructing the memory requires knowing the limits defined by the module.
At the same time, the memory object generated by wasm-bindgen for the threaded case doesn't require full module instantiation, so exposing it from JS could allow an application to instantiate Wasm over the same SAB in all modules in parallel, without waiting for the main thread to finish.
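What "the same SAB in all modules" amounts to can be sketched as follows. The assumptions are loud ones: the bytes are a hand-assembled stand-in module (not wasm-bindgen output) that only imports a shared memory `env.memory` with limits of 1 to 2 pages, the limits passed to `WebAssembly.Memory` are hardcoded to match it, and both instantiations happen on one thread purely for brevity.

```javascript
// Stand-in module (assumption): imports shared memory "env"."memory".
const stubBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm", version 1
  0x02, 0x10, 0x01,                               // import section, 16 bytes, 1 import
  0x03, 0x65, 0x6e, 0x76,                         // module name "env"
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79,       // field name "memory"
  0x02, 0x03, 0x01, 0x02,                         // memory, flags shared+max, min 1, max 2
]);
const compiled = new WebAssembly.Module(stubBytes);

// The limits have to be known up front; here they are simply hardcoded.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 2, shared: true });

// Two instances over the same memory. A real app would post `compiled` and
// `memory` to workers and instantiate there.
const first = new WebAssembly.Instance(compiled, { env: { memory } });
const second = new WebAssembly.Instance(compiled, { env: { memory } });

// Guessing the limits wrong is rejected at link time.
let mismatchRejected = false;
try {
  const tooBig = new WebAssembly.Memory({ initial: 1, maximum: 4, shared: true });
  new WebAssembly.Instance(compiled, { env: { memory: tooBig } });
} catch (e) {
  mismatchRejected = true;
}
```

The last step is the practical cost of hardcoding: if the module's declared limits change in a later build, a guessed `maximum` stops linking.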
Proposed Solution
Export a memory factory function from the generated JavaScript code, so that the user can construct the memory and pass it manually via the existing 2nd param to init.
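How such a factory might be used can be sketched roughly as below. Everything named here is an assumption for illustration: `createMemory` stands in for the proposed factory (wasm-bindgen does not export a function by that name), `spawnOverSharedMemory` is a hypothetical helper, and the workers are expected to call the generated init with the received pair themselves. Following the caveat raised in the comments, the pair is handed to workers only.

```javascript
// Hypothetical helper: creates the memory once, before any instantiation,
// and hands the same module + memory pair to every worker.
function spawnOverSharedMemory({ wasmModule, createMemory, workers }) {
  // `createMemory` is a stand-in for the factory this issue proposes.
  const memory = createMemory();
  for (const worker of workers) {
    // Each worker would then run something like:
    //   onmessage = ({ data }) => init(data.wasmModule, data.memory);
    worker.postMessage({ wasmModule, memory });
  }
  return memory;
}
```

Because the factory is injected, the helper does not need to know the memory limits; that knowledge stays with whatever generated the glue code.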