-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix pthreads with closure compiler #9569
Conversation
This reverts commit c702690.
I haven't been using pthreads with emscripten so far, only the old worker thread approach. Sorry :) |
I am not sure how to react to this PR. It is great to be Closure compatible, but being so in a way that requires exporting everything on |
@juj I agree that there is a code size downside to exporting on Module, and that generally we should avoid it as much as possible. I think it make sense here for two reasons. First, this code is highly complex, and getting more so: the key thing in our pthreads code is that it is invoked by
Exporting on Module makes all those cases much simpler and uniform, we just use the Module object to communicate between the different scopes. That is the normal meaning of Module, its the interface between the optimized code and the outside. The second reason I think exporting on Module makes sense here is that we only have a small number of very specific things that need that exporting. So the additional code size is quite small. Some numbers on hello world with pthreads and
So this adds only 77 bytes, while also making the code simpler + making it possible to run closure and decrease size by 50% :) Note that the simplification makes it possible to clean up some unnecessary things, which actually help code size. More is possible there but I didn't want to do too much in a single PR. |
.circleci/config.yml
Outdated
@@ -239,7 +239,8 @@ commands: | |||
- run: | |||
name: download chrome | |||
command: | | |||
wget -O ~/chrome.deb https://dl.google.com/linux/direct/google-chrome-beta_current_amd64.deb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this required for this PR, or just because of unrelated CI issues at the same time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, thanks, this was a cherry-pick of a temporary workaround. Will remove.
|
||
#if WASM | ||
// The Wasm module will have import fields for STACKTOP and STACK_MAX. At 'load' stage of Worker startup, we are just |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a lot of explanatory text we are removing. Is it completely outdated? If not, is there a better place to put it, or an equivalent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's mostly outdated, but yeah, some of the background might still be useful, I'll add it.
I think the simplifications are worth exporting a few more bytes. Especially if we can make further ones in the future. |
That extra complexity is no longer needed after recent simplifications. This ends up making the pthreads and non-pthreads paths more similar too. This saves 338 bytes, which more than makes up for the slight regression of 77 bytes from #9569.
…uced in emscripten-core#9569. That is same as JS library function establishStackSpaceInJsModule().
…uced in emscripten-core#9569. That is same as JS library function establishStackSpaceInJsModule().
* Remove preamble function Module['applyStackValues']() that was introduced in #9569. That is same as JS library function establishStackSpaceInJsModule(). * Fix establishStackSpace() for wasm backend
Pthreads get the current time from their parent thread during startup, so they can return results that make sense relative to it. Without that, the time on a pthread can be "earlier" than the time on the main thread, even if the pthread checks the time later. This broke in #9569 when we removed the magic globals from worker.js, as now such values must be passed explicitly. So that global just had the initial 0 value. The fix is to pass the value on Module and use it from there, safely. This adds a test. Timing tests are always risky, but this just checks monotonicity of the time, and looks 100% consistent when I run it locally, so hopefully it's ok. If we see flakes on CI we may need to remove it though. Fixes #9783 Co-authored-by: Alon Zakai <alonzakai@gmail.com>
Turns out much of the pthreads runtime was not closure-safe. This PR fixes that, using quoted strings as closure expects. While doing that I noticed that some of the things in place for pthreads + MODULARIZE would also help closure. In both cases the key thing is that worker.js, that loads the main JS file, is separate from it, so it needs proper interfaces and can't just rely on global variables (which in closure are minified in the main JS, and in MODULARIZE are in another scope). So I removed things like makeAsmExportAccessInPthread, makeAsmGlobalAccessInPthread, makeAsmExportAndGlobalAssignTargetInPthread which makes the code simpler. Because of that I was also able to remove a bunch of globals from worker.js which further simplifies things, and should also help node.js workers (where the global scope is very different, emscripten-core#6567). Most of this PR is straightforward according to the above notes. One slightly tricky thing is the stack setup, which I refactored to use a new Module.applyStackValues method that worker.js calls at the right time (as the stack is set up after the JS loads). Code size effects: Before this PR: 69,567 bytes. After this PR: 69,644. After this PR, plus closure: 34,909. So this adds only 77 bytes, while also making the code simpler + making it possible to run closure and decrease size by 50%. In addition, I have followup work to further reduce the non-closure code size too, so this regression will be fixed (and more).
That extra complexity is no longer needed after recent simplifications. This ends up making the pthreads and non-pthreads paths more similar too. This saves 338 bytes, which more than makes up for the slight regression of 77 bytes from emscripten-core#9569.
…e#10042) * Remove preamble function Module['applyStackValues']() that was introduced in emscripten-core#9569. That is same as JS library function establishStackSpaceInJsModule(). * Fix establishStackSpace() for wasm backend
Pthreads get the current time from their parent thread during startup, so they can return results that make sense relative to it. Without that, the time on a pthread can be "earlier" than the time on the main thread, even if the pthread checks the time later. This broke in emscripten-core#9569 when we removed the magic globals from worker.js, as now such values must be passed explicitly. So that global just had the initial 0 value. The fix is to pass the value on Module and use it from there, safely. This adds a test. Timing tests are always risky, but this just checks monotonicity of the time, and looks 100% consistent when I run it locally, so hopefully it's ok. If we see flakes on CI we may need to remove it though. Fixes emscripten-core#9783 Co-authored-by: Alon Zakai <alonzakai@gmail.com>
Turns out much of the pthreads runtime was not closure-safe. This PR fixes that, using quoted strings as closure expects.
While doing that I noticed that some of the things in place for pthreads + MODULARIZE would also help closure. In both cases the key thing is that
worker.js
, that loads the main JS file, is separate from it, so it needs proper interfaces and can't just rely on global variables (which in closure are minified in the main JS, and in MODULARIZE are in another scope). So I removed things likemakeAsmExportAccessInPthread, makeAsmGlobalAccessInPthread, makeAsmExportAndGlobalAssignTargetInPthread
which makes the code simpler.Because of that I was also able to remove a bunch of globals from
worker.js
which further simplifies things, and should also help node.js workers (where the global scope is very different, #6567).Most of this PR is straightforward according to the above notes. One slightly tricky thing is the stack setup, which I refactored to use a new
Module.applyStackValues
method thatworker.js
calls at the right time (as the stack is set up after the JS loads).cc @wffurr @VirtualTim