Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implements Promise-based API for modularized builds #10697

Merged
merged 41 commits into from
May 8, 2020
Merged

Implements Promise-based API for modularized builds #10697

merged 41 commits into from
May 8, 2020

Conversation

lourd
Copy link
Contributor

@lourd lourd commented Mar 16, 2020

This removes the Module#then function and replaces it with a proper
Promise being returned from the initialization function.

Fixes #5820

As part of followup work I'll be updating the types repo as well, making a separate interface for the module factory function, which should make the EmscriptenModule interface simpler to use and extend.

emcc.py Outdated Show resolved Hide resolved
@lourd lourd changed the title Begins work on promise-based API when modularized Implements Promise-based API for modularized builds Mar 17, 2020
This removes the `Module#then` function and replaces it with a proper
Promise being returned from the initialization function.

Fixes #5820
@lourd
Copy link
Contributor Author

lourd commented Mar 18, 2020

The test failures are all related to threaded builds. In debugging the tests, it seems that some aspect of the changes are causing the threads to never stop. I'm seeing console warnings like:

Pthread 0x0 completed its pthread main entry point with an unwind, keeping the pthread worker alive for asynchronous operation.

I'm also unsure of how to deal with the MODULARIZE_INSTANCE case which hits this codepath

emscripten/emcc.py

Lines 3405 to 3412 in b4ec2f4

# Create the MODULARIZE_INSTANCE instance
# Note that we notice the global Module object, just like in normal
# non-MODULARIZE mode (while MODULARIZE has the user create the instances,
# and the user can decide whether to use Module there or something
# else etc.).
src = '''
var %(EXPORT_NAME)s = (%(src)s)(typeof %(EXPORT_NAME)s === 'object' ? %(EXPORT_NAME)s : {});
''' % {

which executes the exported function, changing what is later expected to be the WASM Module instance into a Promise.

I'm a little lost on all of the build options and the mechanics of WASM pthreads, so any help would be greatly appreciated.

Also removes exports_object reference from emcc.py. It's probably not
correct, but highlights that it's not being used if a promise is always
returned.
@kripken
Copy link
Member

kripken commented Mar 18, 2020

About those pthread errors, I would check to see if you also get them without this PR - it's possible those tests do actually exit their main with an unwind (as the error says) intentionally.

About MODULARIZE_INSTANCE, it does seem that if the modularize function returns a promise, then the singleton instance should be a promise. (These are changes we'll need to document carefully, but if everyone is on board, then sounds good to me, despite the risk of breakage for some users.)

@lourd
Copy link
Contributor Author

lourd commented Mar 18, 2020

Thanks @kripken, I'll check. Regardless of the presence of the warnings, the pthread-related tests are failing because this change now makes the workers/threads go into an infinite loop without reporting back results to the test server.

Understood about the MODULARIZE_INSTANCE impact. I'll make the changes to the code and docs accordingly.

@lourd
Copy link
Contributor Author

lourd commented Apr 15, 2020

Alright, I think I've handled all of the cases now, please re-review @kripken. The only tests I'm seeing failing on some targets in CI are a mismatch about the expected code sizes for test_minimal_runtime_code_size, and test_webidl. Those are passing for the heavy majority of test targets though, and I'm not able to reproduce either on my local machine. Please let me know if/how I should handle those.

The documentation will definitely need to be updated -- should I do that as part of this PR?

emcc.py Outdated
src = '''
var %(EXPORT_NAME)s = (%(src)s)(typeof %(EXPORT_NAME)s === 'object' ? %(EXPORT_NAME)s : {});
var %(EXPORT_NAME)s%(promise_variable_suffix)s = (%(src)s)(typeof %(EXPORT_NAME)s === 'object' ? %(EXPORT_NAME)s : {});
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Highlighting that this introduces a new variable when using MODULARIZE_INSTANCE -- Module_promise, or {EXPORT_NAME}_promise. This was necessary because we don't want to overwrite the Module variable with a Promise, we only want to override it with the actual instance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about Module.promise, a property, instead of a completely new global?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This harkens back to the discussion from #5820 where we decided to go with a function that returns a Promise for modularized builds, rather than a making a new property like ready or promise on Module. I think adding a new property just in the MODULARIZE_INSTANCE case would be more confusing for users than adding a new global.

If you want to add a property in this case, I think we should go with that design for all cases, even non-modularized ones, removing onRuntimeInitialized.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting... Well, I admit I don't remember the full discussion there, but maybe let's go back to everyone there and make sure this is what people wanted?

To me it seems kind of awkward to have Module, Module_promise globals. We should really have just one global in modularize mode, as one goal is to minimize global stuff. And, Module_promise seems like a typo of Module.promise to me personally...

Copy link
Contributor Author

@lourd lourd Apr 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree on the awkwardness. I don't think it's that much more confusing than the existing behavior though -- that Module might be a factory function or an instance, depending on the build options. Or that Module is first used as the seed, and then gets overwritten as the instance. At least with this behavior, Module is just the seed. Since this isn't the default behavior, I think it'll be ok with clear docs.

To be clear Module_promise variable will only exist for MODULARIZE_INSTANCE, when EXPORT_ES6 isn't used. It will not exist for MODULARIZE. Do you have a sense of how much MODULARIZE_INSTANCE is used?

Copy link
Contributor

@curiousdannii curiousdannii Apr 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The terminology has always been awkward. There's always been two "modules" in MODULARIZE mode, which doesn't really modularize, it's a factory!

In regular MODULARIZE mode, you don't get the internal Module until the promise resolves, right? It should be the same for MODULARIZE_INSTANCE. So it can't be a property.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I see... are you saying that Module doesn't have a value until Module_promise resolves? So initially, in MODULARIZE_INSTANCE, Module is undefined while Module_promise is a promise?

How about if it does have an initial value of

var Module = {
  promise: ..the value of Module_promise..
};

and there is no Module_promise global? That seems backwards compatible + avoids a new global, but I may be missing something...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. Module only has a value if the user has defined it. If it exists it's used as the argument to the factory function, seen here. If the user never provides it, Module is never defined. The only global is Module_promise.

I don't think we could provide an initial value for Module given that behavior.

We could also change the suffix to Promise instead of _promise to keep with the camelCasing convention.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little unclear on if that description is about how it's always been, or if it's a new limitation in this PR?

AFAIK, there wasn't such a problem that required a new Module_promise before, so I'd guess it's new? If so, why is it needed? Before we also had to be able to access something, and it worked somehow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a description of the existing behavior. It hasn't been highlighted before because the factory function executed synchronously, so the overwriting of any existing Module global variable wasn't as obvious, and accessing properties on it wasn't as difficult.

src/postamble.js Show resolved Hide resolved
@lourd
Copy link
Contributor Author

lourd commented May 8, 2020

Sounds good. I'm pretty experienced with the various JS module systems, and I feel confident saying this doesn't change anything regarding how users will import the generated code through any of them.

Without knowing more about Binaryen, it appears that may have been able to be resolved by using MINIMAL_RUNTIME, which prevents the export footer from being added to the code.

What else is needed to land this change?

@curiousdannii
Copy link
Contributor

curiousdannii commented May 8, 2020

Reminds me of a suggestion I made before, to deprecate EXPORT_ES6 and introduce EXPORT_METHOD with none, umd, and esm options (perhaps in the future moving to esm being the default).

Sorry if this has been covered before, but this PR only fixes promises for MODULARIZE mode right? Promises will need fixing for non-MODULARIZE mode too. We can do that as a followup PR, unless it would be simpler to do it as part of this one? (But probably not, this one has been a lot of work and deserves being merged as soon as it can be.)

@lourd
Copy link
Contributor Author

lourd commented May 8, 2020

All good ideas eventually see their day 😄

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think this looks good, thanks! Just a few minor comments below.

And one last important thing is documentation. Please mention this in ChangeLog.md, hopefully with a quick explanation of what people need to change if they are using this option - if that's can't be concise enough, then a link to src/settings.js perhaps. And either way that file should be updated (including what changed etc.).

Also the website docs have 3 files mentioning MODULARIZE,

site/source/docs/tools_reference/emcc.rst
site/source/docs/porting/connecting_cpp_and_javascript/WebIDL-Binder.rst
site/source/docs/getting_started/FAQ.rst

Looks like the last two mention the Promise, so they may need updating (not sure about the first). In particular the FAQ part with Be careful with mentions the bug that this fixes, so that should be removed.

There is still a small code size increase here, but given the major issues this fixes, that pings have not gotten any worries reported, and that it actually decreases code size in the non-minimal case (showing it's a better design), I think this is worth it. If this is an issue we can consider other options later.

src/postamble.js Show resolved Hide resolved
@@ -427,9 +407,17 @@ function exit(status, implicit) {
// if exit() was called, we may warn the user if the runtime isn't actually being shut down
if (!implicit) {
#if EXIT_RUNTIME == 0
err('program exited (with status: ' + status + '), but EXIT_RUNTIME is not set, so halting execution but not exiting the runtime or preventing further async execution (build with EXIT_RUNTIME=1, if you want a true shutdown)');
var msg = 'program exited (with status: ' + status + '), but EXIT_RUNTIME is not set, so halting execution but not exiting the runtime or preventing further async execution (build with EXIT_RUNTIME=1, if you want a true shutdown)';
err(msg);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the err needed with MODULARIZE? that is, could it be in an else of the ifdef? (same also below)

Copy link
Contributor Author

@lourd lourd May 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure. As far as I can tell, err normally just prints the string? Except here

err = function(text) { post('^err^'+(emrun_http_sequence_number++)+'^'+encodeURIComponent(text)); prevErr(text); };

So maybe err should be happening regardless of MODULARIZE?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've never seen that emrun code before, but looks like it's doing something really weird there, which is not a problem for us.

So yes, err should just log to "stderr" basically. And so I think the current code will end up both printing the error and also rejecting the Promise, which as you said above would get printed out anyhow. I think it would be better to put it behind the if-else. But we can leave that for this PR I guess, I'd like to do a follow on this myself, and I can do it there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good

src/shell.js Outdated Show resolved Hide resolved
@lourd
Copy link
Contributor Author

lourd commented May 8, 2020

Alright, docs and changelog are updated.

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Ok, I think this is all good except I don't see a settings.js doc change?

Also, about ChangeLog.md, I imagine a user might see that and not be sure how to check if they're doing something that will break with this change, or not break (and if so, what they need to do). And the link goes to this PR, which is far too long for that person to read.

As I suggested earlier, one option might be to put such a "if you have this, you must change it to this" explanation in the Changelog itself. If it's too big, then in settings.js and add a reference to see that example there.

@@ -427,9 +407,17 @@ function exit(status, implicit) {
// if exit() was called, we may warn the user if the runtime isn't actually being shut down
if (!implicit) {
#if EXIT_RUNTIME == 0
err('program exited (with status: ' + status + '), but EXIT_RUNTIME is not set, so halting execution but not exiting the runtime or preventing further async execution (build with EXIT_RUNTIME=1, if you want a true shutdown)');
var msg = 'program exited (with status: ' + status + '), but EXIT_RUNTIME is not set, so halting execution but not exiting the runtime or preventing further async execution (build with EXIT_RUNTIME=1, if you want a true shutdown)';
err(msg);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've never seen that emrun code before, but looks like it's doing something really weird there, which is not a problem for us.

So yes, err should just log to "stderr" basically. And so I think the current code will end up both printing the error and also rejecting the Promise, which as you said above would get printed out anyhow. I think it would be better to put it behind the if-else. But we can leave that for this PR I guess, I'd like to do a follow on this myself, and I can do it there.

@lourd
Copy link
Contributor Author

lourd commented May 8, 2020

Oops, forgot settings.js -- updated now and tweaked the changelog description.

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks so much @lourd ! I know this was a long process, very grateful for your work and patience here.

Also thanks @curiousdannii and everyone else that commented here and in the issues before this!

@kripken kripken merged commit 423a77f into emscripten-core:master May 8, 2020
@curiousdannii
Copy link
Contributor

Really good docs in settings.js too, thanks!

@juj
Copy link
Collaborator

juj commented May 9, 2020

Wait.. what happens here? This PR regresses code size of MINIMAL_RUNTIME, and adds a new Module['ready'] core property to MINIMAL_RUNTIME that should have Module deprecated. Neither should be acceptable for MINIMAL_RUNTIME. I am sorry that I did not make time to review this, but I don't think this should have landed?

@kripken
Copy link
Member

kripken commented May 9, 2020

@juj I think you're already seen #11116 but that should remove the new overhead in the main case (I left it to a followup as this PR was already very long).

kripken added a commit that referenced this pull request May 12, 2020
Get MINIMAL_RUNTIME + MODULARIZE code size back to
where it was before that PR, when emitting HTML: the HTML in that
mode will create a single instance of the app and run it automatically.
It does so without needing the Promise, so we don't need to emit it.
Diffing the output JS to before #10697 shows essentially no difference.

Also move the changelog entry to the right place (we had a release
meanwhile), expand it a little, and ifdef some logging in postamble.js
(if we reject the Promise on an error, we log it out that way anyhow).
kripken added a commit that referenced this pull request May 13, 2020
…DULARIZE (#11118)

It used to be possible to do

Module().onRuntimeInitialized = ...

but with new MODULARIZE (#10697) Module() returns a Promise
instead. The Promise of course allows a property to be placed on it, but
nothing would happen, so it's potentially annoying for users. With
this change, when ASSERTIONS are on we will throw an error
with an explanation, something like

You are setting onRuntimeInitialized on the Promise object, instead
of the instance. Use .then() to get called back with the instance, see
the MODULARIZE docs in src/settings.js

Likewise, it used to be possible to do

var instance = Module();
// .. later ..
instance.exportedThing()

but again, the "instance" is a Promise now, and the exports aren't
there. This will crash, but ".. is not a function" is not that helpful.
With this change it will give an explanation of what's wrong.
sandersn pushed a commit to DefinitelyTyped/DefinitelyTyped that referenced this pull request May 15, 2020
* Updates Emscripten types with new `MODULARIZE` API

This corresponds with the changes to Emscripten in emscripten-core/emscripten#10697,
creating a type for the new Promise-based API of the factory function
generated when using `MODULARIZE`.

Removes the obsolete `then` method from EmscriptenModule and separates
the type for the factory function generated when using the `MODULARIZE`
build option from the rest of the `EmscriptenModule` interface. This
will make it easier to extend the interface.

Also removes the global `Module` variable declaration as this is not
applicable for every application. This was not ideal as it left a very
generic variable name in the global scope for any application that used
Emscripten, regardless of whether there was actually a global variable.
This now leaves it up to the user to declare their module instance as
appropriate for them.

Fixes the reference in sql.js, using the EmscriptenModule interface
directly.

* Runs prettier on the emscripten types
jjballano-qatium pushed a commit to jjballano-qatium/DefinitelyTyped that referenced this pull request Jun 16, 2020
* Updates Emscripten types with new `MODULARIZE` API

This corresponds with the changes to Emscripten in emscripten-core/emscripten#10697,
creating a type for the new Promise-based API of the factory function
generated when using `MODULARIZE`.

Removes the obsolete `then` method from EmscriptenModule and separates
the type for the factory function generated when using the `MODULARIZE`
build option from the rest of the `EmscriptenModule` interface. This
will make it easier to extend the interface.

Also removes the global `Module` variable declaration as this is not
applicable for every application. This was not ideal as it left a very
generic variable name in the global scope for any application that used
Emscripten, regardless of whether there was actually a global variable.
This now leaves it up to the user to declare their module instance as
appropriate for them.

Fixes the reference in sql.js, using the EmscriptenModule interface
directly.

* Runs prettier on the emscripten types
@curiousdannii curiousdannii mentioned this pull request Jul 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Undocumented Module.then causes infinite loop
4 participants