Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reboot EVAL_CTORS with the new wasm-ctor-eval #16011

Merged
merged 62 commits into from
Jan 14, 2022
Merged

Reboot EVAL_CTORS with the new wasm-ctor-eval #16011

merged 62 commits into from
Jan 14, 2022

Conversation

kripken
Copy link
Member

@kripken kripken commented Jan 13, 2022

This updates us to use Binaryen's new version of wasm-ctor-eval, which can now
do a lot more things, like eval just part of a function, eval to globals, etc. That plus
other changes on the emscripten side that move more things like sbrk into pure
wasm means that we can eval a lot more code.

Previously -Oz would enable EVAL_CTORS. That was pretty dangerous as
often it does not help code size. You really just need to run with the option and
then measure the code size change vs the startup speed improvement. So this
PR makes us no longer do anything automatically - you must manually build with
-s EVAL_CTORS.

A new mode EVAL_CTORS=2 is also added. This enables wasm-ctor-eval's
new --ignore-external-input flag, which ignores the environment, params to
main, etc. This is unsafe, and probably we should have separate options for
these things, but for now this seems useful for experimentation.

Tested by running all of wasm2 with EVAL_CTORS=2 enabled, and then
ignoring the failures that are expected (things reading from argv, for example).
Also I ran around 200,000 fuzzer iterations on binaryen.

Example results on ./emcc tests/hello_libcxx.cpp -O3:

mode wasm size(bytes)
normal 136625
EVAL_CTORS=1 136616
EVAL_CTORS=2 133059

The output on the last one is:

trying to eval __wasm_call_ctors
  ...success on __wasm_call_ctors.
trying to eval main
  ...partial evalling successful, but stopping since could not eval: call import: wasi_snapshot_preview1.fd_write
  ...stopping

It completely evals the ctors, and in main it evals some stuff, until it reaches
a call to print to stdout.

Fixes #15402

@kripken kripken requested a review from sbc100 January 13, 2022 20:51
@sbc100
Copy link
Collaborator

sbc100 commented Jan 13, 2022

Can you explain a little more about how that safe version (EVAL_CTORS=1) might not be desirable by default at some opt levels? How could it regress anything? I guess its could increase the size of your static data segment? Is that the only possible downside?

emcc.py Outdated
elif settings.ASYNCIFY:
# In Asyncify exports can be called more than once, and this seems to not
# work properly yet (see test_emscripten_scan_registers).
diagnostics.warning('emcc', 'disabling EVAL_CTORS due to asyncify')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I all these cases I think its better to just error our given contradictory settings. That is what we normally do.

@kripken
Copy link
Member Author

kripken commented Jan 13, 2022

Yes, that's right (unchanged from before this PR, but we didn't handle it well). It is just very possible that a small amount of code generates a large amount of changes to the data section, which ends up making the wasm potentially a lot bigger. However, that might still be useful in some cases that care more about startup time.

second = test(3000, 1000, 2000, 0xf, 0x8f5) # noqa; 1 will succedd
third = test(2000, 3000, 1000, 0xf, 0xf58) # noqa; 0 will succeed
print(first, second, third)
assert first < second and second < third, [first, second, third]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.assertLess(first, second)
self.assertLess(second, third)

# cmd += get_binaryen_feature_flags()
# check_call(cmd)
def eval_ctors(js_file, wasm_file, debug_info=False): # noqa
WASM_CALL_CTORS = '__wasm_call_ctors'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have emscripten.py:WASM_INIT_FUNC can we unify these?

args = ['--ctors=' + ctor, '--kept-exports=' + ctor]
if settings.EVAL_CTORS == 2:
args += ['--ignore-external-input']
logger.warning('ctor_evaller: trying to eval global ctors (' + ' '.join(args) + ')')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't really a warning right? Maybe just print(..

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or logger.info

@@ -1481,7 +1518,7 @@ def run_binaryen_command(tool, infile, outfile=None, args=[], debug=False, stdou
cmd += get_binaryen_feature_flags()
# if we are emitting a source map, every time we load and save the wasm
# we must tell binaryen to update it
if settings.GENERATE_SOURCE_MAP and outfile:
if settings.GENERATE_SOURCE_MAP and outfile and tool in ['wasm-opt', 'wasm-emscripten-finalize']:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only those tools support source maps. wasm-ctor-eval for example does not. I'll add a TODO

EVAL_CTORS
==========

Building with ``-s EVAL_CTORS`` will evaluate as much code as possible at
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should skip the space after -s these day in our docs. I know we are not consistent but its more concise and compatible (especially with cmake) so I think we should prefer it.

@kripken
Copy link
Member Author

kripken commented Jan 14, 2022

Thanks for the comments, I pushed commits to address them now.

@kripken kripken enabled auto-merge (squash) January 14, 2022 18:10
@kripken kripken merged commit 7240c3d into main Jan 14, 2022
@kripken kripken deleted the eval_ctors2 branch January 14, 2022 19:12
@emscripten-core emscripten-core deleted a comment Jan 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for fast startup by restoring pre-initialized memory
2 participants