Emscripten stack simplification #1870

Merged
merged 5 commits into from
Jan 16, 2019

Conversation

@kripken kripken commented Jan 15, 2019

This takes advantage of the recent memory simplification in emscripten, where JS static allocation is done at compile time. That means we know the stack's initial location at compile time, and can apply it. This is the binaryen side of that:

  • asm2wasm support for asm.js globals with an initial value var X = Y; where Y is not 0 (which is what the stack now is).
  • wasm-emscripten-finalize support for a flag to set the stack location, --stack-base=X, and remove the old code to import the stack's initial location.

@@ -75,11 +78,16 @@ int main(int argc, const char *argv[]) {
const std::string &argument) {
numReservedFunctionPointers = std::stoi(argument);
})
-.add("--global-base", "", "Where lld started to place globals",
+.add("--global-base", "", "Where lld should start to place static globals",
Member

This comment seems wrong. lld has already assigned all its globals before finalize is called.

Member Author

I see we pass the global base to both wasm-emscripten-finalize and to lld - is lld doing it in both bitcode linking and wasm object files? If so I can remove the flag to finalize.

Member

The only reason I can see that finalize needs to know this is because it wants to write "staticBump" to the metadata output about the binary. As long as we need staticBump, we will need to continue to pass this.
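
To make the dependency concrete, here is a minimal sketch of why a global base would be needed to report a "staticBump". It assumes staticBump means the number of bytes of static data measured from the global base; that definition, the `Segment` struct, and `computeStaticBump` are illustrative assumptions, not binaryen's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one data segment: start address and length.
struct Segment {
  uint64_t offset;
  uint64_t size;
};

// Assumption: staticBump is the distance from globalBase to the end of the
// highest data segment. Without globalBase, only the absolute end is known.
uint64_t computeStaticBump(const std::vector<Segment>& segments,
                           uint64_t globalBase) {
  uint64_t end = globalBase;
  for (const auto& s : segments) {
    assert(s.offset >= globalBase); // static data is placed at or above the base
    end = std::max(end, s.offset + s.size);
  }
  return end - globalBase;
}
```

For example, with segments at 1024 (100 bytes) and 1200 (56 bytes) and a global base of 1024, the sketch reports 232 bytes.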

Member Author

I see, thanks. OK, fixing the comment.

Options::Arguments::One,
[&globalBase](Options*, const std::string&argument ) {
globalBase = std::stoull(argument);
})
.add("--stack-base", "", "Where the stack begins",
Options::Arguments::One,
Member

Do we need this new argument? Won't the existing stack pointer assigned by lld be correct?

Member Author

We tell lld the size of the stack, but not its absolute location. Instead, finalize adds a wasm import of STACKTOP, and JS sends it over, and then the import gets assigned to the stack pointer lld created (which is mutable, so that way we couldn't assign it directly). This PR changes that, to just hardcode an initial value for the stack pointer, avoiding the import.

Member Author

Oh, are you saying, lld guesses the location of the stack (right after static allocations), and we can just leave it?

Even so, that wouldn't be right, as emscripten can add more static allocations after lld (in the JS compiler). Even without that, it seems safest to have one location that decides this stuff (Memory() in emscripten.py) and informs everything else.

Member

But in this new world, isn't the stack pointer already set by lld to the correct value? Does emscripten put stuff on the stack at startup? If so, doesn't it already have a way of moving the stack pointer by calling stackSave/stackRestore?

Perhaps this argument should be called "--initial-stack-pointer"?

I'm happy with the PR going in either way as I think it's a step forward; I'm just trying to see possible improvements going forward.

Member Author

lld's guess at the stack pointer location is incorrect due to static allocations from JS (see my last comment, may have been a race).

Emscripten won't use the stack from JS, as the wasm controls it - it needs to call into wasm to get the stack pointer and modify it. So it can modify it after the wasm is ready and the program started up.

Good idea to rename this to --initial-stack-pointer, done.

Member

@sbc100 sbc100 Jan 15, 2019

My understanding is that the memory layout used by lld/wasm-backend is

1.  zero page
2.  static initialized data
3.  bss data
4.  stack (grows down)
5.  heap 

I believe that emscripten adds global data after (3), which means it clobbers the end of the stack (since the stack grows down). So I believe the initial stack pointer set by lld (which grows down) will always be correct... but I'm not totally sure about all this right now.

Happy for this change to land as is, and we can try to simplify and possibly remove this argument.

Member Author

Yes, emscripten adds more stuff after 3. But it keeps the stack size fixed while doing so, so the initial stack location becomes higher - it pushes up the entire range reserved for the stack, top and bottom.
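
The arithmetic behind this can be sketched as follows. This is only an illustration of the layout discussed in this thread (static data, then a fixed-size stack that grows down), not emscripten's or lld's actual code; `StackRange`, `alignUp`, `placeStack`, and the 16-byte alignment are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical view of the region reserved for the stack.
struct StackRange {
  uint64_t low;  // lowest address reserved for the stack
  uint64_t high; // initial stack pointer; the stack grows down from here
};

// Round addr up to a power-of-two alignment (16 bytes is an assumption).
uint64_t alignUp(uint64_t addr, uint64_t align = 16) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (addr + align - 1) & ~(align - 1);
}

// staticEnd:   end of static data and bss as seen by the linker
// extraStatic: bytes of static allocation added afterwards (e.g. from JS)
// stackSize:   fixed size of the stack region
StackRange placeStack(uint64_t staticEnd, uint64_t extraStatic,
                      uint64_t stackSize) {
  uint64_t low = alignUp(staticEnd + extraStatic);
  return {low, low + stackSize};
}
```

With static data ending at 2048 and a 64 KiB stack, `placeStack(2048, 0, 65536)` gives an initial stack pointer of 67584. Adding 160 bytes of later static allocations, `placeStack(2048, 160, 65536)` moves both ends up by 160, to 2208 and 67744, while the size stays 65536. That is why a stack pointer computed before the extra allocations would be too low.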

@kripken kripken merged commit 777d33d into master Jan 16, 2019
@kripken kripken deleted the stack branch January 16, 2019 21:22
kripken added a commit to emscripten-core/emscripten that referenced this pull request Jan 16, 2019
Part of #7795. Depends on WebAssembly/binaryen#1870 (has a hardcoded binaryen port value for testing, before landing needs to be a new tag there)

This is the emscripten side of that PR:

* For asm2wasm, define STACKTOP/STACK_MAX in the asm.js code with the hardcoded value directly.
* For wasm-emscripten-finalize, send it using --initial-stack-pointer.

This saves 1 or 2 imports in the wasm, and the JS that sends them. Not a huge savings in code size (16 bytes in hello world), but the output is simpler and so is the compiler code.

* bump abi

* fix test_stack_varargs - in the wasm backend, 2K is not enough for the single printf, it overflows. double the stack and double the stack uses

* update binaryen port

4 participants