Emscripten stack simplification #1870
Changes from 3 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -36,6 +36,8 @@ using namespace cashew; | |
using namespace wasm; | ||
|
||
int main(int argc, const char *argv[]) { | ||
const uint64_t INVALID_BASE = -1; | ||
|
||
std::string infile; | ||
std::string outfile; | ||
std::string inputSourceMapFilename; | ||
|
@@ -46,7 +48,8 @@ int main(int argc, const char *argv[]) { | |
bool debugInfo = false; | ||
bool legalizeJavaScriptFFI = true; | ||
unsigned numReservedFunctionPointers = 0; | ||
uint64_t globalBase; | ||
uint64_t globalBase = INVALID_BASE; | ||
uint64_t stackBase = INVALID_BASE; | ||
Options options("wasm-emscripten-finalize", | ||
"Performs Emscripten-specific transforms on .wasm files"); | ||
options | ||
|
@@ -75,11 +78,16 @@ int main(int argc, const char *argv[]) { | |
const std::string &argument) { | ||
numReservedFunctionPointers = std::stoi(argument); | ||
}) | ||
.add("--global-base", "", "Where lld started to place globals", | ||
.add("--global-base", "", "Where lld should start to place static globals", | ||
Options::Arguments::One, | ||
[&globalBase](Options*, const std::string&argument ) { | ||
globalBase = std::stoull(argument); | ||
}) | ||
.add("--stack-base", "", "Where the stack begins", | ||
Options::Arguments::One, | ||
Do we need this new argument? Won't the existing stack pointer assigned by lld be correct?

We tell lld the size of the stack, but not its absolute location. Instead, finalize adds a wasm import of STACKTOP, and JS sends it over, and then the import gets assigned to the stack pointer lld created (which is mutable, so that way we couldn't assign it directly). This PR changes that, to just hardcode an initial value for the stack pointer, avoiding the import.

Oh, are you saying, lld guesses the location of the stack (right after static allocations), and we can just leave it? Even so, that wouldn't be right, as emscripten can add more static allocations after lld (in the JS compiler). Even without that, it seems safest to have one location that decides this stuff (

But in this new world, isn't the stack pointer already set by lld to the correct value? Does emscripten put stuff on the stack at startup? If so, doesn't it already have a way of moving the stack pointer by calling stackSave/stackRestore? Perhaps this argument should be called "--initial-stack-pointer"? I'm happy with the PR going in either way as I think it's a step forward; I'm just trying to see possible improvements going forward.

lld's guess at the stack pointer location is incorrect due to static allocations from JS (see my last comment, may have been a race). Emscripten won't use the stack from JS, as the wasm controls it - it needs to call into wasm to get the stack pointer and modify it. So it can modify it after the wasm is ready and the program started up. Good idea to rename this to

My understanding is that the memory layout used by lld/wasm-backend is
I believe that emscripten adds global data after (3), which means it clobbers the end of the stack (since the stack grows down). So I believe the initial stack pointer set by lld (which grows down) will always be correct... but I'm not totally sure about all this right now. Happy for this change to land as is, and we can try to simplify and possibly remove this argument.

Yes, emscripten adds more stuff after 3. But it keeps the stack size fixed while doing so, so the initial stack location becomes higher - it pushes up the entire range reserved for the stack, top and bottom.
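
The point made in the last reply of this thread (the stack size stays fixed while later static allocations push the whole stack range upward) can be illustrated with a little arithmetic. The following is only a sketch: `StackRange`, `placeStack` and `shiftForLateStatics` are hypothetical names invented for illustration, not Binaryen or lld APIs, and alignment is ignored.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of the region reserved for the stack.
struct StackRange {
  uint64_t low;  // lowest address reserved for the stack
  uint64_t high; // initial stack pointer; the stack grows down from here
};

// Place a fixed-size stack directly after the static data (assumed layout).
StackRange placeStack(uint64_t staticEnd, uint64_t stackSize) {
  assert(staticEnd <= UINT64_MAX - stackSize); // guard against overflow
  return StackRange{staticEnd, staticEnd + stackSize};
}

// Model static allocations added after linking: both ends of the range
// move up by the same amount, so the stack size is unchanged.
StackRange shiftForLateStatics(StackRange range, uint64_t extraStatics) {
  assert(range.high <= UINT64_MAX - extraStatics); // guard against overflow
  return StackRange{range.low + extraStatics, range.high + extraStatics};
}
```

Under these assumptions, a 64 KiB stack placed after 2048 bytes of static data starts with a stack pointer of 67584. If 512 more bytes of static data are added afterwards, the stack pointer has to start at 68096 instead, which is why a value computed before those allocations would be too low.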
||
[&stackBase](Options*, const std::string&argument ) { | ||
stackBase = std::stoull(argument); | ||
}) | ||
|
||
.add("--input-source-map", "-ism", "Consume source map from the specified file", | ||
Options::Arguments::One, | ||
|
@@ -141,6 +149,12 @@ int main(int argc, const char *argv[]) { | |
uint32_t dataSize = 0; | ||
|
||
if (!isSideModule) { | ||
if (globalBase == INVALID_BASE) { | ||
Fatal() << "globalBase must be set"; | ||
} | ||
if (stackBase == INVALID_BASE) { | ||
Fatal() << "stackBase must be set"; | ||
} | ||
Export* dataEndExport = wasm.getExport("__data_end"); | ||
if (dataEndExport == nullptr) { | ||
Fatal() << "__data_end export not found"; | ||
|
@@ -189,7 +203,7 @@ int main(int argc, const char *argv[]) { | |
} else { | ||
generator.generateRuntimeFunctions(); | ||
generator.generateMemoryGrowthFunction(); | ||
generator.generateStackInitialization(); | ||
generator.generateStackInitialization(stackBase); | ||
// emscripten calls this by default for side libraries so we only need | ||
// to include in as a static ctor for main module case. | ||
if (wasm.getExportOrNull("__post_instantiate")) { | ||
|
This comment seems wrong. lld has already assigned all its globals before finalize is called.
I see we pass the global base to both wasm-emscripten-finalize and to lld - is lld doing it in both bitcode linking and wasm object files? If so I can remove the flag to finalize.
The only reason I can see that finalize needs to know this is that it wants to write "staticBump" to the metadata output about the binary. As long as we need staticBump, we will need to continue to pass this.
I see, thanks. OK, fixing the comment.
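
As a rough illustration of why finalize still needs the global base, the sketch below assumes that "staticBump" is the distance from the global base to the address exported as `__data_end`. The thread does not spell that out, so treat the formula as an assumption. `computeStaticBump` is a hypothetical helper, not a Binaryen function; it reuses the `INVALID_BASE` sentinel idea from the diff.

```cpp
#include <cstdint>

// Sentinel meaning "the option was never passed", as in the diff above.
const uint64_t INVALID_BASE = -1;

// Hypothetical helper: compute the size of the static data region.
// Returns false if the base was not set or the inputs are inconsistent.
bool computeStaticBump(uint64_t globalBase, uint64_t dataEnd, uint64_t& bump) {
  if (globalBase == INVALID_BASE || dataEnd < globalBase) {
    return false;
  }
  bump = dataEnd - globalBase; // assumed definition of staticBump
  return true;
}
```

With a global base of 1024 and a `__data_end` of 5000, this yields 3976. Leaving the base at the sentinel makes the helper fail rather than produce a wrapped-around size, which mirrors the `Fatal()` checks the PR adds.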