-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Windows fprs preservation #1216
Windows fprs preservation #1216
Conversation
; nextln: store notrap aligned v8, v7 | ||
; nextln: store notrap aligned v9, v7+16 | ||
; check: v10 = stack_addr.i64 ss0 | ||
; nextln: v11 = load.f64 notrap aligned v10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a comment in the original PR referencing a line I couldn't get to test properly here, but would make me happier these tests being robust in the face of the future.
The exact text doesn't match anymore but the same idea applies. I would like to test that preserves and restores have the right registers. In theory a nextln
like
; nextln: [Mp2fld#710,%xmm6] v11 = load.f64 notrap aligned v10
would do the trick, but this yields a test error like
> [RexOp1spaddr8_id#808d,%rbp] v10 = stack_addr.i64 ss0
^~~~~~~~~~~~~~~~~~~~~~~~
Matched #10: \bv10 = stack_addr\.i64 ss0\b
> [Mp2fld#710,%xmm6] v11 = load.f64 notrap aligned v10
Missed #11: \[Mp2fld\#710,%xmm6\] v11 = load\.f64 notrap aligned v10\b
which is suspicious because the only difference on the matching lines is whitespace. When whitespace is equalized, this still yields an error. We use specific checks like this in other filetests also run as test compile
, so I suspect this is a bug in the filetest runner?
I would greatly appreciate thoughts if anyone is particularly familiar with filetest
.
…solved (bytecodealliance#1224)" This reverts commit 589fa95.
fbd067a
to
b55a43d
Compare
The CI failure is due to tripping this check that we wouldn't misreport unwind information in cases where we have callee-save floating-point registers. I tried reproducing this locally by building the module in Anyway, I printed the faulting prologue in CI to double-check that the panic is correctly hit, and indeed a function in the module uses well over the 240-byte limit I've noted above:
Since this function uses callee-save floats, we miscompile it on Windows today. Should I disable this test on Windows? (not sure if there's a mechanism for this in spectests, honestly) It would be fixed in the same PR that adjusts stack layout to not be constrained to 240 byte frames when using callee-save xmm registers. Edit: this panic is only reached if you want a fastcall function with all of:
This PR fixes callee-save float preservation in fastcall functions generally. Asking for unwind information causes the panic. I'm not sure how easily Windows targets can get the former without the latter. |
Nudging this again as I think this is as good as this fix can get without getting into the weeds on stack layout details. I have three(ish) issues to open assuming a 👍 :
|
Subscribe to Label ActionThis issue or pull request has been labeled: "cranelift" Users Subscribed to "cranelift"To subscribe or unsubscribe from this label, edit the |
Pardon the rebase, just want to get this on a more recent |
@iximeow I think you forgot to push :) |
5c0e249
to
e71b7fe
Compare
@bjorn3 even worse, I'd pushed to the wrong remote. Thanks for the heads up! |
It looks like there's at least a few more tests to disable due to the frame size limitation this adds to functions using Windows x64 calling convention. |
it passes 🎉 GitHub appears to believe that |
@iximeow I'll review this soon and if everyone signs off, I think we should merge this before my unwind refactoring changes. I can easily update that PR to integrate this PR's changes to save the FPRs in the windows unwind info. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me 👍, even with the introduced stack size limit on Windows.
Just a few comments that aren't particularly blocking.
We probably want @bjorn3 or another core cranelift maintainer to sign off as well.
| ("multi_value", "func") => { | ||
// FIXME These involves functions with very large stack frames that Cranelift currently | ||
// cannot compile using the fastcall (Windows) calling convention. | ||
// See https://github.com/bytecodealliance/wasmtime/pull/1216. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we create a new issue to track the stack layout for CSRs problem and link here instead of the PR? My concern is only that it's hard to make out what's relevant of a complex PR like this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely. That's the third item in my to-do list which I've been waiting for even a preliminary ✔️ to actually file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a reminder to update this to https://github.com/bytecodealliance/wasmtime/issues/1475
.
I am actually not a maintainer. :) |
😱 my apologies, although your review is obviously always appreciated :) I'm not actually a core cranelift maintainer either, so I'm more comfortable when others sign off on bits I haven't been directly involved in. |
You are not the first to think I am a maintainer of a project. :) I often review PR's on projects I am (slightly) interested in. And sometimes the PR author assumes I am a maintainer. |
I'll ping @bnjbvr for the core maintainer sign-off, then. |
another temporary commit. Cranelift::spec::multi_value::func panics but this is not reproducible using clif-util targeting windows?
This reverts commit 29c6c40.
ExplicitSlot is noted as intended for use with stack_load and stack_store, which are morally equivalent to FPR save/restore
87b7017
to
4e1b10e
Compare
I had to force push to rebase and fix a merge conflict that crept in. Unwind-producing functions are |
Co-Authored-By: bjorn3 <bjorn3@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 on the result return change.
🎉 thanks for fixing this @iximeow! 🎉 I realize it was a slog of a PR. Now I get to merge these changes in to my refactoring PR and we'll see about prioritizing the fixing of the stack layout of the register saves 😄 |
Awesome, thanks for getting this done! |
This reverts commit 4cca510.
Moved from bytecodealliance/cranelift#1378.
If you don't know who could review this, please indicate so and/or ping
bnjbvr
. The list of suggested reviewers on the right can help you.There were a few unresolved conversations from the original PR that I'll copy below.