rustc_target: Add alignment to indirectly-passed by-value types, correcting the alignment of byval on x86 in the process. #103830

Closed
wants to merge 1 commit into from

Contributor

@pcwalton pcwalton commented Nov 1, 2022

Commit 88e4d2c from five years ago removed
support for alignment on indirectly-passed arguments because of problems with
the i686-pc-windows-msvc target. Unfortunately, the memcpy optimizations I
recently added to LLVM 16 depend on this to forward memcpys. This commit
attempts to fix the problems with byval parameters on that target and now
correctly adds the align attribute.

The problem is summarized in this comment by @eddyb. Briefly, 32-bit x86 has
special alignment rules for byval parameters: for the most part, their
alignment is forced to 4. This is not well-documented anywhere but in the Clang
source. I looked at the logic in Clang TargetInfo.cpp and tried to replicate
it here. The relevant methods in that file are
X86_32ABIInfo::getIndirectResult() and
X86_32ABIInfo::getTypeStackAlignInBytes(). The align parameter attribute
for byval parameters in LLVM must match the platform ABI, or miscompilations
will occur. Note that this doesn't use the approach suggested by eddyb, because
I felt it was overkill to store the alignment in on_stack when special
handling is really only needed for 32-bit x86.
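
To make the rule above concrete, here is a minimal sketch of how a `byval` alignment could be chosen on 32-bit x86. It is an illustration only, not the code in this PR or in Clang: `Os`, `MIN_STACK_ALIGN`, and `x86_32_byval_align` are hypothetical names, and the Darwin vector exception is an assumption standing in for the "for the most part" caveat. Other special cases are ignored.

```rust
/// Hypothetical OS split; only used to model one assumed exception.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Os {
    Darwin,
    Other,
}

/// Minimum stack slot alignment on 32-bit x86, per the description above.
pub const MIN_STACK_ALIGN: u64 = 4;

/// Simplified model of the alignment attached to a `byval` argument.
/// This is a sketch under stated assumptions, not rustc's implementation.
pub fn x86_32_byval_align(requested_align: u64, contains_simd_vector: bool, os: Os) -> u64 {
    // Types at or below the minimum simply use the 4-byte stack alignment.
    if requested_align <= MIN_STACK_ALIGN {
        return MIN_STACK_ALIGN;
    }
    // Assumed exception: on Darwin, aggregates that contain SIMD vectors
    // and request at least 16-byte alignment keep 16.
    if os == Os::Darwin && contains_simd_vector && requested_align >= 16 {
        return 16;
    }
    // Everything else is forced down to 4, whatever the type requested.
    MIN_STACK_ALIGN
}
```

Under this model a `#[repr(align(16))]` struct without vector fields still gets `align 4` on its `byval` parameter, which is the mismatch the `align` attribute has to reflect.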

As a side effect, this should fix #80127, because it will make the align
parameter attribute for byval parameters match the platform ABI on LLVM
x86-64.

@rustbot rustbot added the S-waiting-on-review and T-compiler labels Nov 1, 2022
@pcwalton
Contributor Author

pcwalton commented Nov 1, 2022

By the way, https://twitter.com/pcwalton/status/1587282342174871552 shows the effect of this change on the std::io::stdio::_print function.

@BGR360
Contributor

BGR360 commented Nov 1, 2022

I think this will have the unintended effect of fixing #80127. It is the exact same change that was proposed by #80822, which never went in (I think because someone found that a couple of codegen tests failed on Windows).

It is also the same change I made to our fork of rustc at work so that we could work around the problem of #80127. So anecdotally, this change is known to fix the problem.

I think we might not want to accidentally fix that bug though, because as @eddyb states here (paraphrasing): we should not be depending on this field in codegen for getting the correct ABI alignment, because it's intended only for optimizations. Something like that.

I just worry about having #80127 closed as fixed because of this change and then one day down the road somebody does something different with this field for optimization purposes, and having x86_64 ABI break again on stable.

Of course, @pcwalton, if you feel up to it, you could take a stab at actually fixing that bug in the "right way." @bjorn3 left some pretty good breadcrumbs in #80127. I also spent some time a while back trying to follow the breadcrumbs all the way home, and IIRC I got somewhat close to a working solution, but didn't have enough cycles to see it through. Would be happy to chat (zulip, zoom, whatever) to share what I can remember.

@pcwalton
Contributor Author

pcwalton commented Nov 2, 2022

Hmm, I see, the bug is still there. I think to be safe I'm going to move the check to be Windows only then.

@pcwalton
Contributor Author

pcwalton commented Nov 2, 2022

@BGR360 OK, I attempted to fix the problem correctly. This uses Clang's source code as a basis.

r? @eddyb

@rustbot rustbot assigned eddyb and unassigned oli-obk Nov 2, 2022
@pcwalton pcwalton changed the title from "rustc_target: Add back alignment to indirectly-passed by-value types." to "rustc_target: Add alignment to indirectly-passed by-value types, correcting the alignment of byval on x86 in the process." Nov 2, 2022
@@ -658,7 +662,8 @@ impl<'a, Ty> FnAbi<'a, Ty> {
    {
        if abi == spec::abi::Abi::X86Interrupt {
            if let Some(arg) = self.args.first_mut() {
                arg.make_indirect_byval();
                // FIXME(pcwalton): This probably should use the x86 `byval` ABI...
Member

@bjorn3 bjorn3 Nov 2, 2022

For extern "x86-interrupt" fn(_: InterruptStackFrame) the argument is passed as pointer without byval attribute AFAIK.

Contributor Author

Are you saying I should change make_indirect_byval() to make_indirect() here?

Contributor

LLVM requires this argument to be marked byval. You'll get a verifier error otherwise.

Contributor

@BGR360 BGR360 left a comment

Can you please add a regression test for the bug in #80127?

@pcwalton
Contributor Author

pcwalton commented Nov 2, 2022

> Can you please add a regression test for the bug in #80127?

Done.


extern "C" {
    // CHECK: declare void @f({{.*}}byval(%Foo) align 16{{.*}})
    fn f(foo: Foo);
Contributor

@BGR360 BGR360 Nov 2, 2022

You have to exceed a certain number of parameters for the original bug to surface. Enough to fill up the registers. I think the function signature in #80127 was the minimal repro; I remember trying to minimize it further and having not much luck.

I'm trying to use godbolt to figure it out but I'm having frustrating usability problems, and my personal device isn't x86_64 -_-
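
For readers unfamiliar with the "fill up the registers" point, here is a small sketch of the kind of signature being discussed. It is hypothetical and is not the signature from #80127: `Aligned16`, `many_args`, and `call_many_args` are made-up names. Both caller and callee are Rust here, so they always agree with each other and this cannot reproduce the original bug; an interop test would need the callee compiled by a C compiler.

```rust
/// Hypothetical 16-byte-aligned aggregate (32 bytes after padding).
#[repr(C, align(16))]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Aligned16 {
    pub a: u64,
    pub b: u64,
    pub c: u64,
}

/// On x86-64 System V, the first six integer arguments are passed in
/// registers, so `g` and the aggregate `s` are passed on the stack.
/// The stack slot for `s` is where the 16-byte alignment matters.
pub extern "C" fn many_args(
    a: u64,
    b: u64,
    c: u64,
    d: u64,
    e: u64,
    f: u64,
    g: u64,
    s: Aligned16,
) -> u64 {
    a + b + c + d + e + f + g + s.a + s.b + s.c
}

/// Calls `many_args` with fixed values so the result is easy to check.
pub fn call_many_args() -> u64 {
    let s = Aligned16 { a: 10, b: 20, c: 30 };
    many_args(1, 2, 3, 4, 5, 6, 7, s)
}
```

If the two sides disagreed on where `s` lives, the fields read by the callee would not be the values written by the caller, which is the symptom an interop test would look for.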

Contributor Author

I think testing for the align attribute directly is better, because the test case in #80127 might randomly start to succeed if future changes to regalloc cause things to accidentally align, no?

Contributor

@BGR360 BGR360 Nov 2, 2022

If it's a regression test it should never start to succeed, it could only ever start to fail right? I guess I'm confused by what you're saying.

Also, my memory is quite fuzzy, but I thought that part of the underlying cause of the bug was that our logic for choosing how to pass args on x86_64 was wrong for stack arguments (which would require your test to have enough arguments to get to that point I think). But judging by you saying "I think testing for the align attribute directly is better", is it just that we were never putting align on any of the byval args?

Contributor

Idk, I think I fundamentally don't trust a test that just checks the LLVM IR, because I don't understand it well enough for that to be convincing over a real integration test that proves that the C code can operate with the Rust code. But I recognize that that's probably quite hard to test in the rust repo. Perhaps someone else knows a way?

Contributor Author

> If it's a regression test it should never start to succeed, it could only ever start to fail right? I guess I'm confused by what you're saying.

I'm saying it could get broken accidentally but we wouldn't notice it because it could accidentally start to pass. I can add an interop test though.

Contributor

That would be awesome. Thank you for tackling this bug that has existed for so so long 🙏🙏

Contributor Author

Added the test.

@nikic
Contributor

nikic commented Nov 2, 2022

@bors try @rust-timer queue

@rust-timer
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf label Nov 2, 2022
@bors
Contributor

bors commented Nov 2, 2022

⌛ Trying commit 07547fc706294bdaafc9c7f3359e8feb1cb0fae0 with merge ccd8e3c14de53ce413a7d2649c12d558d226226d...

@bors
Contributor

bors commented Nov 3, 2022

☀️ Try build successful - checks-actions
Build commit: ccd8e3c14de53ce413a7d2649c12d558d226226d

@rust-timer
Collaborator

Queued ccd8e3c14de53ce413a7d2649c12d558d226226d with parent edf0182, future comparison URL.

@rust-timer
Collaborator

Finished benchmarking commit (ccd8e3c14de53ce413a7d2649c12d558d226226d): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.1% | [-2.7%, -1.6%] | 6 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.1% | [-0.1%, -0.1%] | 1 |
| Improvements ✅ (secondary) | -3.8% | [-4.2%, -3.3%] | 2 |
| All ❌✅ (primary) | -0.1% | [-0.1%, -0.1%] | 1 |

Cycles

This benchmark run did not return any relevant results for this metric.

@rustbot rustbot removed the S-waiting-on-perf label Nov 3, 2022
@pcwalton
Contributor Author

pcwalton commented Nov 3, 2022

Sweet! I was worried it would be a regression.

@pcwalton
Contributor Author

pcwalton commented Nov 4, 2022

This should be ready to go, I think. r? @oli-obk

@rustbot rustbot assigned oli-obk and unassigned eddyb Nov 4, 2022
jyn514 added a commit to jyn514/rust that referenced this pull request Dec 31, 2022
rustc_target: Add alignment to indirectly-passed by-value types, correcting the  alignment of `byval` on x86 in the process.

@bors
Contributor

bors commented Jan 12, 2023

🔒 Merge conflict

This pull request and the master branch diverged in a way that cannot be automatically merged. Please rebase on top of the latest master branch, and let the reviewer approve again.

How do I rebase?

Assuming self is your fork and upstream is this repository, you can resolve the conflict following these steps:

  1. git checkout pointee-align (switch to your branch)
  2. git fetch upstream master (retrieve the latest master)
  3. git rebase upstream/master -p (rebase on top of it)
  4. Follow the on-screen instruction to resolve conflicts (check git status if you got lost).
  5. git push self pointee-align --force-with-lease (update this PR)

You may also read Git Rebasing to Resolve Conflicts by Drew Blessing for a short tutorial.

Please avoid the "Resolve conflicts" button on GitHub. It uses git merge instead of git rebase which makes the PR commit history more difficult to read.

Sometimes step 4 will complete without asking for resolution. This is usually due to difference between how Cargo.lock conflict is handled during merge and rebase. This is normal, and you should still perform step 5 to update this PR.

Error message
CONFLICT (file location): src/test/run-make-fulldeps/extern-fn-explicit-align/test.rs added in heads/homu-tmp inside a directory that was renamed in HEAD, suggesting it should perhaps be moved to tests/run-make-fulldeps/extern-fn-explicit-align/test.rs.
CONFLICT (file location): src/test/run-make-fulldeps/extern-fn-explicit-align/test.c added in heads/homu-tmp inside a directory that was renamed in HEAD, suggesting it should perhaps be moved to tests/run-make-fulldeps/extern-fn-explicit-align/test.c.
CONFLICT (file location): src/test/run-make-fulldeps/extern-fn-explicit-align/Makefile added in heads/homu-tmp inside a directory that was renamed in HEAD, suggesting it should perhaps be moved to tests/run-make-fulldeps/extern-fn-explicit-align/Makefile.
CONFLICT (file location): src/test/codegen/align-byval.rs added in heads/homu-tmp inside a directory that was renamed in HEAD, suggesting it should perhaps be moved to tests/codegen/align-byval.rs.
CONFLICT (modify/delete): src/test/codegen/function-arguments.rs deleted in HEAD and modified in heads/homu-tmp. Version heads/homu-tmp of src/test/codegen/function-arguments.rs left in tree.
Auto-merging compiler/rustc_target/src/abi/call/mod.rs
warning: inexact rename detection was skipped due to too many files.
warning: you may want to set your merge.renamelimit variable to at least 27566 and retry the command.
Automatic merge failed; fix conflicts and then commit the result.

// ignore-spir
// ignore-spir64
// ignore-kalimba
// ignore-shave
Contributor

This looks really fragile. I'd suggest to implement this in the same way as https://github.com/rust-lang/rust/blob/master/tests/codegen/abi-repr-ext.rs, i.e. a no_core test explicitly specifying targets. That should also allow you to test a non-windows target for the issue below.

@nikic
Contributor

nikic commented Jan 22, 2023

@bors r- Got back into the queue due to bors sync probably

bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 13, 2023
Improve the `array::map` codegen

The `map` method on arrays [is documented as sometimes performing poorly](https://doc.rust-lang.org/std/primitive.array.html#note-on-performance-and-stack-usage), and after [a question on URLO](https://users.rust-lang.org/t/try-trait-residual-o-trait-and-try-collect-into-array/88510?u=scottmcm) prompted me to take another look at the core [`try_collect_into_array`](https://github.com/rust-lang/rust/blob/7c46fb2111936ad21a8e3aa41e9128752357f5d8/library/core/src/array/mod.rs#L865-L912) function, I had some ideas that ended up working better than I'd expected.

There are three main ideas in here, split over three commits:
1. Don't use `array::IntoIter` when we can avoid it, since that seems to not get SRoA'd, meaning that every step writes things like loop counters into the stack unnecessarily
2. Don't return arrays in `Result`s unnecessarily, as that doesn't seem to optimize away even with `unwrap_unchecked` (perhaps because it needs to get moved into a new LLVM type to account for the discriminant)
3. Don't distract LLVM with all the `Option` dances when we know for sure we have enough items (like in `map` and `zip`).  This one's a larger commit as to do it I ended up adding a new `pub(crate)` trait, but hopefully those changes are still straight-forward.

(No libs-api changes; everything should be completely implementation-detail-internal.)

It's still not completely fixed -- I think it needs pcwalton's `memcpy` optimizations still (rust-lang#103830) to get further -- but this seems to go much better than before.  And the remaining `memcpy`s are just `transmute`-equivalent (`[T; N] -> ManuallyDrop<[T; N]>` and `[MaybeUninit<T>; N] -> [T; N]`), so hopefully those will be easier to remove with LLVM16 than the previous subobject copies 🤞
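
As a side illustration of the first idea (keeping `array::IntoIter` out of the picture), the same element-wise computation can be written by index. This is only a sketch of an alternative formulation, not what this PR implements, and it makes no claim about the resulting codegen; `map_by_index` is a hypothetical name, and wrapping arithmetic is used so the sketch cannot panic on overflow.

```rust
/// Element-wise `13 * x + 7` over a fixed-size array, built by index with
/// `std::array::from_fn` instead of calling `map` on the array.
pub fn map_by_index(x: [u32; 64]) -> [u32; 64] {
    std::array::from_fn(|i| x[i].wrapping_mul(13).wrapping_add(7))
}
```

For inputs small enough not to overflow, this returns the same array as `x.map(|x| 13 * x + 7)`.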

r? `@thomcc`

As a simple example, this test
```rust
pub fn long_integer_map(x: [u32; 64]) -> [u32; 64] {
    x.map(|x| 13 * x + 7)
}
```
On nightly <https://rust.godbolt.org/z/xK7548TGj> takes `sub rsp, 808`
```llvm
start:
  %array.i.i.i.i = alloca [64 x i32], align 4
  %_3.sroa.5.i.i.i = alloca [65 x i32], align 4
  %_5.i = alloca %"core::iter::adapters::map::Map<core::array::iter::IntoIter<u32, 64>, [closure@/app/example.rs:2:11: 2:14]>", align 8
```
(and yes, that's a 6**5**-element array `alloca` despite 6**4**-element input and output)

But with this PR it's only `sub rsp, 520`
```llvm
start:
  %array.i.i.i.i.i.i = alloca [64 x i32], align 4
  %array1.i.i.i = alloca %"core::mem::manually_drop::ManuallyDrop<[u32; 64]>", align 4
```

Similarly, the loop it emits on nightly is scalar-only and horrifying
```nasm
.LBB0_1:
        mov     esi, 64
        mov     edi, 0
        cmp     rdx, 64
        je      .LBB0_3
        lea     rsi, [rdx + 1]
        mov     qword ptr [rsp + 784], rsi
        mov     r8d, dword ptr [rsp + 4*rdx + 528]
        mov     edi, 1
        lea     edx, [r8 + 2*r8]
        lea     r8d, [r8 + 4*rdx]
        add     r8d, 7
.LBB0_3:
        test    edi, edi
        je      .LBB0_11
        mov     dword ptr [rsp + 4*rcx + 272], r8d
        cmp     rsi, 64
        jne     .LBB0_6
        xor     r8d, r8d
        mov     edx, 64
        test    r8d, r8d
        jne     .LBB0_8
        jmp     .LBB0_11
.LBB0_6:
        lea     rdx, [rsi + 1]
        mov     qword ptr [rsp + 784], rdx
        mov     edi, dword ptr [rsp + 4*rsi + 528]
        mov     r8d, 1
        lea     esi, [rdi + 2*rdi]
        lea     edi, [rdi + 4*rsi]
        add     edi, 7
        test    r8d, r8d
        je      .LBB0_11
.LBB0_8:
        mov     dword ptr [rsp + 4*rcx + 276], edi
        add     rcx, 2
        cmp     rcx, 64
        jne     .LBB0_1
```

whereas with this PR it's unrolled and vectorized
```nasm
	vpmulld	ymm1, ymm0, ymmword ptr [rsp + 64]
	vpaddd	ymm1, ymm1, ymm2
	vmovdqu	ymmword ptr [rsp + 328], ymm1
	vpmulld	ymm1, ymm0, ymmword ptr [rsp + 96]
	vpaddd	ymm1, ymm1, ymm2
	vmovdqu	ymmword ptr [rsp + 360], ymm1
```
(though sadly still stack-to-stack)
@pnkfelix
Member

pnkfelix commented Mar 3, 2023

Visiting as part of looking at #80127 for 2023 Q1 P-high triage.

@pcwalton, do you think you'll have time soon to incorporate the feedback from @wesleywiser and @nikic? If not, maybe one of us can take care of those modifications...

@Dylan-DPC
Member

@pcwalton any updates on this? thanks

@nikic
Contributor

nikic commented May 17, 2023

Superseded by #111551.

@nikic nikic closed this May 17, 2023
bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 15, 2023
Resurrect: rustc_target: Add alignment to indirectly-passed by-value types, correcting the alignment of byval on x86 in the process.

Same as rust-lang#111551, which I [accidentally closed](rust-lang#111551 (comment)) :/

---

This resurrects PR rust-lang#103830, which has sat idle for a while.

Beyond rust-lang#103830, this also:
- fixes byval alignment for types containing vectors on Darwin (see `tests/codegen/align-byval-vector.rs`)
- fixes byval alignment for overaligned types on x86 Windows (see `tests/codegen/align-byval.rs`)
- fixes ABI for types with 128-bit requested alignment on ARM64 Linux (see `tests/codegen/aarch64-struct-align-128.rs`)
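
For context on what "overaligned" and "requested alignment" mean in the list above, the sketch below compares a plain `repr(C)` struct with one that requests 16-byte alignment. The type names `Plain` and `OverAligned` and the helper `layouts` are hypothetical and are not taken from the referenced tests.

```rust
use std::mem::{align_of, size_of};

/// Three `i32` fields: natural alignment is that of `i32`.
#[repr(C)]
pub struct Plain {
    pub a: i32,
    pub b: i32,
    pub c: i32,
}

/// Same fields, but the type requests 16-byte alignment, which is more
/// than any of its fields needs. This is what "overaligned" refers to.
#[repr(C, align(16))]
pub struct OverAligned {
    pub a: i32,
    pub b: i32,
    pub c: i32,
}

/// Returns `(size, align)` for `Plain` and `OverAligned`, in that order.
pub fn layouts() -> [(usize, usize); 2] {
    [
        (size_of::<Plain>(), align_of::<Plain>()),
        (size_of::<OverAligned>(), align_of::<OverAligned>()),
    ]
}
```

The in-memory alignment reported here is what the type asks for; the alignment a platform ABI uses when the value is passed `byval` can differ, which is the gap these fixes address.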

r? `@nikic`

---

`@pcwalton`'s original PR description is reproduced below:

Commit 88e4d2c from five years ago removed
support for alignment on indirectly-passed arguments because of problems with
the `i686-pc-windows-msvc` target. Unfortunately, the `memcpy` optimizations I
recently added to LLVM 16 depend on this to forward `memcpy`s. This commit
attempts to fix the problems with `byval` parameters on that target and now
correctly adds the `align` attribute.

The problem is summarized in [this comment] by `@eddyb`. Briefly, 32-bit x86 has
special alignment rules for `byval` parameters: for the most part, their
alignment is forced to 4. This is not well-documented anywhere but in the Clang
source. I looked at the logic in Clang `TargetInfo.cpp` and tried to replicate
it here. The relevant methods in that file are
`X86_32ABIInfo::getIndirectResult()` and
`X86_32ABIInfo::getTypeStackAlignInBytes()`. The `align` parameter attribute
for `byval` parameters in LLVM must match the platform ABI, or miscompilations
will occur. Note that this doesn't use the approach suggested by eddyb, because
I felt it was overkill to store the alignment in `on_stack` when special
handling is really only needed for 32-bit x86.

As a side effect, this should fix rust-lang#80127, because it will make the `align`
parameter attribute for `byval` parameters match the platform ABI on LLVM
x86-64.

[this comment]: rust-lang#80822 (comment)
bjorn3 pushed a commit to bjorn3/rust that referenced this pull request Jul 22, 2023
Resurrect: rustc_target: Add alignment to indirectly-passed by-value types, correcting the alignment of byval on x86 in the process.

Labels
A-testsuite Area: The testsuite used to check the correctness of rustc S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue.
Development

Successfully merging this pull request may close these issues.

rustc has wrong signature for C function with 16-byte aligned stack argument in x86_64 Linux