Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recursion using extern "C" causes miscompilation #114312

Closed
AngelicosPhosphoros opened this issue Jul 31, 2023 · 19 comments · Fixed by #114847
Closed

Recursion using extern "C" causes miscompilation #114312

AngelicosPhosphoros opened this issue Jul 31, 2023 · 19 comments · Fixed by #114847
Assignees
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness P-critical Critical priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Milestone

Comments

@AngelicosPhosphoros
Copy link
Contributor

AngelicosPhosphoros commented Jul 31, 2023

I tried this code:

pub enum Expr {
    Lit(isize),
    Sum(isize, isize),
    Sub(isize, isize),
}

pub fn eval_slow(expr: Expr) -> isize {
    use Expr::*;
    extern "C" fn eval_inner(expr: Expr)->isize{
        match expr {
            Lit(x) => x,
            Sum(x, y) => x + y,
            Sub(x, y) => eval_inner(Sum(x, -y)),
        }
    }
    eval_inner(expr)
}

I expected to see this happen: compiled function correctly handles Sub(x, y)

Instead, this happened: compiler generates code with infinite loop.

See asm (from -Copt-level=3):

example::eval_slow:
        mov     rax, qword ptr [rdi]
        test    rax, rax
        je      .LBB0_3
        cmp     eax, 1
        jne     .LBB0_2
        mov     rax, qword ptr [rdi + 16]
        add     rax, qword ptr [rdi + 8]
        ret
.LBB0_2:                            ;  <--------- Infinite loop header
        jmp     .LBB0_2          ;  <--------- Unconditional jump to previous instruction
.LBB0_3:
        mov     rax, qword ptr [rdi + 8]
        ret

Meta

rustc --version --verbose:

rustc 1.71.0 (8ede3aae2 2023-07-12)
binary: rustc
commit-hash: 8ede3aae28fe6e4d52b38157d7bfe0d3bceef225
commit-date: 2023-07-12
host: x86_64-unknown-linux-gnu
release: 1.71.0
LLVM version: 16.0.5
Compiler returned: 0
@AngelicosPhosphoros AngelicosPhosphoros added the C-bug Category: This is a bug. label Jul 31, 2023
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Jul 31, 2023
@AngelicosPhosphoros AngelicosPhosphoros changed the title Recursion using extern "C" caused miscompilation Recursion using extern "C" causes miscompilation Jul 31, 2023
@AngelicosPhosphoros
Copy link
Contributor Author

It seems, that bug was introduced between 1.69 and 1.70:

godbolt

@compiler-errors compiler-errors added the E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc label Jul 31, 2023
@asquared31415
Copy link
Contributor

I am not particularly well versed in miscompilations, but the IR we emit to LLVM looks fine. Investigating more later, but I suspect an LLVM update affects this.

@lukas-code
Copy link
Member

reduced LLVM IR: https://godbolt.org/z/Yncqrjfr7

define void @foo(ptr noalias byval(i64) %x) {
start:
  %new_x = alloca i64, align 8
  %x_val = load i64, ptr %x, align 8
  %is_zero = icmp eq i64 %x_val, 0
  br i1 %is_zero, label %end, label %recurse

recurse:
  store i64 0, ptr %new_x, align 8
  call void @foo(ptr %new_x)
  br label %end

end:
  ret void
}

optimizes to

define void @foo(ptr noalias nocapture readonly byval(i64) %x) {
  %x_val = load i64, ptr %x, align 8
  %is_zero = icmp eq i64 %x_val, 0
  br i1 %is_zero, label %end, label %recurse

recurse:
  br label %recurse

end:
  ret void
}

Needs noalias + byval to reproduce. Looks like the tail call optimization pass writes into the argument, even thou it is readonly at that point.

Did not bisect, but this was probably introduced by the upgrade to LLVM 16: #109474

@rustbot label A-llvm I-unsound

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Jul 31, 2023
@asquared31415
Copy link
Contributor

If anyone's interested in a Rust level test, it can be reduced a decent amount:

#[repr(C)]
pub enum Expr {
    Sum,
    // must have more than usize data
    Sub(usize, u8),
}

pub extern "C" fn eval_inner(expr: Expr) {
    match expr {
        Expr::Sum => {}
        Expr::Sub(_, _) => eval_inner(Expr::Sum),
    }
}

godbolt

looking at that LLVM IR it could probably be reduced more, but really at this point there's not a whole lot left to remove.

@erikdesjardins
Copy link
Contributor

Yeah, it looks like tailcallelim introducing a write to a readonly parameter is the culprit.

Strangely, Alive2 is okay with writing to a readonly byval argument, and only thinks it's UB if it's non-byval: https://alive2.llvm.org/ce/z/T6yxU-.

This makes some sense semantically (since the hidden copy of the argument means modifications aren't visible to the caller), but seems to be in contradiction with the langref, which says:

The copy is considered to belong to the caller not the callee (for example, readonly functions should not write to byval parameters).

Opened llvm/llvm-project#64289

@Noratrieb Noratrieb added P-critical Critical priority and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Aug 1, 2023
@lqd
Copy link
Member

lqd commented Aug 1, 2023

Bisected to upgrade to LLVM16 #109474
Seems to also repro on the most recent LLVM17 try build from #114048: 8030d71a95a3ea79f5fc95232c32f9b78effb92d

@lqd lqd removed the E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc label Aug 1, 2023
@DianQK
Copy link
Member

DianQK commented Aug 1, 2023

Bisected: llvm/llvm-project@01859da.
Here are some possibly relevant discussions: https://reviews.llvm.org/D136659.

@DianQK
Copy link
Member

DianQK commented Aug 8, 2023

@rustbot claim

@cuviper
Copy link
Member

cuviper commented Aug 9, 2023

@rustbot label +T-compiler +I-compiler-nominated

Nominating for compiler discussion, because @DianQK is proposing the LLVM fix for backport to our 16 branch in rust-lang/llvm-project#150, in hopes of getting this into 1.72-beta. We don't really have a policy mechanism for this since master has already moved to LLVM 17, but we could accept that LLVM 16 backport in our fork and then update the LLVM 16 submodule directly in beta.

(That would also pull in #114105.)

@rustbot rustbot added I-compiler-nominated Nominated for discussion during a compiler team meeting. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 9, 2023
@DianQK
Copy link
Member

DianQK commented Aug 9, 2023

I think this backport was necessary. It may have led to other undetected end-to-end miscompilation.

(That would also pull in #114105.)

Something unexpected seems to have happened. Should we create a new rustc branch of llvm-project every time we create a beta branch?

@DianQK
Copy link
Member

DianQK commented Aug 9, 2023

@rustbot label +regression-from-stable-to-stable

@rustbot rustbot added the regression-from-stable-to-stable Performance or correctness regression from one stable version to another. label Aug 9, 2023
@cuviper
Copy link
Member

cuviper commented Aug 9, 2023

(That would also pull in #114105.)

Something unexpected seems to have happened. Should we create a new rustc branch of llvm-project every time we create a beta branch?

It doesn't come up often, but we always have the option of adding a branch if needed. I think this particular case is innocuous enough to allow, but I didn't want it to sneak in without notice.

@DianQK
Copy link
Member

DianQK commented Aug 9, 2023

(That would also pull in #114105.)

Something unexpected seems to have happened. Should we create a new rustc branch of llvm-project every time we create a beta branch?

It doesn't come up often, but we always have the option of adding a branch if needed. I think this particular case is innocuous enough to allow, but I didn't want it to sneak in without notice.

Okay, I can reopen a new PR for the new branch if needed.

@wesleywiser
Copy link
Member

Nominating for compiler discussion, because @DianQK is proposing the LLVM fix for backport to our 16 branch in rust-lang/llvm-project#150, in hopes of getting this into 1.72-beta. We don't really have a policy mechanism for this since master has already moved to LLVM 17, but we could accept that LLVM 16 backport in our fork and then update the LLVM 16 submodule directly in beta.

(That would also pull in #114105.)

Thanks for nominating @cuviper! We discussed this during the compiler triage meeting today and think that accepting a beta-backport of rust-lang/llvm-project#150 would be best to mitigate this miscompilation. We would prefer not to take #114105 at the same time and would rather let that ride the normal release train if that's possible?

@apiraino apiraino removed the I-compiler-nominated Nominated for discussion during a compiler team meeting. label Aug 10, 2023
@DianQK
Copy link
Member

DianQK commented Aug 10, 2023

Since the master branch has been switched to LLVM 17. Possibly revert rust-lang/llvm-project#148 at branch rustc/16.0-2023-06-05 is also fine?

@cuviper
Copy link
Member

cuviper commented Aug 10, 2023

We would prefer not to take #114105 at the same time and would rather let that ride the normal release train if that's possible?

In a way, that PR's backport got orphaned because its version of the submodule was derailed from the release train, replaced by our LLVM 17 update. But the original change it wanted is part of 17, so it will ride the train that way.

Possibly revert rust-lang/llvm-project#148 at branch rustc/16.0-2023-06-05 is also fine?

Yes, I think that sounds fine. Please go ahead and add that revert to rust-lang/llvm-project#150, noting these reasons in the revert commit message.

When we apply the submodule update to beta, it should ideally come with a rust codegen test, and we'll also want that test to land on master when LLVM 17 is fixed. (And don't close this issue until then!)

@cuviper cuviper added this to the 1.72.0 milestone Aug 10, 2023
@DianQK
Copy link
Member

DianQK commented Aug 10, 2023

When we apply the submodule update to beta, it should ideally come with a rust codegen test, and we'll also want that test to land on master when LLVM 17 is fixed. (And don't close this issue until then!)

llvm/llvm-project-release-prs#556.
This patch has been backported to LLVM 17.
I will pick the test from #114312 (comment).

@wesleywiser
Copy link
Member

Awesome, thank you both @cuviper and @DianQK! 🥇

bors added a commit to rust-lang-ci/rust that referenced this issue Aug 12, 2023
[beta] Update LLVM to resolve a miscompilation found in 114312.

Related issue: rust-lang#114312 .

After the master updates the LLVM, we will add the same test cases. In the meantime, close the issue.
@cuviper
Copy link
Member

cuviper commented Aug 14, 2023

1.72-beta's LLVM 16 was fixed in #114726, so I'm moving the milestone to 1.73 until we update and test LLVM 17.

@cuviper cuviper modified the milestones: 1.72.0, 1.73.0 Aug 14, 2023
nikic pushed a commit to nikic/rust that referenced this issue Aug 15, 2023
@bors bors closed this as completed in 2bc7929 Aug 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness P-critical Critical priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.