panic code is not deterministically lto'ed #52044

Closed
glandium opened this issue Jul 4, 2018 · 34 comments
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@glandium
Contributor

glandium commented Jul 4, 2018

This is something I noticed when comparing two builds of Firefox on automation. Both builds were done with the same flags, same toolchains, same paths, same everything, and without any sort of caching. And yet, they differed in exactly one way: the contents of std::panicking::default_hook.

See https://taskcluster-artifacts.net/IWpRV77rTgmt-9RKwTzJiA/0/public/diff.html

I tried multiple times, and there seem to be only two variants of the generated code. That is also the only code, among everything Rust generates in the Firefox build, that differs.

It turns out this is reproducible at a much smaller scale:

$ mkdir -p foo/src; cd foo
$ cat > Cargo.toml <<EOF
[package]
name = "foo"
version = "0.1.0"

[lib]
crate-type = ["staticlib"]

[profile.release]
opt-level = 2
rpath = false
debug-assertions = false
panic = "abort"
codegen-units = 1
lto = true
EOF
$ cat > src/lib.rs <<EOF
#[no_mangle]
pub extern "C" fn foo() {
    panic!("foo");
}
EOF
$ cargo build --release
$ mv target/release/libfoo.a .
$ rm -rf target
$ cargo build --release
$ diff target/release/libfoo.a .
Binary files target/release/libfoo.a and ./libfoo.a differ

It can take a few attempts repeating the last three commands before differences show up, because there are only two variants, and you may end up getting the same variant multiple times in a row.
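
One way to check the "only two variants" observation is to group the artifacts from repeated builds by content. The following is a minimal sketch in Rust, assuming the bytes of each build's libfoo.a have already been read into memory; `count_variants` is a hypothetical helper, not part of cargo or rustc:

```rust
use std::collections::BTreeMap;

/// Group build artifacts by their exact byte content and return how many
/// builds produced each distinct variant, largest group first.
/// (Hypothetical helper, for illustration only.)
fn count_variants(artifacts: &[Vec<u8>]) -> Vec<usize> {
    // A BTreeMap iterates in key order, so the helper itself does not
    // depend on hashing seeds or allocation addresses.
    let mut groups: BTreeMap<&[u8], usize> = BTreeMap::new();
    for bytes in artifacts {
        *groups.entry(bytes.as_slice()).or_insert(0) += 1;
    }
    let mut counts: Vec<usize> = groups.into_values().collect();
    counts.sort_unstable_by(|a, b| b.cmp(a));
    counts
}
```

If the observation holds, feeding it the archives from many clean builds should give a result with two entries.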

Cc: @alexcrichton @nikomatsakis

@alexcrichton
Member

Oh dear now this is a little terrifying. The following script also reproduces the error here and is a bit smaller in that it only uses rustc. This script generates only the IR and object file for the compilation at hand:

#!/bin/bash

set -ex

input="
#[no_mangle]
pub extern fn foo() {
    panic!(\"foo\");
}
"

rustc="rustc /dev/stdin -O -C lto -C panic=abort -C codegen-units=1"
rustc="$rustc --emit llvm-ir,obj --crate-type staticlib"

for i in $(seq 1 100); do
  rm -rf a b
  mkdir a b
  # $rustc is left unquoted on purpose so it splits into command and arguments
  $rustc --out-dir a <<< "$input"
  $rustc --out-dir b <<< "$input"
  a=$(md5sum a/stdin.ll | awk '{print $1}')
  b=$(md5sum b/stdin.ll | awk '{print $1}')
  if [ "$a" != "$b" ]; then
    echo IR is different
    exit 1
  fi
  a=$(md5sum a/stdin.o | awk '{print $1}')
  b=$(md5sum b/stdin.o | awk '{print $1}')
  if [ "$a" != "$b" ]; then
    echo object is different
    exit 1
  fi
done

The scary part here is that the IR is the same when the object files are different. This suggests that LLVM's code generator is somehow being non-deterministic. I can't reproduce this with llc, LLVM's standalone code generator, either.

Somehow this means that the in-process state of LLVM is nondeterministic in just the right way between compilations that it affects the final output file. As to how... that's a bit of a mystery.

At least this isn't an obvious determinism bug in rustc: the IR is the same, so we're feeding the same input to LLVM both times. It's seemingly something else that's going awry!

@alexcrichton
Member

cc @michaelwoerister

alexcrichton added the A-LLVM, A-codegen, and T-compiler labels Jul 4, 2018

@Mark-Simulacrum
Member

The trace diff for rustc_codegen_llvm is non-empty, but on the surface it appears to be just pointers changing addresses and hashes being different (still odd, but at least reasonable): https://gist.github.com/Mark-Simulacrum/5d04c7ad104aa8936d45c4ac71f301f3.

Trace diff for the whole compilation (RUST_LOG=trace) is here: https://gist.github.com/Mark-Simulacrum/a1dcc978a6b65cd48307ad9407c60cf9. This shows that there is certainly non-determinism inside rustc, but also does not look like there's anything obvious here.

This seems likely to have something to do with recent changes to panic_implementation though IIRC they didn't touch codegen directly. However, perhaps we're seeing some side effect of the specific code being generated -- the BoxMeUp and similar code is relatively low-level for Rust.

I cannot reproduce with a minimal no_std program within the ~100 iterations:

#![no_std]
#![feature(panic_implementation)]

use core::panic::PanicInfo;

#[panic_implementation]
fn rust_begin_panic(_info: &PanicInfo) -> ! {
   loop {}
}

#[no_mangle]
pub extern fn foo() {
    panic!("foo");
}

@glandium
Contributor Author

glandium commented Jul 4, 2018

FWIW, this doesn't reproduce with 1.25, which was the first version with LLVM 6.

@Mark-Simulacrum
Member

I'll attempt a bisection tomorrow.

@glandium
Contributor Author

glandium commented Jul 4, 2018

This reproduces with nightly-2018-06-04, which is the first one with #50338 (panic_implementation) merged. But it also reproduces with nightly-2018-06-03, from before that merge.

@alexcrichton
Member

A local bisection points to a rollup, #49051, as the culprit. #48892 seems the most likely but even that seems like it'd be unrelated...

@glandium
Contributor Author

glandium commented Jul 4, 2018

Both the merge of #49051 and the parent merge fail for me, so it seems your bisection went wrong.

@alexcrichton
Member

The merge of #49051, 36b6687, reproduces via the script I wrote above for me locally. The previous merge of #47813, 3926453, did not reproduce via the same script after 100 executions of trying to get a different hash. Additionally using 3926453 I am unable to reproduce via the example you gave in the OP.

How are you testing 3926453? I'm using rustup-toolchain-install-master to download the prebuilt binaries and execute locally. Did you build from scratch?

To hopefully weed out weird build issues as the problem I've tested the next previous merge as well of #48138, ff2d506, but it also isn't reproducing the bug here.

@glandium
Contributor Author

glandium commented Jul 4, 2018

I built from scratch and tested both 36b6687 and 3926453, which both reproduced.

@glandium
Contributor Author

glandium commented Jul 4, 2018

Local bisection with builds from scratch points me to the upgrade to LLVM 6, quite consistently. Has the compiler used on automation to build C/C++ code changed later?

@glandium
Contributor Author

glandium commented Jul 4, 2018

Mmmm, it fails both when llvm is built with GCC 5.4 (which is what I was originally bisecting with) and with GCC 4.8 (which seems to be what rust automation used for some old builds I looked at the logs for). I wonder if it makes a difference that this is on Ubuntu, where GCC enables some features by default that upstream GCC doesn't.

@michaelwoerister
Member

At least some of our CI builds switched to Clang 6.0 about two months ago.

@glandium
Contributor Author

glandium commented Jul 4, 2018

I was actually able to reproduce with llc-6.0 from Debian (llvm-6.0 package version 1:6.0.1-2), using the LLVM IR from the latest rustc nightly.

@glandium
Contributor Author

glandium commented Jul 4, 2018

Unfortunately, llc-7 doesn't like the llvm-ir output from rustc nightly:

llc-7: a/stdin.ll:25470:373: error: invalid field 'variables'
!2237 = distinct !DISubprogram(name: "alloc", linkageName: "_ZN12alloc_system8platform75_$LT$impl$u20$core..alloc..GlobalAlloc$u20$for$u20$alloc_system..System$GT$5alloc17hcaa16a633af984e0E", scope: !2239, file: !2238, line: 135, type: !2242, isLocal: true, isDefinition: true, scopeLine: 135, flags: DIFlagPrototyped, isOptimized: true, unit: !1560, templateParams: !21, variables: !21)
                                                                                                                                                                                                                                                                                                                                                                                    ^

@alexcrichton
Member

@glandium when you mentioned you can reproduce with llc-6.0, that's taking Rust's IR and running llc twice, getting different object files?

@glandium
Contributor Author

glandium commented Jul 4, 2018

Yes, with the same code difference as what I can see with rust. But it takes more attempts than it does with rust.
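
Since the llc reproduction takes a varying number of attempts, a small retry harness makes "how many attempts" measurable. This is only a sketch: `first_divergent_attempt` is a hypothetical name, and the closure stands in for whatever produces the bytes of one build.

```rust
/// Call `produce` up to `max_attempts` times and return the 1-based attempt
/// at which its output first differs from the output of the first attempt.
/// Returns None if every attempt produced the same bytes.
/// (Hypothetical helper, for illustration only.)
fn first_divergent_attempt<F>(max_attempts: usize, mut produce: F) -> Option<usize>
where
    F: FnMut() -> Vec<u8>,
{
    if max_attempts == 0 {
        return None;
    }
    // The first attempt is the baseline every later attempt is compared to.
    let baseline = produce();
    for attempt in 2..=max_attempts {
        if produce() != baseline {
            return Some(attempt);
        }
    }
    None
}
```

In practice the closure might run llc-6.0 with -filetype=obj on a/stdin.ll through std::process::Command and return the contents of the resulting object file; that wiring is left out here.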

@glandium
Contributor Author

glandium commented Jul 4, 2018

FWIW, I've been able to reproduce under rr, so I /might/ be able to compare what's different between two runs. But if I make rr more deterministic by disabling ASLR, it goes away (whereas with ASLR disabled and no rr, it still happens).

@glandium
Contributor Author

glandium commented Jul 5, 2018

FWIW, these are the stack traces at the points where the first differing instructions are created:

cmp version
#0  llvm::SDNode::SDNode (this=0x55cae6cb3218, Opc=271, Order=7, dl=..., VTs=...)
    at ../include/llvm/CodeGen/SelectionDAGNodes.h:989
#1  0x000055cae31761a5 in llvm::SelectionDAG::newSDNode<llvm::SDNode, unsigned int&, unsigned int, llvm::DebugLoc const&, llvm::SDVTList> (this=0x55cae62a9a00)
    at ../include/llvm/CodeGen/SelectionDAG.h:319
#2  0x000055cae3185b99 in llvm::SelectionDAG::getNode (this=this@entry=0x55cae62a9a00, 
    Opcode=Opcode@entry=271, DL=..., VT=..., N1=..., N2=..., Flags=..., Flags@entry=...)
    at ../lib/CodeGen/SelectionDAG/SelectionDAG.cpp:4726
#3  0x000055cae26b9c37 in llvm::X86TargetLowering::EmitTest (this=this@entry=0x55cae6ba8d68, 
    Op=..., X86CC=X86CC@entry=4, dl=..., DAG=...) at ../lib/Target/X86/X86ISelLowering.cpp:17294
#4  0x000055cae26bbccc in llvm::X86TargetLowering::EmitCmp (this=this@entry=0x55cae6ba8d68, 
    Op0=..., Op1=..., X86CC=X86CC@entry=4, dl=..., DAG=...)
    at ../lib/Target/X86/X86ISelLowering.cpp:17309
#5  0x000055cae26bc0dc in llvm::X86TargetLowering::LowerSETCC (this=this@entry=0x55cae6ba8d68, 
    Op=..., DAG=...) at ../lib/Target/X86/X86ISelLowering.cpp:18148
#6  0x000055cae26bec4a in llvm::X86TargetLowering::LowerBRCOND (this=0x55cae6ba8d68, Op=..., 
    DAG=...) at ../lib/Target/X86/X86ISelLowering.cpp:19161
#7  0x000055cae26d038b in llvm::X86TargetLowering::LowerOperation (this=<optimized out>, Op=..., 
    DAG=...) at ../lib/Target/X86/X86ISelLowering.cpp:24746
#8  0x000055cae30efcf3 in (anonymous namespace)::SelectionDAGLegalize::LegalizeOp (
    this=0x7fff8e577cb0, Node=0x55cae6ca6de0) at ../lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1211
#9  0x000055cae30f4051 in llvm::SelectionDAG::Legalize (this=0x55cae62a9a00)
    at ../lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:4682
#10 0x000055cae31a581a in llvm::SelectionDAGISel::CodeGenAndEmitDAG (
    this=this@entry=0x55cae60209a0) at ../lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:834
#11 0x000055cae31a685a in llvm::SelectionDAGISel::SelectBasicBlock (
    this=this@entry=0x55cae60209a0, Begin=..., Begin@entry=..., End=..., End@entry=..., 
    HadTailCall=@0x7fff8e5785a0: false) at ../lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:664
#12 0x000055cae31b040f in llvm::SelectionDAGISel::SelectAllBasicBlocks (
    this=this@entry=0x55cae60209a0, Fn=...)
    at ../lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:1624
#13 0x000055cae31b22f2 in llvm::SelectionDAGISel::runOnMachineFunction (this=<optimized out>, 
    mf=...) at ../lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:466
#14 0x000055cae31b3aac in llvm::SelectionDAGISel::runOnMachineFunction (this=<optimized out>, 
    mf=...) at ../lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:628
#15 0x000055cae2604c74 in (anonymous namespace)::X86DAGToDAGISel::runOnMachineFunction (
    this=<optimized out>, MF=...) at ../lib/Target/X86/X86ISelDAGToDAG.cpp:175
#16 0x000055cae2a58b35 in llvm::MachineFunctionPass::runOnFunction (this=0x55cae60209a0, F=...)
    at ../lib/CodeGen/MachineFunctionPass.cpp:62
#17 0x000055cae2d76fd0 in llvm::FPPassManager::runOnFunction (this=0x55cae6a59f10, F=...)
    at ../lib/IR/LegacyPassManager.cpp:1520
#18 0x000055cae2d77039 in llvm::FPPassManager::runOnModule (this=0x55cae6a59f10, M=...)
    at ../lib/IR/LegacyPassManager.cpp:1541
#19 0x000055cae2d767d1 in (anonymous namespace)::MPPassManager::runOnModule (M=..., 
    this=<optimized out>) at ../lib/IR/LegacyPassManager.cpp:1597
#20 llvm::legacy::PassManagerImpl::run (this=0x55cae62a58c0, M=...)
    at ../lib/IR/LegacyPassManager.cpp:1700
#21 0x000055cae258b7d0 in compileModule (argv=<optimized out>, Context=...)
    at ../tools/llc/llc.cpp:569
#22 0x000055cae254c85c in main (argc=<optimized out>, argv=0x7fff8e579368)
    at ../tools/llc/llc.cpp:346
mov version
#0  llvm::SDNode::SDNode (this=this@entry=0x55bdbcfa5c30, Opc=Opc@entry=186, Order=4, 
    dl=dl@entry=..., VTs=...) at ../include/llvm/CodeGen/SelectionDAGNodes.h:989
#1  0x000055bdb9717d6f in llvm::MemSDNode::MemSDNode (this=this@entry=0x55bdbcfa5c30, 
    Opc=Opc@entry=186, Order=<optimized out>, dl=..., VTs=..., memvt=..., mmo=0x55bdbd086bb0)
    at ../lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7683
#2  0x000055bdb9721b6f in llvm::LSBaseSDNode::LSBaseSDNode (MMO=<optimized out>, MemVT=..., AM=8, 
    VTs=..., dl=..., Order=<optimized out>, NodeTy=llvm::ISD::LOAD, this=0x55bdbcfa5c30)
    at ../include/llvm/CodeGen/SelectionDAGNodes.h:1945
#3  llvm::LoadSDNode::LoadSDNode (MMO=0x55bdbd086bb0, MemVT=..., ETy=<optimized out>, AM=8, 
    VTs=..., dl=..., Order=<optimized out>, this=0x55bdbcfa5c30)
    at ../include/llvm/CodeGen/SelectionDAGNodes.h:1979
#4  llvm::SelectionDAG::newSDNode<llvm::LoadSDNode, unsigned int, llvm::DebugLoc const&, llvm::SDVTList&, llvm::ISD::MemIndexedMode&, llvm::ISD::LoadExtType&, llvm::EVT&, llvm::MachineMemOperand*&> (
    this=0x55bdbc074400) at ../include/llvm/CodeGen/SelectionDAG.h:319
#5  llvm::SelectionDAG::getLoad (this=this@entry=0x55bdbc074400, AM=AM@entry=llvm::ISD::UNINDEXED, 
    ExtType=<optimized out>, ExtType@entry=llvm::ISD::NON_EXTLOAD, VT=..., dl=..., Chain=..., 
    Ptr=..., Offset=..., MemVT=..., MMO=0x55bdbd086bb0)
    at ../lib/CodeGen/SelectionDAG/SelectionDAG.cpp:5941
#6  0x000055bdb9721ff3 in llvm::SelectionDAG::getLoad (this=this@entry=0x55bdbc074400, 
    AM=AM@entry=llvm::ISD::UNINDEXED, ExtType=ExtType@entry=llvm::ISD::NON_EXTLOAD, VT=..., 
    dl=..., Chain=..., Ptr=..., Offset=..., PtrInfo=..., MemVT=..., Alignment=8, MMOFlags=17, 
    AAInfo=..., Ranges=0x0) at ../lib/CodeGen/SelectionDAG/SelectionDAG.cpp:5899
#7  0x000055bdb9728752 in llvm::SelectionDAG::getLoad (this=0x55bdbc074400, VT=..., dl=..., 
    Chain=..., Ptr=..., PtrInfo=..., Alignment=8, MMOFlags=17, AAInfo=..., Ranges=0x0)
    at ../lib/CodeGen/SelectionDAG/SelectionDAG.cpp:5958
#8  0x000055bdb96616e1 in (anonymous namespace)::DAGCombiner::ReplaceExtractVectorEltOfLoadWithNarrowedLoad (this=this@entry=0x7ffe58777920, EVE=EVE@entry=0x55bdbccf0368, InVecVT=..., EltNo=..., 
    OriginalLoad=0x55bdbccf0230) at ../lib/CodeGen/SelectionDAG/DAGCombiner.cpp:14172
#9  0x000055bdb966221e in (anonymous namespace)::DAGCombiner::visitEXTRACT_VECTOR_ELT (
    this=0x7ffe58777920, N=0x55bdbccf0368) at ../lib/CodeGen/SelectionDAG/DAGCombiner.cpp:14405
#10 0x000055bdb966f444 in (anonymous namespace)::DAGCombiner::visit (
    this=this@entry=0x7ffe58777920, N=N@entry=0x55bdbccf0368)
    at ../lib/CodeGen/SelectionDAG/DAGCombiner.cpp:1601
#11 0x000055bdb9670e8c in (anonymous namespace)::DAGCombiner::combine (
    this=this@entry=0x7ffe58777920, N=N@entry=0x55bdbccf0368)
    at ../lib/CodeGen/SelectionDAG/DAGCombiner.cpp:1619
#12 0x000055bdb96727b7 in (anonymous namespace)::DAGCombiner::Run (this=this@entry=0x7ffe58777920, 
    AtLevel=AtLevel@entry=llvm::AfterLegalizeDAG)
    at ../lib/CodeGen/SelectionDAG/DAGCombiner.cpp:1466
#13 0x000055bdb9674d65 in llvm::SelectionDAG::Combine (this=<optimized out>, 
    Level=Level@entry=llvm::AfterLegalizeDAG, AA=<optimized out>, OptLevel=<optimized out>)
    at ../lib/CodeGen/SelectionDAG/DAGCombiner.cpp:17752
#14 0x000055bdb974f89f in llvm::SelectionDAGISel::CodeGenAndEmitDAG (
    this=this@entry=0x55bdbc0599a0) at ../lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:849
#15 0x000055bdb975085a in llvm::SelectionDAGISel::SelectBasicBlock (
    this=this@entry=0x55bdbc0599a0, Begin=..., Begin@entry=..., End=..., End@entry=..., 
    HadTailCall=@0x7ffe58778460: false) at ../lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:664
#16 0x000055bdb975a40f in llvm::SelectionDAGISel::SelectAllBasicBlocks (
    this=this@entry=0x55bdbc0599a0, Fn=...)
    at ../lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:1624
#17 0x000055bdb975c2f2 in llvm::SelectionDAGISel::runOnMachineFunction (this=<optimized out>, 
    mf=...) at ../lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:466
#18 0x000055bdb975daac in llvm::SelectionDAGISel::runOnMachineFunction (this=<optimized out>, 
    mf=...) at ../lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:628
#19 0x000055bdb8baec74 in (anonymous namespace)::X86DAGToDAGISel::runOnMachineFunction (
    this=<optimized out>, MF=...) at ../lib/Target/X86/X86ISelDAGToDAG.cpp:175
#20 0x000055bdb9002b35 in llvm::MachineFunctionPass::runOnFunction (this=0x55bdbc0599a0, F=...)
    at ../lib/CodeGen/MachineFunctionPass.cpp:62
#21 0x000055bdb9320fd0 in llvm::FPPassManager::runOnFunction (this=0x55bdbcaaa330, F=...)
    at ../lib/IR/LegacyPassManager.cpp:1520
#22 0x000055bdb9321039 in llvm::FPPassManager::runOnModule (this=0x55bdbcaaa330, M=...)
    at ../lib/IR/LegacyPassManager.cpp:1541
#23 0x000055bdb93207d1 in (anonymous namespace)::MPPassManager::runOnModule (M=..., 
    this=<optimized out>) at ../lib/IR/LegacyPassManager.cpp:1597
#24 llvm::legacy::PassManagerImpl::run (this=0x55bdbc258f30, M=...)
    at ../lib/IR/LegacyPassManager.cpp:1700
#25 0x000055bdb8b357d0 in compileModule (argv=<optimized out>, Context=...)
    at ../tools/llc/llc.cpp:569
#26 0x000055bdb8af685c in main (argc=<optimized out>, argv=0x7ffe58779228)
    at ../tools/llc/llc.cpp:346

Maybe the DAG traversal depends on addresses?
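
To illustrate that hypothesis (this is not LLVM code, just a sketch with a made-up `Node` type): if a worklist is ordered by node address, the visiting order is at the mercy of the allocator and ASLR, whereas ordering by a stable id gives the same result on every run.

```rust
/// A stand-in for a DAG node; `id` is assigned in creation order.
struct Node {
    id: u32,
}

/// Visit order that depends on where each node happens to be allocated.
/// The result may differ between runs of the same program.
fn order_by_address(nodes: &[Box<Node>]) -> Vec<u32> {
    let mut refs: Vec<&Node> = nodes.iter().map(|b| &**b).collect();
    // Sorting by the numeric value of the pointer is the problematic step.
    refs.sort_by_key(|n| *n as *const Node as usize);
    refs.iter().map(|n| n.id).collect()
}

/// Visit order that depends only on the stable id: identical on every run.
fn order_by_id(nodes: &[Box<Node>]) -> Vec<u32> {
    let mut ids: Vec<u32> = nodes.iter().map(|n| n.id).collect();
    ids.sort_unstable();
    ids
}
```

Both functions visit the same set of nodes; only `order_by_id` guarantees the same sequence each time. Whether anything like this is actually happening in the SelectionDAG code is exactly the open question.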

@glandium
Contributor Author

glandium commented Jul 5, 2018

Actually, comparing IR dumps before and after every pass shows that the difference comes from the "Expand ISel Pseudo-instructions" pass.

@glandium
Contributor Author

glandium commented Jul 5, 2018

So it turns out the first difference I'm seeing in -print-before-all -print-after-all output is in "IR Dump Before Expand ISel Pseudo-instructions". The last (adjacent) non-different dump is "IR Dump After Module Verifier". That one is still llvm-ir, but the dump before "Expand ISel Pseudo-instructions" is pseudo asm.
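
Finding the first differing dump in two such logs can be automated. A minimal sketch, assuming every dump starts with a header line containing "*** IR Dump" and that both logs list the passes in the same order; `split_dumps` and `first_divergent_dump` are hypothetical names:

```rust
/// Split -print-before-all/-print-after-all output into (header, body)
/// sections. Assumes each dump starts with a line containing "*** IR Dump".
fn split_dumps(log: &str) -> Vec<(String, String)> {
    let mut sections: Vec<(String, String)> = Vec::new();
    for line in log.lines() {
        if line.contains("*** IR Dump") {
            sections.push((line.trim().to_string(), String::new()));
        } else if let Some((_, body)) = sections.last_mut() {
            // Lines before the first header are ignored.
            body.push_str(line);
            body.push('\n');
        }
    }
    sections
}

/// Return the header of the first dump whose header or body differs
/// between the two logs, comparing dumps pairwise in order.
fn first_divergent_dump(a: &str, b: &str) -> Option<String> {
    let (da, db) = (split_dumps(a), split_dumps(b));
    da.iter()
        .zip(db.iter())
        .find(|(x, y)| x != y)
        .map(|(x, _)| x.0.clone())
}
```

Note that this compares text verbatim, so incidental differences (for example in virtual register numbering) would also be reported, and a log that is merely longer than the other is not flagged.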

@glandium
Contributor Author

glandium commented Jul 6, 2018

-print-isel-input is identical, and -print-machine-instrs shows the difference for _ZN3std9panicking20rust_panic_with_hook17hc482345e1eead25bE "After Instruction Selection". So in fact, no new information since #52044 (comment)

@glandium
Contributor Author

glandium commented Jul 6, 2018

I modified llc to not execute a program when doing -view-isel-dags or -view-legalize-dags (because that launches 2811 xdot processes), instead just dumping the DAG dot files, and compared them. There didn't seem to be notable differences besides things like register numbers, which also differ between runs whose output is identical. At least there don't seem to be structural differences in the DAGs.

@glandium
Contributor Author

glandium commented Jul 6, 2018

In case this helps:

Basic block containing the first -print-machine-instrs difference, in "After Instruction Selection":

Variant 1
%bb.118: derived from LLVM BB %555
    Predecessors according to CFG: %bb.117 
        %390:vr128 = MOVAPSrm %noreg, 1, %noreg, target-flags(x86-tpoff) @_ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h5e656a3d08607c39E.llvm.17348687890923447381 + 16, %fs; mem:LD16[bitcast (<{ [40 x i8] }>* @_ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h5e656a3d08607c39E.llvm.17348687890923447381 to <4 x i64>*)(align=32)+16](align=16)(alias.scope=!15687,!15689)(noalias=!15692,!15693,!15694,!15695,!15677,!15679,!15646)(dereferenceable) VR128:%390 dbg:/checkout/src/libcore/ptr.rs:221 @[ /checkout/src/libcore/ptr.rs:187 @[ /checkout/src/libcore/mem.rs:636 @[ /checkout/src/libcore/mem.rs:694 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ] ] ] ] 
        %391:vr128 = V_SET0; VR128:%391 dbg:/checkout/src/libcore/ptr.rs:222 @[ /checkout/src/libcore/ptr.rs:187 @[ /checkout/src/libcore/mem.rs:636 @[ /checkout/src/libcore/mem.rs:694 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ] ] ] ]
        MOVAPSmr %noreg, 1, %noreg, target-flags(x86-tpoff) @_ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h5e656a3d08607c39E.llvm.17348687890923447381 + 16, %fs, killed %391; mem:ST16[bitcast (<{ [40 x i8] }>* @_ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h5e656a3d08607c39E.llvm.17348687890923447381 to <4 x i64>*)+16](noalias=!15695,!15677,!15679,!15646) VR128:%391 dbg:/checkout/src/libcore/ptr.rs:222 @[ /checkout/src/libcore/ptr.rs:187 @[ /checkout/src/libcore/mem.rs:636 @[ /checkout/src/libcore/mem.rs:694 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ] ] ] ]
        %392:gr32 = MOV32ri64 1; GR32:%392
        %393:gr64 = SUBREG_TO_REG 0, killed %392, sub_32bit; GR64:%393 GR32:%392
        %394:vr128 = MOV64toPQIrr killed %393; VR128:%394 GR64:%393 dbg:/checkout/src/libcore/ptr.rs:222 @[ /checkout/src/libcore/ptr.rs:187 @[ /checkout/src/libcore/mem.rs:636 @[ /checkout/src/libcore/mem.rs:694 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ] ] ] ]
        %395:gr64 = MOV64rm %noreg, 1, %noreg, target-flags(x86-tpoff) @_ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h5e656a3d08607c39E.llvm.17348687890923447381, %fs; mem:LD8[bitcast (<{ [40 x i8] }>* @_ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h5e656a3d08607c39E.llvm.17348687890923447381 to <4 x i64>*)](alias.scope=!15687,!15689)(noalias=!15692,!15693,!15694,!15695,!15677,!15679,!15646)(dereferenceable) GR64:%395 dbg:/checkout/src/libcore/mem.rs:695 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ]
        MOVAPSmr %noreg, 1, %noreg, target-flags(x86-tpoff) @_ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h5e656a3d08607c39E.llvm.17348687890923447381, %fs, killed %394; mem:ST16[bitcast (<{ [40 x i8] }>* @_ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h5e656a3d08607c39E.llvm.17348687890923447381 to <4 x i64>*)](align=32)(noalias=!15695,!15677,!15679,!15646) VR128:%394 dbg:/checkout/src/libcore/ptr.rs:222 @[ /checkout/src/libcore/ptr.rs:187 @[ /checkout/src/libcore/mem.rs:636 @[ /checkout/src/libcore/mem.rs:694 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ] ] ] ]
        %60:gr64 = MOVPQIto64rr %390; GR64:%60 VR128:%390 dbg:/checkout/src/libcore/mem.rs:695 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ]
        %396:vr128 = PSHUFDri %390, 78; VR128:%396,%390 dbg:/checkout/src/libcore/mem.rs:695 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ]
        %61:gr64 = MOVPQIto64rr killed %396; GR64:%61 VR128:%396 dbg:/checkout/src/libcore/mem.rs:695 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ]
        TEST64rr %395, %395, implicit-def %eflags; GR64:%395 dbg:/checkout/src/libcore/ptr.rs:59 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ]
        JE_1 %bb.121, implicit %eflags; dbg:/checkout/src/libcore/ptr.rs:59 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ] 
        JMP_1 %bb.147; dbg:/checkout/src/libcore/ptr.rs:59 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ]
    Successors according to CFG: %bb.121(0x20000000 / 0x80000000 = 25.00%) %bb.147(0x60000000 / 0x80000000 = 75.00%)
Variant 2
%bb.118: derived from LLVM BB %555 
    Predecessors according to CFG: %bb.117 
        %390:vr128 = MOVAPSrm %noreg, 1, %noreg, target-flags(x86-tpoff) @_ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h5e656a3d08607c39E.llvm.17348687890923447381 + 16, %fs; mem:LD16[bitcast (<{ [40 x i8] }>* @_ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h5e656a3d08607c39E.llvm.17348687890923447381 to <4 x i64>*)(align=32)+16](align=16)(alias.scope=!15687,!15689)(noalias=!15692,!15693,!15694,!15695,!15677,!15679,!15646)(dereferenceable) VR128:%390 dbg:/checkout/src/libcore/ptr.rs:221 @[ /checkout/src/libcore/ptr.rs:187 @[ /checkout/src/libcore/mem.rs:636 @[ /checkout/src/libcore/mem.rs:694 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ] ] ] ] 
        %391:vr128 = V_SET0; VR128:%391 dbg:/checkout/src/libcore/ptr.rs:222 @[ /checkout/src/libcore/ptr.rs:187 @[ /checkout/src/libcore/mem.rs:636 @[ /checkout/src/libcore/mem.rs:694 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ] ] ] ]
        MOVAPSmr %noreg, 1, %noreg, target-flags(x86-tpoff) @_ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h5e656a3d08607c39E.llvm.17348687890923447381 + 16, %fs, killed %391; mem:ST16[bitcast (<{ [40 x i8] }>* @_ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h5e656a3d08607c39E.llvm.17348687890923447381 to <4 x i64>*)+16](noalias=!15695,!15677,!15679,!15646) VR128:%391 dbg:/checkout/src/libcore/ptr.rs:222 @[ /checkout/src/libcore/ptr.rs:187 @[ /checkout/src/libcore/mem.rs:636 @[ /checkout/src/libcore/mem.rs:694 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ] ] ] ]
        %392:gr32 = MOV32ri64 1; GR32:%392
        %393:gr64 = SUBREG_TO_REG 0, killed %392, sub_32bit; GR64:%393 GR32:%392
        %394:vr128 = MOV64toPQIrr killed %393; VR128:%394 GR64:%393 dbg:/checkout/src/libcore/ptr.rs:222 @[ /checkout/src/libcore/ptr.rs:187 @[ /checkout/src/libcore/mem.rs:636 @[ /checkout/src/libcore/mem.rs:694 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ] ] ] ]
        CMP64mi8 %noreg, 1, %noreg, target-flags(x86-tpoff) @_ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h5e656a3d08607c39E.llvm.17348687890923447381, %fs, 0, implicit-def %eflags; mem:LD8[bitcast (<{ [40 x i8] }>* @_ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h5e656a3d08607c39E.llvm.17348687890923447381 to <4 x i64>*)](alias.scope=!15687,!15689)(noalias=!15692,!15693,!15694,!15695,!15677,!15679,!15646)(dereferenceable) dbg:/checkout/src/libcore/ptr.rs:59 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ]
        MOVAPSmr %noreg, 1, %noreg, target-flags(x86-tpoff) @_ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h5e656a3d08607c39E.llvm.17348687890923447381, %fs, killed %394; mem:ST16[bitcast (<{ [40 x i8] }>* @_ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h5e656a3d08607c39E.llvm.17348687890923447381 to <4 x i64>*)](align=32)(noalias=!15695,!15677,!15679,!15646) VR128:%394 dbg:/checkout/src/libcore/ptr.rs:222 @[ /checkout/src/libcore/ptr.rs:187 @[ /checkout/src/libcore/mem.rs:636 @[ /checkout/src/libcore/mem.rs:694 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ] ] ] ]
        %60:gr64 = MOVPQIto64rr %390; GR64:%60 VR128:%390 dbg:/checkout/src/libcore/mem.rs:695 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ]
        %395:vr128 = PSHUFDri %390, 78; VR128:%395,%390 dbg:/checkout/src/libcore/mem.rs:695 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ]
        %61:gr64 = MOVPQIto64rr killed %395; GR64:%61 VR128:%395 dbg:/checkout/src/libcore/mem.rs:695 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ]
        JE_1 %bb.121, implicit %eflags; dbg:/checkout/src/libcore/ptr.rs:59 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ]
        JMP_1 %bb.147; dbg:/checkout/src/libcore/ptr.rs:59 @[ libstd/thread/local.rs:270 @[ libstd/thread/local.rs:296 @[ libstd/thread/local.rs:248 @[ libstd/panicking.rs:223 @[ libstd/panicking.rs:511 ] ] ] ] ]
    Successors according to CFG: %bb.121(0x20000000 / 0x80000000 = 25.00%) %bb.147(0x60000000 / 0x80000000 = 75.00%)

I think the matching basic block in the -print-isel-input IR is:

; <label>:555:                                    ; preds = %552
  %556 = load <4 x i64>, <4 x i64>* bitcast (<{ [40 x i8] }>* @_ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h5e656a3d08607c39E.llvm.17348687890923447381 to <4 x i64>*), align 32, !dbg !15680, !alias.scope !15686, !noalias !15691
  store <4 x i64> <i64 1, i64 0, i64 0, i64 undef>, <4 x i64>* bitcast (<{ [40 x i8] }>* @_ZN3std9panicking12LOCAL_STDERR7__getit5__KEY17h5e656a3d08607c39E.llvm.17348687890923447381 to <4 x i64>*), align 32, !dbg !15697, !noalias !15698
  %557 = extractelement <4 x i64> %556, i32 0, !dbg !15699
  %558 = extractelement <4 x i64> %556, i32 2, !dbg !15699
  %559 = extractelement <4 x i64> %556, i32 3, !dbg !15699
  %560 = icmp eq i64 %557, 0, !dbg !15700
  %561 = icmp eq i64 %558, 0, !dbg !15702
  %562 = or i1 %560, %561, !dbg !15700
  br i1 %562, label %573, label %563, !dbg !15700

@glandium
Contributor Author

This seems to have been fixed by #51966

@glandium
Contributor Author

Bisection on the llvm side says this was fixed by... the additions for retpoline, which makes no sense, as it's not supposed to change anything if retpoline is not enabled. And as crazy as it seems, the simple fact that there's an extra PreEmitPass2 that doesn't run makes it disappear O_O.

@glandium
Contributor Author

Even better, the retpoline patch has already been backported to llvm 6 used by rust...

@glandium
Contributor Author

So... this might, in fact, not be fixed. That is, if I build beta locally, it's not happening. But if I use the released beta, it does.

@Mark-Simulacrum
Member

The patch that you listed as closing this (the LLVM 7 upgrade) wasn't backported to beta, so I'm unclear why you'd expect it to be fixed by it on beta. Perhaps I'm misinterpreting something?

@glandium
Contributor Author

What I'm saying is that the build environment in which rustc is built makes the problem appear or not, for the same version of rustc. There hasn't been a nightly produced since llvm 7 landed, so I did the build locally, and I marked this issue as fixed because it didn't happen on that build. But since a local build of beta with llvm 6 is also not affected, while the official beta is, that means master with llvm 7 is maybe only not affected on my end because of the compiler I'm using, and the official rustc might still be affected.

@Mark-Simulacrum
Member

I also can't reproduce with the CI build of 64f7de9 (latest master commit as of now) so it's possible that the upgrade did in fact fix this, if not fully.

@glandium
Contributor Author

I can reproduce on a locally built beta if I use clang 6 (which is apparently what rustc automation is using) instead of gcc 5.4 (which is what's installed in the VM I'm using). And it does, indeed, not happen with master built with clang 6.

@glandium
Contributor Author

This is driving me crazy. Here's how things go in my VM:

  • rustc with llvm 6, built with clang 6.0 -> happens
  • rustc with llvm 6, built with gcc 5.4 -> doesn't happen
  • llc from llvm 6, built with clang 6.0 -> doesn't happen !
  • llc from llvm 6, built with gcc 5.4 -> happens !
  • rustc with llvm 7, built with clang 6.0 -> doesn't happen
  • rustc with llvm 7, built with gcc 6.0 -> doesn't happen
  • llc from llvm 7, built with clang 6.0 -> doesn't happen
  • llc from llvm 7, built with gcc 5.4 -> doesn't happen

At least it's more consistent with llvm 7, but the fact that it disappeared when adding a pass that does nothing (retpoline) while bisecting with gcc 5.4 doesn't make me confident that the actual problem is fixed, even if, in practice, it looks like it is.

@glandium
Contributor Author

At least I can confirm that this does indeed not happen with the now latest nightly, with llvm 7.

So... ¯\_(ツ)_/¯, I guess.

4 participants