-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unwinding through fn()
references across crates causes the behavior to depend on the optimization level
#29485
Comments
Looks like a codegen bug related to the %X = type { i8, i8 }
declare void @foo()
define void @f(%X* noalias) {
%x = alloca %X*
store %X* %0, %X** %x
%2 = load %X*, %X** %x
%3 = getelementptr inbounds %X, %X* %2, i32 0, i32 0
store i8 1, i8* %3
call void @foo()
%4 = load %X*, %X** %x
%5 = getelementptr inbounds %X, %X* %4, i32 0, i32 0
store i8 0, i8* %5
ret void
} When optimized, we get: %X = type { i8, i8 }
declare void @foo()
define void @f(%X* noalias nocapture) {
%2 = getelementptr inbounds %X, %X* %0, i64 0, i32 0
tail call void @foo()
store i8 0, i8* %2, align 1
ret void
} And as we can see, the store of 1 has been optimized away. If we instead remove the %X = type { i8, i8 }
declare void @foo()
define void @f(%X* nocapture) {
%2 = getelementptr inbounds %X, %X* %0, i64 0, i32 0
store i8 1, i8* %2, align 1
tail call void @foo()
store i8 0, i8* %2, align 1
ret void
} cc @dotdash, perhaps we can't place |
I'm wrong. |
An LLVM IR There's a decent argument that LLVM's treatment of #include <iostream>
using namespace std;
void xx() {
throw 1;
}
void (*volatile g)() = xx;
void f(int* __restrict x) {
*x = 1;
g();
*x = 0;
}
int main() {
int x = 0;
try { f(&x); } catch(...) {}
std::cout << x;
return 0;
} gcc prints 1; clang prints 0. On the other hand, you're likely to run into resistance proposing any restriction on how alias analysis can reason about noalias pointers. |
This isn't just a
MemoryDependenceAnalysis in LLVM currently says that the second store here has a direct local dependence on the first store. There's a sense in which that is a correct answer, but DeadStoreElimination inteprets this answer to mean that the first store is dead. It's not immediately obvious which of these two should be fixed.
|
triage: P-high |
Is the problem here that LLVM is optimizing too heavily, or that we are providing incorrect annotations? Either way, it may be prudent to supply fewer annotations for the time being. That'd certainly be an easy patch to do. |
I asked sunfish to clarify on IRC, and he believes this is 100% an llvm bug. |
I've now filed an LLVM bug: https://llvm.org/bugs/show_bug.cgi?id=25422 |
We were discussing in the compiler team meeting. Do we expect LLVM to fix this in a short time frame? I would think probably not -- we were thinking therefore that it would be prudent to workaround in the time being. Would this mean removing the aliasing annotations and thus taking some speed hit? Are there other solutions? Other implications? |
No; fixing DSE is probably easy enough... but trusting noalias to work correctly alongside unwinding would probably require auditing every use of alias analysis in LLVM (for example, I'm pretty sure LICM is also broken). In any case, no matter how fast this is fixed, Rust can't unconditionally depend on any fix because we support linking against 3.7.
Yes.
No.
I don't think so. |
LICM is not obviously broken; see eg. isGuaranteedToExecute in LICM.cpp; it explicitly checks for instructions which may unwind before doing related things. The bug here does appear to arise from the difference between MemDep's way of viewing the world and DSE's. I don't know how conservative Rust wants to be here, but there is room for optimism that this may just be an isolated DSE bug (or a MemDep bug which only affects DSE). |
Won't removing |
Okay, LICM's is in fact correct... the definition of isGuaranteedToExecute is overly conservative in a way which happens to cover up this issue. That said, my non-comprehensive audit shows -memcpyopt, -mldst-motion, and -loop-idiom are broken. (Granted, -mldst-motion isn't part of the default pass pipeline.) A couple more testcases:
|
Wow. I revoke my optimism then. |
I am curious what effect it will have, I don't really have a good feeling On Thu, Nov 12, 2015 at 7:39 PM, Dan Gohman notifications@github.com
|
This looks rather unambiguously like an LLVM bug (though we may want to work around it). |
(a reasonable subtask for a relative newcomer to the project could be to evaluate the actual performance impact removing the |
For rust itself, the text segment in the |
Could we have concrete numbers ( |
Of course, 1-3% perf losses still suck - this might be a reason to make official rustc builds use |
@arielb1 I removed all diff --git a/src/librustc_trans/trans/attributes.rs b/src/librustc_trans/trans/attributes.rs
index 28dfa4e..32dd31d 100644
--- a/src/librustc_trans/trans/attributes.rs
+++ b/src/librustc_trans/trans/attributes.rs
@@ -10,7 +10,7 @@
//! Set and unset common attributes on LLVM values.
use libc::{c_uint, c_ulonglong};
-use llvm::{self, ValueRef, AttrHelper};
+use llvm::{self, ValueRef};
use middle::ty;
use middle::infer;
use session::config::NoDebugInfo;
@@ -117,7 +117,6 @@ pub fn from_fn_attrs(ccx: &CrateContext, attrs: &[ast::Attribute], llfn: ValueRe
llvm::ColdAttribute as u64)
}
} else if attr.check_name("allocator") {
- llvm::Attribute::NoAlias.apply_llfn(llvm::ReturnIndex as c_uint, llfn);
} else if attr.check_name("unwind") {
unwind(llfn, true);
}
@@ -190,24 +189,12 @@ pub fn from_fn_type<'a, 'tcx>(ccx: &CrateContext<'a, 'tcx>, fn_type: ty::Ty<'tcx
// invisible to the program. We also know it's nonnull as well
// as how many bytes we can dereference
attrs.arg(1, llvm::Attribute::StructRet)
- .arg(1, llvm::Attribute::NoAlias)
.arg(1, llvm::Attribute::NoCapture)
.arg(1, llvm::DereferenceableAttribute(llret_sz));
// Add one more since there's an outptr
idx += 1;
} else {
- // The `noalias` attribute on the return value is useful to a
- // function ptr caller.
- match ret_ty.sty {
- // `Box` pointer return values never alias because ownership
- // is transferred
- ty::TyBox(it) if common::type_is_sized(ccx.tcx(), it) => {
- attrs.ret(llvm::Attribute::NoAlias);
- }
- _ => {}
- }
-
// We can also mark the return value as `dereferenceable` in certain cases
match ret_ty.sty {
// These are not really pointers but pairs, (pointer, len)
@@ -233,8 +220,7 @@ pub fn from_fn_type<'a, 'tcx>(ccx: &CrateContext<'a, 'tcx>, fn_type: ty::Ty<'tcx
// For non-immediate arguments the callee gets its own copy of
// the value on the stack, so there are no aliases. It's also
// program-invisible so can't possibly capture
- attrs.arg(idx, llvm::Attribute::NoAlias)
- .arg(idx, llvm::Attribute::NoCapture)
+ attrs.arg(idx, llvm::Attribute::NoCapture)
.arg(idx, llvm::DereferenceableAttribute(llarg_sz));
}
@@ -244,8 +230,6 @@ pub fn from_fn_type<'a, 'tcx>(ccx: &CrateContext<'a, 'tcx>, fn_type: ty::Ty<'tcx
// `Box` pointer parameters never alias because ownership is transferred
ty::TyBox(inner) => {
- attrs.arg(idx, llvm::Attribute::NoAlias);
-
if common::type_is_sized(ccx.tcx(), inner) {
let llsz = machine::llsize_of_real(ccx, type_of::type_of(ccx, inner));
attrs.arg(idx, llvm::DereferenceableAttribute(llsz));
@@ -265,10 +249,6 @@ pub fn from_fn_type<'a, 'tcx>(ccx: &CrateContext<'a, 'tcx>, fn_type: ty::Ty<'tcx
// on memory dependencies rather than pointer equality
let interior_unsafe = mt.ty.type_contents(ccx.tcx()).interior_unsafe();
- if mt.mutbl == hir::MutMutable || !interior_unsafe {
- attrs.arg(idx, llvm::Attribute::NoAlias);
- }
-
if mt.mutbl == hir::MutImmutable && !interior_unsafe {
attrs.arg(idx, llvm::Attribute::ReadOnly);
}
diff --git a/src/librustc_trans/trans/foreign.rs b/src/librustc_trans/trans/foreign.rs
index 95e9e85..39858e9 100644
--- a/src/librustc_trans/trans/foreign.rs
+++ b/src/librustc_trans/trans/foreign.rs
@@ -363,8 +363,7 @@ pub fn trans_native_call<'blk, 'tcx>(bcx: Block<'blk, 'tcx>,
// The outptr can be noalias and nocapture because it's entirely
// invisible to the program. We also know it's nonnull as well
// as how many bytes we can dereference
- attrs.arg(1, llvm::Attribute::NoAlias)
- .arg(1, llvm::Attribute::NoCapture)
+ attrs.arg(1, llvm::Attribute::NoCapture)
.arg(1, llvm::DereferenceableAttribute(llret_sz));
}; Sizes before:
Sizes after:
Ignoring the I'll gather new timings in the evening. It'll probably be output from |
More numbers. I went with the lilbcore
libstd
libsyntax
|
@dotdash great, thanks! This is sort of inline with my expectations. I also expect we'll see more impact on smaller examples. |
As an aside, at the work week I got a lot of great suggestions for benchmarks, I have to try and draw them up together into a central list now. |
Assigning to myself for the purpose of gathering up a few benchmarks. |
Doesn't the above patch remove way more |
So I ran the benchmarks from https://github.com/cristicbz/rust-doom/tree/rustc-benchmark master:
Patched:
|
That is interesting. Looks like about a 5% slowdown. |
This patch also removes NoAlias for " Maybe it'd be interesting to see how big the impact of just disabling NoAlias for shared borrows is? EDIT: oops obviously I mixed up shared and mutable references, |
Time for some new numbers. This time only removing the diff --git a/src/librustc_trans/trans/attributes.rs b/src/librustc_trans/trans/attributes.rs
index 28dfa4e..95c3941 100644
--- a/src/librustc_trans/trans/attributes.rs
+++ b/src/librustc_trans/trans/attributes.rs
@@ -265,7 +265,7 @@ pub fn from_fn_type<'a, 'tcx>(ccx: &CrateContext<'a, 'tcx>, fn_type: ty::Ty<'tcx
// on memory dependencies rather than pointer equality
let interior_unsafe = mt.ty.type_contents(ccx.tcx()).interior_unsafe();
- if mt.mutbl == hir::MutMutable || !interior_unsafe {
+ if mt.mutbl != hir::MutMutable && !interior_unsafe {
attrs.arg(idx, llvm::Attribute::NoAlias);
} Current master (
Patched master:
So basically no difference in performance. I'll make an PR for this. |
Thanks for persevering, @dotdash! |
On Thu, Feb 04, 2016 at 02:48:48PM -0800, Björn Steinbrink wrote:
Great. |
LLVM's memory dependence analysis doesn't properly account for calls that could unwind and thus effectively act as a branching point. This can lead to stores that are only visible when the call unwinds being removed, possibly leading to calls to drop() functions with b0rked memory contents. As there is no fix for this in LLVM yet and we want to keep compatibility to current LLVM versions anyways, we have to workaround this bug by omitting the noalias attribute on &mut function arguments. Benchmarks suggest that the performance loss by this change is very small. Thanks to @RalfJung for pushing me towards not removing too many noalias annotations and @alexcrichton for helping out with the test for this bug. Fixes rust-lang#29485
LLVM's memory dependence analysis doesn't properly account for calls that could unwind and thus effectively act as a branching point. This can lead to stores that are only visible when the call unwinds being removed, possibly leading to calls to drop() functions with b0rked memory contents. As there is no fix for this in LLVM yet and we want to keep compatibility to current LLVM versions anyways, we have to workaround this bug by omitting the noalias attribute on &mut function arguments. Benchmarks suggest that the performance loss by this change is very small. Thanks to @RalfJung for pushing me towards not removing too many noalias annotations and @alexcrichton for helping out with the test for this bug. Fixes #29485
#32155 could be a regression from this change. It's not that particular code regresses all the time, but it makes the optimization flaky, and it depends on the surrounding program. |
In
a.rs
In
b.rs
NB: Some pre-1.0 nightlies already behaved this way.
The text was updated successfully, but these errors were encountered: