-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
EarlyOtherwiseBranch transformation is incorrect #78496
Comments
Fix issue rust-lang#78496 EarlyOtherwiseBranch finds MIR structures like: ``` bb0: { ... _2 = discriminant(X) ... switchInt(_2) -> [1_isize: bb1, otherwise: bb3] } bb1: { ... _3 = discriminant(Y) ... switchInt(_3) -> [1_isize: bb2, otherwise: bb3] } bb2: {...} bb3: {...} ``` And transforms them into something like: ``` bb0: { ... _2 = discriminant(X) _3 = discriminant(Y) _4 = Eq(_2, _3) switchInt(_4) -> [true: bb4, otherwise: bb3] } bb2: {...} // unchanged bb3: {...} // unchanged bb4: { switchInt(_2) -> [1_isize: bb2, otherwise: bb3] } ``` But that is not always a safe thing to do -- sometimes the early `otherwise` branch is necessary so the later block could assume the value of `discriminant(X)`. I am not totally sure what's the best way to detect that, but fixing rust-lang#78496 should be easy -- we just check if `X` is a sub-expression of `Y`. A more precise test might be to check if `Y` contains a `Downcast(1)` of `X`, but I think this might be good enough. Fix rust-lang#78496
Fix issue rust-lang#78496 EarlyOtherwiseBranch finds MIR structures like: ``` bb0: { ... _2 = discriminant(X) ... switchInt(_2) -> [1_isize: bb1, otherwise: bb3] } bb1: { ... _3 = discriminant(Y) ... switchInt(_3) -> [1_isize: bb2, otherwise: bb3] } bb2: {...} bb3: {...} ``` And transforms them into something like: ``` bb0: { ... _2 = discriminant(X) _3 = discriminant(Y) _4 = Eq(_2, _3) switchInt(_4) -> [true: bb4, otherwise: bb3] } bb2: {...} // unchanged bb3: {...} // unchanged bb4: { switchInt(_2) -> [1_isize: bb2, otherwise: bb3] } ``` But that is not always a safe thing to do -- sometimes the early `otherwise` branch is necessary so the later block could assume the value of `discriminant(X)`. I am not totally sure what's the best way to detect that, but fixing rust-lang#78496 should be easy -- we just check if `X` is a sub-expression of `Y`. A more precise test might be to check if `Y` contains a `Downcast(1)` of `X`, but I think this might be good enough. Fix rust-lang#78496
Fix issue rust-lang#78496 EarlyOtherwiseBranch finds MIR structures like: ``` bb0: { ... _2 = discriminant(X) ... switchInt(_2) -> [1_isize: bb1, otherwise: bb3] } bb1: { ... _3 = discriminant(Y) ... switchInt(_3) -> [1_isize: bb2, otherwise: bb3] } bb2: {...} bb3: {...} ``` And transforms them into something like: ``` bb0: { ... _2 = discriminant(X) _3 = discriminant(Y) _4 = Eq(_2, _3) switchInt(_4) -> [true: bb4, otherwise: bb3] } bb2: {...} // unchanged bb3: {...} // unchanged bb4: { switchInt(_2) -> [1_isize: bb2, otherwise: bb3] } ``` But that is not always a safe thing to do -- sometimes the early `otherwise` branch is necessary so the later block could assume the value of `discriminant(X)`. I am not totally sure what's the best way to detect that, but fixing rust-lang#78496 should be easy -- we just check if `X` is a sub-expression of `Y`. A more precise test might be to check if `Y` contains a `Downcast(1)` of `X`, but I think this might be good enough. Fix rust-lang#78496
Rollup of 11 pull requests Successful merges: - rust-lang#79051 (Implement if-let match guards) - rust-lang#79877 (Allow `since="TBD"` for rustc_deprecated) - rust-lang#79882 (Fix issue rust-lang#78496) - rust-lang#80026 (expand-yaml-anchors: Make the output directory separator-insensitive) - rust-lang#80039 (Remove unused `TyEncoder::tcx` required method) - rust-lang#80069 (Test that `core::assert!` is valid) - rust-lang#80072 (Fixed conflict with drop elaboration and coverage) - rust-lang#80073 (Add support for target aliases) - rust-lang#80082 (Revert rust-lang#78790 - rust-src vendoring) - rust-lang#80097 (Add `popcount` and `popcnt` as doc aliases for `count_ones` methods.) - rust-lang#80103 (Remove docs for non-existent parameters in `rustc_expand`) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup
While this one particular test case was fixed, the transformation still suffers from issues hinted at earlier comment:
cc @oli-obk |
@tmiasko ,
EarlyOtherwiseBranch only applies to basic blocks of which the last statement is a discriminant (and the terminator is a switch): fn find_switch_discriminant_info(
&self,
bb: &BasicBlockData<'tcx>,
switch: &Terminator<'tcx>,
) -> Option<SwitchDiscriminantInfo<'tcx>> {
...
let place_of_adt_discr_read = match bb.statements.last()?.kind {
StatementKind::Assign(box (_, Rvalue::Discriminant(adt_place))) => {
Some(adt_place)
}
_ => None,
}?; And,
We could require the discriminant read in the second basic block to be the only statement within the block. Not sure by how much would the change affect optimization hit rate, but at least the example in #68867 would still be optimized. What makes me really concerned is this one:
When would the first discriminant be safe but not the second? The example you constructed is a legit one: the second read assumes the value of the first read and does a downcast; that is the case my patch catches. The only other possible invalid case I could think of, is that when the second discriminant call's argument is an invalid pointer. Probably something like: fn f(x: &Option<i32>, y: &Option<i32>) -> i32 {
if let &Some(_) = x {
if let &Some(_) = y {
return 1
}
}
42
}
f(&None, /* invalid ref obtained through unsafe? */) But today the MIR that FE emits dodges the problem by accident:
Because these two switch instructions have different So, my guess is, even though EarlyOtherwiseBranch is unsound, today the compiler FE is still unable to produce MIR that could trigger that. |
If one considers shape of MIR immediately after it has been built, that is probably true. One thing to keep mind: this transformation runs on MIR that have been transformed already. In fact, the whole optimization pipeline might have run an arbitrary number of times as a result of inlining. Transformations shouldn't rely on any invariants that aren't preserved by all transformations.
The |
Ah! I misunderstood your point I think we could resolve your
But I found that the same effect of EarlyOtherwiseBranch could be achieved through a much easier change; we could provide a hint for LLVM optimizer to compare the discriminant values before entering the switch, e.g.: enum E {
A(i32),
B(i32),
C(i32),
D(i32),
}
#[no_mangle]
fn add1_match(x: &E, y: &E) -> i32 {
if (mem::discriminant(x) != mem::discriminant(y)) {
return 99992;
}
match (x,y) {
(&E::A(x), &E::A(y)) => x+y,
(&E::B(x), &E::B(y)) => x+y,
(&E::C(x), &E::C(y)) => x+y,
(&E::D(x), &E::D(y)) => x+y,
_ => 99992,
}
} With this hint, .section .text.add1_match,"ax",@progbits
.globl add1_match
.p2align 4, 0x90
.type add1_match,@function
add1_match:
.cfi_startproc
movl (%rdi), %ecx
movl $99992, %eax
cmpl (%rsi), %ecx
jne .LBB7_3
movl %ecx, %eax
leaq .LJTI7_0(%rip), %rcx
movslq (%rcx,%rax,4), %rax
addq %rcx, %rax
jmpq *%rax
.LBB7_2:
movl 4(%rsi), %eax
addl 4(%rdi), %eax
.LBB7_3:
retq
.Lfunc_end7:
.size add1_match, .Lfunc_end7-add1_match
.cfi_endproc
.section .rodata.add1_match,"a",@progbits
.p2align 2
.LJTI7_0:
.long .LBB7_2-.LJTI7_0
.long .LBB7_2-.LJTI7_0
.long .LBB7_2-.LJTI7_0
.long .LBB7_2-.LJTI7_0 IMO this is much more elegant than EarlyOtherwiseBranch; it should also be pretty easy to implement, so I will give it a try. |
…und, r=oli-obk Mark early otherwise optimization unsound r? `@oli-obk` cc `@tmiasko` Related to rust-lang#78496 and rust-lang#82905 Should I also bump this one to level 3 or 4 or given that is unsound it doesn't matter?. Probably need to adjust some tests.
For details see #77163 (comment), opening an issue since PR was closed:
The text was updated successfully, but these errors were encountered: