Soundness hole in pattern matching on enums with an uninhabited variant #61696
cc @eddyb since this potentially has something to do with enum optimizations? |
Minimized:
#![feature(never_type, core_intrinsics)]
pub enum E1 {
    V1 { f: bool },
    V2 { f: ! },
    V3,
    V4,
}
fn main() {
    match (E1::V1 { f: true }) {
        E1::V2 { .. } => unsafe { ::core::intrinsics::unreachable() },
        _ => {},
    }
}
MIR:
// WARNING: This output format is intended for human consumers only
// and is subject to change without notice. Knock yourself out.
fn main() -> () {
let mut _0: (); // return place in scope 0 at src/main.rs:10:11: 10:11
let mut _1: E1; // in scope 0 at src/main.rs:11:11: 11:31
let mut _2: isize; // in scope 0 at src/main.rs:12:9: 12:22
scope 1 {
}
bb0: {
StorageLive(_1); // bb0[0]: scope 0 at src/main.rs:11:11: 11:31
((_1 as V1).0: bool) = const true; // bb0[1]: scope 0 at src/main.rs:11:11: 11:31
// ty::Const
// + ty: bool
// + val: Scalar(0x01)
// mir::Constant
// + span: src/main.rs:11:24: 11:28
// + ty: bool
// + literal: Const { ty: bool, val: Scalar(0x01) }
discriminant(_1) = 0; // bb0[2]: scope 0 at src/main.rs:11:11: 11:31
_2 = discriminant(_1); // bb0[3]: scope 0 at src/main.rs:12:9: 12:22
switchInt(move _2) -> [1isize: bb1, otherwise: bb2]; // bb0[4]: scope 0 at src/main.rs:12:9: 12:22
}
bb1: {
const std::intrinsics::unreachable(); // bb1[0]: scope 1 at src/main.rs:12:35: 12:68
// ty::Const
// + ty: unsafe extern "rust-intrinsic" fn() -> ! {std::intrinsics::unreachable}
// + val: Scalar(<ZST>)
// mir::Constant
// + span: src/main.rs:12:35: 12:66
// + ty: unsafe extern "rust-intrinsic" fn() -> ! {std::intrinsics::unreachable}
// + literal: Const { ty: unsafe extern "rust-intrinsic" fn() -> ! {std::intrinsics::unreachable}, val: Scalar(<ZST>) }
}
bb2: {
StorageDead(_1); // bb2[0]: scope 0 at src/main.rs:15:1: 15:2
return; // bb2[1]: scope 0 at src/main.rs:15:2: 15:2
}
}
Debug ASM (does not reproduce with release profile):
```asm
std::rt::lang_start: # @std::rt::lang_start
# %bb.0:
subq $56, %rsp
leaq .L__unnamed_1(%rip), %rax
movq %rdi, 24(%rsp)
movq %rsi, 32(%rsp)
movq %rdx, 40(%rsp)
movq 24(%rsp), %rdx
movq %rdx, 48(%rsp)
leaq 48(%rsp), %rdx
movq 32(%rsp), %rsi
movq 40(%rsp), %rcx
movq %rdx, %rdi
movq %rsi, 16(%rsp) # 8-byte Spill
movq %rax, %rsi
movq 16(%rsp), %rdx # 8-byte Reload
callq *std::rt::lang_start_internal@GOTPCREL(%rip)
movq %rax, 8(%rsp) # 8-byte Spill
# %bb.1:
movq 8(%rsp), %rax # 8-byte Reload
addq $56, %rsp
retq
# -- End function
std::rt::lang_start::{{closure}}: # @"std::rt::lang_start::{{closure}}" %bb.0:
%bb.1:
%bb.2:
std::sys::unix::process::process_common::ExitCode::as_i32: # @std::sys::unix::process::process_common::ExitCode::as_i32 %bb.0:
core::ops::function::FnOnce::call_once{{vtable.shim}}: # @"core::ops::function::FnOnce::call_once{{vtable.shim}}" %bb.0:
%bb.1:
core::ops::function::FnOnce::call_once: # @core::ops::function::FnOnce::call_once %bb.0:
.LBB4_1: .LBB4_2: .LBB4_3: .LBB4_4: core::ptr::real_drop_in_place: # @core::ptr::real_drop_in_place %bb.0:
<() as std::process::Termination>::report: # @"<() as std::process::Termination>::report" %bb.0:
%bb.1:
<std::process::ExitCode as std::process::Termination>::report: # @"<std::process::ExitCode as std::process::Termination>::report" %bb.0:
%bb.1:
playground::main: # @playground::main %bb.0:
%bb.1:
.LBB8_2: main: # @main %bb.0:
.L__unnamed_1: rustc_debug_gdb_scripts_section:
```
|
Regression between 1.23.0 and 1.24.0 |
This seems like a rather serious soundness hole that I would suggest treating as P-high. cc @rust-lang/compiler |
Given
#![feature(never_type, core_intrinsics)]
extern crate core;
pub enum E1 {
    V1 { f: bool },
    V2 { f: ! },
    V3,
    V4,
}
fn main() {
    match (E1::V1 { f: true }) {
        E1::V2 { .. } => unsafe { ::core::intrinsics::unreachable() },
        _ => {},
    }
}
LLVM for the interesting code is:
; test::main
; Function Attrs: nonlazybind uwtable
define internal void @_ZN4test4main17h4c216eb98fecdd36E() unnamed_addr #0 {
start:
%_1 = alloca i8, align 1
store i8 1, i8* %_1, align 1
%0 = load i8, i8* %_1, align 1, !range !3 ; %0 = 1
%1 = sub i8 %0, 0 ; %1 = 1
%2 = icmp ule i8 %1, 3 ; %2 = 1 <= 3 = 1
%3 = zext i8 %1 to i64 ; %3 = 1
%4 = select i1 %2, i64 %3, i64 0 ; %4 = %3 = 1 (since %2 is true)
%5 = icmp eq i64 %4, 1 ; %5 = 1
br i1 %5, label %bb1, label %bb2 ; -> bb1
bb1: ; preds = %start
unreachable
bb2: ; preds = %start
ret void
}
!3 = !{i8 0, i8 4} |
This is probably not A-LLVM because the generated LLVM IR is already wrong. |
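To follow the IR above by hand, here is a rough Rust transliteration of its `start` block. The function name `decode_variant` and the choice of `u8`/`u64` are made up for this sketch; only the constants (the subtracted `0`, the bound `3`, the fallback `0`, and the comparison against `1`) are taken from the IR.

```rust
/// Mirrors the IR steps `sub`, `icmp ule`, `zext`, `select`.
/// `tag` stands for the byte loaded from `%_1`.
fn decode_variant(tag: u8) -> u64 {
    let rebased = tag.wrapping_sub(0); // %1 = sub i8 %0, 0
    let in_range = rebased <= 3; // %2 = icmp ule i8 %1, 3
    let widened = u64::from(rebased); // %3 = zext i8 %1 to i64
    if in_range { widened } else { 0 } // %4 = select i1 %2, i64 %3, i64 0
}

fn main() {
    // `E1::V1 { f: true }` is stored as the byte 1 (`store i8 1`).
    let variant = decode_variant(1);
    // `%5 = icmp eq i64 %4, 1` picks bb1, the `unreachable` block.
    println!("decoded index: {}, branches to bb1: {}", variant, variant == 1);
}
```

With the tag byte `1`, the range test passes, the select keeps the rebased value, and the comparison against `1` succeeds, so control reaches `bb1` even though the value was built as `V1`.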
MIR still looks okay to me, so this seems like a MIR -> LLVM-IR translation bug. |
Much better, thanks!
EDIT: yeah the subtraction should be with rust/src/librustc_codegen_ssa/mir/place.rs Lines 265 to 266 in 1cbd8a4
|
"More minimized" without match and unstable features:
Interesting enough: When one removes the
to
( |
triage: P-high. Leaving nominated and unassigned for now. |
I compared what miri does (which doesn't fail) against what codegen does. The relevant difference is rust/src/librustc_mir/interpret/operand.rs Line 652 in 38cd948
vs rust/src/librustc_codegen_ssa/mir/place.rs Line 269 in 38cd948
where the latter (codegen) does not check whether Other than that, both code paths are equivalent
I'd guess that's just a result from other niche optimizations kicking in for the enum without |
I think I've figured out what's wrong here, why this wasn't triggered more often, and why What we have is an
I was confused because initially I thought The reason this hasn't caused many problems in the past is that
Currently, the check is:
`discr - niche_start + niche_variants.start() <= niche_variants.end()`
But the correct one-comparison trick, which I meant to use, would be:
`discr - niche_start <= niche_variants.end() - niche_variants.start()`
That is, you have to rely on rebasing the lowest value in the range to
What miri does is less efficient, but that's fine, as it's an interpreter. For codegen we want something simple. |
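A small sketch of the two checks side by side, modelling the tag as a `u8` with wrapping arithmetic. The function names and the concrete values `niche_start = 2` and `niche_variants = 2..=3` are assumptions, picked only because they are consistent with the constants `0` and `3` in the LLVM IR earlier in the thread; they are not read out of the compiler.

```rust
use std::ops::RangeInclusive;

/// The check as quoted under "Currently" (the buggy form).
fn current_check(discr: u8, niche_start: u8, niche_variants: &RangeInclusive<u8>) -> bool {
    discr
        .wrapping_sub(niche_start)
        .wrapping_add(*niche_variants.start())
        <= *niche_variants.end()
}

/// The intended one-comparison range check.
/// Assumes `niche_variants.start() <= niche_variants.end()`.
fn intended_check(discr: u8, niche_start: u8, niche_variants: &RangeInclusive<u8>) -> bool {
    discr.wrapping_sub(niche_start) <= niche_variants.end() - niche_variants.start()
}

fn main() {
    // Assumed layout for the example enum: tags 0 and 1 are the `bool`
    // payload of `V1`, tags 2 and 3 encode `V3` and `V4`.
    let niche_start = 2u8;
    let niche_variants = 2u8..=3u8;
    for discr in 0u8..=4 {
        println!(
            "tag {}: current says niche = {}, intended says niche = {}",
            discr,
            current_check(discr, niche_start, &niche_variants),
            intended_check(discr, niche_start, &niche_variants),
        );
    }
}
```

Under these assumed values the current check also accepts tags `0` and `1`, which are the dataful variant's own `bool` values, while the intended check accepts only `2` and `3`.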
What Miri does is correct though? Including considering both |
@RalfJung And it would've worked, too, if I hadn't combined a different operation into it, ruining the trick. Oh and miri should probably be using |
I tried, it doesn't really make things nicer and I am not even sure if moving the assertions around is correct: RalfJung@279adf8. Probably it's not. The EDIT: Also that arithmetic is done at type |
@RalfJung Hmm, the arithmetic probably needs to be done on the size of the discriminant, like here: rust/src/librustc_target/abi/mod.rs Lines 664 to 669 in bdd4bda
Double-casting assertions can, and probably should, be replaced nowadays by |
I am not so convinced... seems like the This avoids casting too early, but it's not really any safer than what we do right now: RalfJung@f311c4e84d |
triage: assigning to @eddyb |
I was going to argue that the bug is older, but 1.23 was early 2018, and the bug was introduced late 2017, as the code in question has been wrong in the same way since #45225. |
Noted that the issue appears under control at the T-compiler meeting, and deliberately skipped discussion of it. Removing nomination label. |
rustc_codegen_ssa: fix range check in codegen_get_discr.
Fixes #61696, see #61696 (comment) for more details.
In short, I had wanted to use `x - a <= b - a` to check whether `x` is in `a..=b` (as it's 1 comparison instead of 2 *and* `b - a` is guaranteed to fit in the same data type, while `b` itself might not), but I ended up with `x - a + c <= b - a + c` instead, because `x - a + c` was the final value needed. That latter comparison is equivalent to checking that `x` is in `(a - c)..=b`, i.e. it also includes `(a - c)..a`, not just `a..=b`, so if `c` is not `0`, it will cause false positives.
This presented itself as the non-niche ("dataful") variant sometimes being treated like a niche variant, in the presence of uninhabited variants (which made `c`, aka the index of the first niche variant, arbitrarily large).
r? @nagisa, @rkruppe or @oli-obk
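The claim about false positives in `(a - c)..a` can be checked by brute force over a small integer type. The following is only an illustration on `u8`; the helper names `in_range`, `shifted`, and `false_positives` and the sample values `a = 10`, `b = 20`, `c = 4` are invented for this sketch and do not appear in the compiler.

```rust
/// One-comparison test for `x` in `a..=b`; requires `a <= b`.
fn in_range(x: u8, a: u8, b: u8) -> bool {
    x.wrapping_sub(a) <= b - a
}

/// The accidental form, with `c` added on both sides.
fn shifted(x: u8, a: u8, b: u8, c: u8) -> bool {
    x.wrapping_sub(a).wrapping_add(c) <= (b - a).wrapping_add(c)
}

/// Every `x` that `shifted` accepts although it is not in `a..=b`.
fn false_positives(a: u8, b: u8, c: u8) -> Vec<u8> {
    (0..=u8::MAX)
        .filter(|&x| shifted(x, a, b, c) && !(a..=b).contains(&x))
        .collect()
}

fn main() {
    // The one-comparison form agrees with the plain range test for every x.
    assert!((0..=u8::MAX).all(|x| in_range(x, 10, 20) == (10..=20).contains(&x)));
    // With c = 0 the two forms coincide, so nothing is wrongly accepted.
    println!("c = 0: {:?}", false_positives(10, 20, 0));
    // With c = 4 the values in (a - c)..a are wrongly accepted.
    println!("c = 4: {:?}", false_positives(10, 20, 4));
}
```

For `c = 4` the wrongly accepted values are exactly `6..10`, matching the `(a - c)..a` range described in the text.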
Playground link: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=7a8a62d7a736d6bd54273422da29d7af
Expected output:
success
Actual output:
Running the above code on both Rust stable and nightly on my machine returns
illegal hardware instruction (core dumped)
with some number different on each run prefixing the output.
Some experimentations:
C solves this issue.
A1 and A2 solves this issue.
B other than B5 works.
I assume this is a bug in the compiler?