Emit noundef LLVM attribute #74378

nikic · 2020-07-15T19:03:04Z

LLVM 11 introduces a new noundef attribute, with the following semantics:

This attribute applies to parameters and return values. If the value representation contains any undefined or poison bits, the behavior is undefined. Note that this does not refer to padding introduced by the type’s storage representation.

In LLVM 11 itself it doesn't do anything yet, but this will become important in the future to reduce the impact of freeze instructions.

We need to figure out for which parameters / return values we can emit this attribute. We generally can't do so if any bits are unspecified, e.g. due to padding. More problematic for Rust is rust-lang/unsafe-code-guidelines#71, i.e. the question of whether integers are allowed to contain uninitialized bits without going through something like MaybeUninit.

If we go with aggressive emission of noundef, we probably need to ~~punish~~ safe-guard mem::uninitialized() users with liberal application of freeze.

cc @RalfJung

The text was updated successfully, but these errors were encountered:

RalfJung · 2020-07-16T09:35:58Z

We generally can't do so if any bits are unspecified, e.g. due to padding.

The bits you quote explicitly say

Note that this does not refer to padding introduced by the type’s storage representation.

So it seems to me types with padding are fine?

We should definitely be able to already add this for types like bool or char, for references, and for structs consisting only of such types.

Cc @rust-lang/wg-unsafe-code-guidelines

If we go with aggressive emission of noundef, we probably need to punish safe-guard mem::uninitialized() users with liberal application of freeze.

To be clear, adding noundef there would not be wrong (depending on the type used), we just might want to take it easy because there is a lot of broken code out there using mem::uninitialized the wrong way.

hanna-kruppe · 2020-07-16T11:51:06Z

This attribute only refers to values, not memory pointed to by pointer arguments, right? I don't think there are many cases where we pass any aggregates by value in LLVM IR, except:

ScalarPair types are returned as anonymous 2-member structs, e.g. fn foo() -> (u8, u32) will return {i8, i32} at LLVM level. This should be covered by the quoted exemption for padding introduced by the type.
On some targets, an ABI lowering like "pass this big aggregate by value, on the stack" uses argument type [N x iM] (for M, N such that the size matches). I am pretty sure we can't use noundef in those cases (if the original type has padding) since it's not visible to LLVM that some bytes of this aggregate are padding at Rust level.

nikic · 2020-07-16T12:01:49Z

We generally can't do so if any bits are unspecified, e.g. due to padding.

The bits you quote explicitly say

Note that this does not refer to padding introduced by the type’s storage representation.

So it seems to me types with padding are fine?

Sorry for being unclear on this point. Whether padding is fine depends on how the value is passed. E.g. if we have a boolean and pass it as i1, then using noundef is fine. If we have a boolean and pass it as i8 with the top seven bits unspecified, then using noundef is not possible. The same applies to aggregates -- whether we pass it as an actual aggregate, or reinterpreted as a type that makes padding accessible.

This attribute only refers to values, not memory pointed to by pointer arguments, right?

That's correct. On a pointer type noundef determines whether the pointer itself may be undef/poison, not the pointed-to memory.

bjorn3 · 2022-02-12T18:27:11Z

Booleans are always passed as i8 with zero extension rather than bit casting, so all bits are specified. This applies both for arguments and memory accesses. Only local variables can be i1.

Apply noundef attribute to &T, &mut T, Box<T>, bool This doesn't handle `char` because it's a bit awkward to distinguish it from `u32` at this point in codegen. Note that this _does not_ change whether or not it is UB for `&`, `&mut`, or `Box` to point to undef. It only applies to the pointer itself, not the pointed-to memory. Fixes (partially) rust-lang#74378. r? `@nikic` cc `@RalfJung`

nikomatsakis · 2022-02-14T15:00:58Z

When generating LLVM types from Rust types, do we ever emit padding explicitly or something like that? I could imagine us "taking charge" of some aspects of enum lowering in this way.

bjorn3 · 2022-02-14T16:04:09Z

When generating LLVM types from Rust types, do we ever emit padding explicitly or something like that?

Yes. https://rust.godbolt.org/z/xd4er1hTY

#[repr(C)]
pub struct Foo {
    a: u8,
    // 1 byte padding
    b: u16,
    // 4 bytes padding
    c: u64,
}

pub fn foo(_: Foo) {}

%Foo = type { i8, [1 x i8], i16, [2 x i16], i64 }
[...]

Apply noundef attribute to all scalar types which do not permit raw init Beyond `&`/`&mut`/`Box`, this covers `char`, enum discriminants, `NonZero*`, etc. All such types currently cause a Miri error if left uninitialized, and an `invalid_value` lint in cases like `mem::uninitialized::<char>()`. Note that this _does not_ change whether or not it is UB for `u64` (or other integer types with no invalid values) to be undef. Fixes (partially) rust-lang#74378. r? `@ghost` (blocked on rust-lang#94127) `@rustbot` label S-blocked

…ikic Apply noundef metadata to loads of types that do not permit raw init This matches the noundef attributes we apply on arguments/return types. Fixes (partially) rust-lang#74378.

nikic added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 15, 2020

RalfJung mentioned this issue Jul 16, 2020

Validity of integers and floating point rust-lang/unsafe-code-guidelines#71

Closed

erikdesjardins mentioned this issue Feb 5, 2022

Apply noundef attribute to &T, &mut T, Box<T>, bool #93670

Merged

This was referenced Feb 19, 2022

Apply noundef attribute to all scalar types which do not permit raw init #94157

Merged

Apply noundef metadata to loads of types that do not permit raw init #94158

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Emit noundef LLVM attribute #74378

Emit noundef LLVM attribute #74378

nikic commented Jul 15, 2020

RalfJung commented Jul 16, 2020

hanna-kruppe commented Jul 16, 2020

nikic commented Jul 16, 2020 •

edited

Loading

bjorn3 commented Feb 12, 2022

nikomatsakis commented Feb 14, 2022

bjorn3 commented Feb 14, 2022 •

edited

Loading

Emit noundef LLVM attribute #74378

Emit noundef LLVM attribute #74378

Comments

nikic commented Jul 15, 2020

RalfJung commented Jul 16, 2020

hanna-kruppe commented Jul 16, 2020

nikic commented Jul 16, 2020 • edited Loading

bjorn3 commented Feb 12, 2022

nikomatsakis commented Feb 14, 2022

bjorn3 commented Feb 14, 2022 • edited Loading

nikic commented Jul 16, 2020 •

edited

Loading

bjorn3 commented Feb 14, 2022 •

edited

Loading