Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Emit noundef LLVM attribute #74378

Open
nikic opened this issue Jul 15, 2020 · 6 comments
Open

Emit noundef LLVM attribute #74378

nikic opened this issue Jul 15, 2020 · 6 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@nikic
Copy link
Contributor

nikic commented Jul 15, 2020

LLVM 11 introduces a new noundef attribute, with the following semantics:

This attribute applies to parameters and return values. If the value representation contains any undefined or poison bits, the behavior is undefined. Note that this does not refer to padding introduced by the type’s storage representation.

In LLVM 11 itself it doesn't do anything yet, but this will become important in the future to reduce the impact of freeze instructions.

We need to figure out for which parameters / return values we can emit this attribute. We generally can't do so if any bits are unspecified, e.g. due to padding. More problematic for Rust is rust-lang/unsafe-code-guidelines#71, i.e. the question of whether integers are allowed to contain uninitialized bits without going through something like MaybeUninit.

If we go with aggressive emission of noundef, we probably need to punish safe-guard mem::uninitialized() users with liberal application of freeze.

cc @RalfJung

@nikic nikic added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 15, 2020
@RalfJung
Copy link
Member

We generally can't do so if any bits are unspecified, e.g. due to padding.

The bits you quote explicitly say

Note that this does not refer to padding introduced by the type’s storage representation.

So it seems to me types with padding are fine?

We should definitely be able to already add this for types like bool or char, for references, and for structs consisting only of such types.

Cc @rust-lang/wg-unsafe-code-guidelines

If we go with aggressive emission of noundef, we probably need to punish safe-guard mem::uninitialized() users with liberal application of freeze.

To be clear, adding noundef there would not be wrong (depending on the type used), we just might want to take it easy because there is a lot of broken code out there using mem::uninitialized the wrong way.

@hanna-kruppe
Copy link
Contributor

This attribute only refers to values, not memory pointed to by pointer arguments, right? I don't think there are many cases where we pass any aggregates by value in LLVM IR, except:

  • ScalarPair types are returned as anonymous 2-member structs, e.g. fn foo() -> (u8, u32) will return {i8, i32} at LLVM level. This should be covered by the quoted exemption for padding introduced by the type.
  • On some targets, an ABI lowering like "pass this big aggregate by value, on the stack" uses argument type [N x iM] (for M, N such that the size matches). I am pretty sure we can't use noundef in those cases (if the original type has padding) since it's not visible to LLVM that some bytes of this aggregate are padding at Rust level.

@nikic
Copy link
Contributor Author

nikic commented Jul 16, 2020

We generally can't do so if any bits are unspecified, e.g. due to padding.

The bits you quote explicitly say

Note that this does not refer to padding introduced by the type’s storage representation.

So it seems to me types with padding are fine?

Sorry for being unclear on this point. Whether padding is fine depends on how the value is passed. E.g. if we have a boolean and pass it as i1, then using noundef is fine. If we have a boolean and pass it as i8 with the top seven bits unspecified, then using noundef is not possible. The same applies to aggregates -- whether we pass it as an actual aggregate, or reinterpreted as a type that makes padding accessible.

This attribute only refers to values, not memory pointed to by pointer arguments, right?

That's correct. On a pointer type noundef determines whether the pointer itself may be undef/poison, not the pointed-to memory.

@bjorn3
Copy link
Member

bjorn3 commented Feb 12, 2022

Booleans are always passed as i8 with zero extension rather than bit casting, so all bits are specified. This applies both for arguments and memory accesses. Only local variables can be i1.

bors added a commit to rust-lang-ci/rust that referenced this issue Feb 13, 2022
Apply noundef attribute to &T, &mut T, Box<T>, bool

This doesn't handle `char` because it's a bit awkward to distinguish it from `u32` at this point in codegen.

Note that this _does not_ change whether or not it is UB for `&`, `&mut`, or `Box` to point to undef. It only applies to the pointer itself, not the pointed-to memory.

Fixes (partially) rust-lang#74378.

r? `@nikic` cc `@RalfJung`
@nikomatsakis
Copy link
Contributor

When generating LLVM types from Rust types, do we ever emit padding explicitly or something like that? I could imagine us "taking charge" of some aspects of enum lowering in this way.

@bjorn3
Copy link
Member

bjorn3 commented Feb 14, 2022

When generating LLVM types from Rust types, do we ever emit padding explicitly or something like that?

Yes. https://rust.godbolt.org/z/xd4er1hTY

#[repr(C)]
pub struct Foo {
    a: u8,
    // 1 byte padding
    b: u16,
    // 4 bytes padding
    c: u64,
}

pub fn foo(_: Foo) {}
%Foo = type { i8, [1 x i8], i16, [2 x i16], i64 }
[...]

bors added a commit to rust-lang-ci/rust that referenced this issue Feb 27, 2022
Apply noundef attribute to all scalar types which do not permit raw init

Beyond `&`/`&mut`/`Box`, this covers `char`, enum discriminants, `NonZero*`, etc.
All such types currently cause a Miri error if left uninitialized,
and an `invalid_value` lint in cases like `mem::uninitialized::<char>()`.

Note that this _does not_ change whether or not it is UB for `u64` (or
other integer types with no invalid values) to be undef.

Fixes (partially) rust-lang#74378.

r? `@ghost` (blocked on rust-lang#94127)

`@rustbot` label S-blocked
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 28, 2022
…ikic

Apply noundef metadata to loads of types that do not permit raw init

This matches the noundef attributes we apply on arguments/return types.

Fixes (partially) rust-lang#74378.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

5 participants