Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unsafe fields #3458

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
137 changes: 137 additions & 0 deletions text/0000-unsafe-fields.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,137 @@
- Feature Name: `unsafe_fields`
Copy link
Member

@scottmcm scottmcm Jul 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The appears to be missing the https://github.com/rust-lang/rfcs/blob/master/0000-template.md#rationale-and-alternatives section. Please add one, with a bunch of subsections for various "here's why I picked this way over some other way" decisions you made.

For example, that's a great place to talk about unsafe structs vs unsafe fields.

- Start Date: 2023-07-13
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Summary

Fields may be declared `unsafe`. Unsafe fields may only be mutated (excluding interior mutability)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why exclude interior mutability? That should be explained in the RFC.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's addressed in the linked RFC that explains what is considered a mutable use.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how that RFC shows why specifically for unsafe fields we should ignore interior mutability. If I have an invariant involving a Cell<i32>, surely calling set on it should require an unsafe block?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If interior mutability were included,vit would require that any reference (as in immutable reference) to a !Freeze type be considered unsafe, as it is impossible to know from that alone whether there is a mutation occurring. Surely that isn't desirable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If interior mutability were included,vit would require that any reference (as in immutable reference) to a !Freeze type be considered unsafe

Indeed that is what I would expect. It seems less undesirable than having field.set(...) violate an invariant.

Copy link

@RustyYato RustyYato Jul 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!Freeze isn't sufficient because Box<Cell<T>>: Freeze. This would require creating any reference to unsafe fields to be considered unsafe. Only a move/copy would be safe.

(Or a new trait to describe Immutable types. Which would make adding interior immutability a breaking change)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!Freeze isn't sufficient because Box<Cell>: Freeze. This would require creating any reference to unsafe fields to be considered unsafe. Only a move/copy would be safe.

Ah, good point. Not entirely sure how I feel about that.

Still at the very least the RFC should discuss interior mutability and mention the alternative of also attempting to prevent mutation of interior mutable unsafe fields.

(It already is a breaking change, as interior mutability matters for consts, statics, implicit promotion.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose it's not that big of a deal to rip the text from the previous RFC, adapting slightly as appropriate. I can elaborate on interior mutability specifically.

or initialized in an unsafe context. Reading the value of an unsafe field may occur in either safe
or unsafe contexts. An unsafe field may be relied upon as a safety invariant in other unsafe code.

# Motivation

Emphasis on safety is a key strength of Rust. A major point here is that any code path that can
result in undefined behavior must be explicitly marked with the `unsafe` keyword. However, the
current system is insufficient. While Rust provides the `unsafe` keyword at the function level,
there is currently no mechanism to mark fields as `unsafe`.

For a real-world example, consider the `Vec` type in the standard library. It has a `len` field that
is used to store the number of elements present. Setting this field is exposed publicly in the
`Vec::set_len` method, which has safety requirements:

- `new_len` must be less than or equal to `capacity()`.
- The elements at `old_len..new_len` must be initialized.

This field is safe to read, but unsafe to mutate or initialize due to the invariants. These
invariants cannot be expressed in the type system, so they must be enforced manually. Failure to do
so may result in undefined behavior elsewhere in `Vec`.

By introducing unsafe fields, Rust can improve the situation where a field that is otherwise safe is
used as a safety invariant.

# Guide-level explanation

Fields may be declared `unsafe`. Unsafe fields may only be initialized or accessed mutably in an
unsafe context. Reading the value of an unsafe field may occur in either safe or unsafe contexts. An
unsafe field may be relied upon as a safety invariant in other unsafe code.

Here is an example to illustrate usage:

```rust
struct Foo {
safe_field: u32,
/// Safety: Value must be an odd number.
unsafe unsafe_field: u32,
}

// Unsafe field initialization requires an `unsafe` block.
// Safety: `unsafe_field` is odd.
let mut foo = unsafe {
Copy link

@juntyr juntyr Jul 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not sure I like that the entire struct expression is now inside an unsafe block (though I’m not sure what a better syntax would be). If the safety invariant requires the entire struct, this makes sense. However, if it is more specific, a larger unsafe block is too broad as it also allows unsafe code usage to initialise the safe fields, which should get their own unsafe blocks. Though perhaps this could just be linted against, e.g. don’t use unsafe expressions in a struct initialiser without putting them inside nested unsafe blocks.

Copy link
Member Author

@jhpratt jhpratt Jul 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alternative is to do this:

let mut foo = Foo {
    safe_field: 0,
    unsafe_field: unsafe { 1 },
};

That feels worse to me.

Presumably the safety invariant being promised is an invariant within the struct, even if it is not strictly required that that be the case.

Ideally we'd also have partial initialization, such that it would be possible to do

let mut foo = Foo { safe_field: 0 };
unsafe { foo.unsafe_field = 1; }

but I expect that's quite a bit farther away.

Copy link

@juntyr juntyr Jul 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that just unsafe in one field initialiser expression is insufficient as it means something very different. I do like however that it clearly shows which field the unsafety applies to.

It reminds me a bit of unsafe blocks in unsafe functions (#2585). Ideally, the fact that struct init is unsafe would not allow unsafe field init expressions. I doubt that special-casing struct inits to not propagate outer unsafe blocks would be backwards compatible, so a new lint would be the only avenue in this direction.

Some new syntax like this

let mut foo = unsafe Foo {
    safe_field: 0,
    unsafe unsafe_field: 1,
};

would communicate intent better but looks quite unnatural to me.

Perhaps in the future there might be explicit safe blocks to reassert a safe context within an unsafe block, so that it could be written as follows:

let mut foo = unsafe {
    Foo {
        safe_field: safe { /* some more complex expr here */ },
        unsafe_field: safe { /* some more complex expr here */ },
    }
};

which could be encouraged with clippy lints. Though for now just moving safe non-trivial struct field initialisers into variables that are initialised outside the unsafe block.

Overall, I think going with the original syntax of

let mut foo = unsafe {
    Foo {
        safe_field: 2,
        unsafe_field: 1,
    }
};

looks like a good and intuitive solution, though I still think that a lint against using unsafe expressions inside the struct expression without a nested unsafe block would still help.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this?

let mut foo = unsafe Foo {
    safe_field: 0,
    unsafe_field: 1,
};

unsafe would be followed immediately by a struct expression, and doing so would only indicate that unsafe fields may be initialized. It would not introduce an unsafe context.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it - the syntax is still close enough to the unsafe block syntax (except when initialising tuple structs) that it's relatively intuitive what the unsafety applies to, but coupled strongly to the struct name so that it also makes sense why no unsafe context is introduced.

Copy link
Member

@scottmcm scottmcm Jul 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's something nice about

let mut foo = unsafe Foo {
    safe_field: 0,
    unsafe_field: 1,
};

but to me that would go with unsafe struct Foo, like how unsafe trait Bar leads to unsafe impl Bar.

Spitballing: if per-field safety is really needed, then maybe

let mut foo = Foo {
    safe_field: 0,
    unsafe unsafe_field: 1,
};

though maybe the answer is the same as for unsafe { unsafe_fn_call(complicated_expr) }: if you don't want the expr in the block, use a let.

After all, there's always field shorthand available, so you can do

let safe_field = unsafe { complex to construct };
let unsafe_field = easy and safe to construct;
unsafe { Foo { safe_field, unsafe_field } }

Perhaps the more important factor would be the workflow for the errors the programmer gets when changing a (internal and thus not semver break) field to unsafe. Anything at the struct level wouldn't give new "hey, you need unsafe here" if you already had another field that was unsafe.

Of course, if the safety is at the struct level (not the field level) then that doesn't come up.


Hmm, the "you added an additional requirement to something that's already in an unsafe block and there's no way to help you make sure you handed that" is a pre-existing problem that needs tooling for that everywhere, so maybe it's not something this RFC needs to think about.

If we had unsafe(aligned(a), nonzero(b)) { ... } so that tooling could help ensure that people thought about everything declared in the safety section, then we'd just have unsafe(initialized(ptr, len)) { ... } so that it acknowledged the discharge of the obligation for the type invariant, and the "I need 100 unsafe blocks" problem goes away.

Foo {
safe_field: 0,
unsafe_field: 1,
}
};

// Safe field: no unsafe block.
foo.safe_field = 1;

// Unsafe field with mutation: unsafe block is required.
// Safety: The value is odd.
unsafe { foo.unsafe_field = 3; }

// Unsafe field without mutation: no unsafe block.
println!("{}", foo.unsafe_field);
```

For a full description of where a mutable access is considered to have occurred (and why), see
[RFC 3323]. Keep in mind that due to reborrowing, a mutable access of an unsafe field is not
necessarily explicit.

[RFC 3323]: https://rust-lang.github.io/rfcs/3323-restrictions.html#where-does-a-mutation-occur

```rust
fn change_unsafe_field(foo: &mut Foo) {
// Safety: An odd number plus two remains an odd number.
unsafe { foo.unsafe_field += 2; }
Copy link
Member

@scottmcm scottmcm Jul 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: include a section talking about how both of the following are reasonable types:

/// This type has a *correctness* invariant that it holds an even number,
/// but it's not something on which `unsafe` code is allowed to rely.
pub struct Even(u32);

impl Even {
    pub fn new(x: u32) -> Option<Self>;
    pub fn new_i_promise(x: u32) -> Self;
    pub fn get_value(&self) -> u32;
}
/// # Safety
///
/// The value of this type must always be an even number.
/// (`unsafe` code is allowed to rely on that fact.)
pub unsafe struct Even(u32);

impl Even {
    pub fn new(x: u32) -> Option<Self>;
    pub unsafe fn new_unchecked(x: u32) -> Self;
    pub fn get_value(&self) -> u32;
}

and it's up to the type author to decide which is appropriate for the expected uses of the type.

}
```

# Reference-level explanation

## Syntax

Using the syntax from [the reference for structs][struct syntax], the change needed to support
unsafe fields is minimal.

[struct syntax]: https://doc.rust-lang.org/stable/reference/items/structs.html#structs

```diff
StructField :
OuterAttribute*
Visibility?
+ unsafe?
IDENTIFIER : Type

TupleField :
OuterAttribute*
Visibility?
+ unsafe?
Type
```

## Behavior

An unsafe field may only be mutated or initialized in an unsafe context. Failure to do so is a compile error.

## "Mutable use" in the compiler

The concept of a "mutable use" [already exists][mutating use] within the compiler. This catches all
situations that are relevant here, including `ptr::addr_of_mut!`, `&mut`, and direct assignment to a
Copy link
Contributor

@Jules-Bertholet Jules-Bertholet Jul 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One could argue that ptr::addr_of_mut! on an unsafe field need not be unsafe, because writes through the pointer are unsafe.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider this in conjunction with the examples present in the RFC.

fn make_zero(p: *mut u8) {
    unsafe { *p = 0; }
}

let p = ptr::addr_of_mut!(foo.unsafe_field);
make_zero(p);

Ignoring thread-safety, which can be easily achieved with locks, no single step appears wrong. make_zero does not do anything wrong — assigning zero to an arbitrary *mut u8 is fine. Passing a pointer to the method is naturally okay. Yet it still results in unsoundness, as foo.unsafe_field must be odd.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"assigning zero to an arbitrary *mut u8 is fine" what‽ No it is not fine‽ An arbitrary *mut u8 could be null, dangling, aliased... fn make_zero is unsound.

Copy link
Member Author

@jhpratt jhpratt Jul 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True — I typed that far too quickly and without thinking. Regardless, it's not immediately obvious to me that ptr::addr_of_mut! should be allowed safely.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not fully convinced either way, but I fear it would just be confusing to require unsafe for an operation that can't lead to unsoundness, especially as addr_of!(struct.field) as *mut _ would do the same thing with no unsafe.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

especially as addr_of!(struct.field) as *mut _ would do the same thing

While you can do that, it's undefined behavior to actually mutate the resulting mut pointer. I've just confirmed this with miri.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's UB under Stacked Borrows, but MIRIFLAGS=-Zmiri-tree-borrows accepts it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's surprising. I'm not familiar with tree borrows, admittedly.

Copy link

@CodesInChaos CodesInChaos Aug 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ptr::from_mut(&mut struct).wrapping_offset(offset_of!(Struct, field)) should achieve the same thing as addr_of_mut!, and I doubt you'd want to make offset_of unsafe for unsafe fields. And even without offset_of, you could use ptr::from_mut(&mut struct).wrapping_offset(addr_of!(stuct.Field) - ptr::from_ref(&struct)). Both of these should be safe in every borrowing model.

field, while excluding interior mutability. As such, formal semantics of what constitutes a "mutable
use" are not stated here.

[mutating use]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/visit/enum.PlaceContext.html#method.is_mutating_use

# Drawbacks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add something about the proliferation of unsafe use cases, since a lot of programs that currently #![deny(unsafe)] without exception will likely need to dial it back.

This isn't necessarily a bad thing, but it does seem like it starts uplifting unsafe to a more general-user usecase (as opposed to mostly users reasoning about hardware interaction).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a crate uses no unsafe, then it has no need for unsafe fields.

Copy link
Contributor

@tgross35 tgross35 Aug 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposal is guided toward more broadly preventing logic errors rather than strictly what is now known as UB via unsafe, no? One of the examples linked above was keeping flags in sync with fields for a time parser, which I don't think was tied to any existing unsafe calls.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the examples linked above was keeping flags in sync with fields for a time parser

The flags field in that example serves as a bitpacked equivalent of the discriminant for Option. Those flags are used to determine whether it's safe to read a MaybeUninit value. It's 100% tied to memory safety.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I stand corrected on the flags case - but the RFC example is using these fields to ensure that a value is odd with no other unsafe interaction. If non-UB cases aren't a target of the RFC, could you update the example to show something where not upholding unsafe field invariants would lead to UB?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair. The example of an even number isn't the best, as it's contrived. There's an unstated assumption that it's being relied upon as a safety invariant somewhere else in the program. I can definitely improve that example.


- Additional syntax for macros to handle
- More syntax to learn

# Prior art

Some items in the Rust standard library have `#[rustc_layout_scalar_valid_range_start]`,
`#[rustc_layout_scalar_valid_range_end]`, or both. These items have identical behavior to that of
unsafe fields described here. It is likely (though not required by this RFC) that these items will
be required to use unsafe fields, which would reduce special-casing of the standard library.

# Unresolved questions

- If the syntax for restrictions does not change, what is the ordering of keywords on a field that
is both unsafe and mut-restricted?
- Are there any interactions or edge cases with other language features that need to be considered?

# Future possibilities

??
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsafe fields could give the compiler a signal that is useful for tons of things -- safe transmute, for instance. But we should also at least consider whether e.g. deriving auto traits like Send on structs with unsafe fields is a good idea -- due to the unsafe invariants, there really is no way that the compiler can tell whether the type is Send or not (and the same goes for variance).