Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check for union field accesses in THIR unsafeck #85263

Merged
merged 3 commits into from
Jul 9, 2021

Conversation

syvb
Copy link
Contributor

@syvb syvb commented May 13, 2021

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 13, 2021
Copy link
Contributor

@LeSeulArtichaut LeSeulArtichaut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation itself looks good to me (though, again, I'm not a reviewer).

I think we should add revisions for existing tests. If we do, we can mark the errors that we don't emit yet as

//[mirunsafeck]~ ERROR
// FIXME(thir-unsafeck)

Unsure though. Anyway I'd want to hear from @nikomatsakis

@LeSeulArtichaut
Copy link
Contributor

r? @nikomatsakis

if adt_def.is_union() {
self.requires_unsafe(expr.span, AccessToUnionField);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, the actual check seems quite a bit more complicated! (cc @RalfJung, who authored those changes)

https://github.com/rust-lang/rust/blob/24379879acc0959e8ae0f3c19d249b3beb278836/compiler/rustc_mir/src/transform/check_unsafety.rs#L214-L266

I'm wondering why more tests didn't fail. I think we're going to have to tweak these rules here.

Copy link
Contributor

@LeSeulArtichaut LeSeulArtichaut May 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right, you need unsafe to read from the union. (Tbh I've never used unions in Rust 😄)

I'm wondering why more tests didn't fail.

I think we don't run the existing check-fail tests, see my comment above.
EDIT: nevermind, this should be too conservative instead

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All reads and some writes should be unsafe, yes -- and we should have tests covering that... tough probably not very precise ones.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the src/test/ui/union/union-unsafe.rs should help; you probably want to run that under thirunsafeck as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RalfJung I'm trying to properly check if something is reading from a union, and encountered this weirdness. Is this a bug or feature of the MIR borrow checker?

// this is safe
match foo {
    Foo { zst: () } => {},
}
// but this is so unsafe it prints the same error message twice?
match foo {
    Foo { pizza: Pizza { topping: Some(PizzaTopping::Cheese) | Some(PizzaTopping::Pineapple) | None } } => {},
    // ^~ ERROR access to union field is unsafe and requires unsafe function or block
}

playground

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that behavior in the first match is a bug in the MIR borrowck, since RFC 1444 says that pattern matching against unions must be unsafe. (and MIR borrowck is okay with it since it doesn't actually involve any reads)

Copy link
Member

@RalfJung RalfJung May 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this has to do with how pattern matching is compiled -- matching zst against the () pattern does not actually read anything (we know the pattern matches without even looking). So it makes sense to me that the former would not be unsafe. But one could easily decide either way and both ways make sense.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this pattern is unsafe:

Foo { zst: x }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the checking, I think it should properly detect if something is a read or write to union now (I am also now running almost all of the src/test/ui/union test suite).

@syvb syvb force-pushed the thir-unsafeck-union-field branch 2 times, most recently from 7f424a7 to 14834a6 Compare May 14, 2021 13:01
@rust-log-analyzer

This comment has been minimized.

@syvb syvb force-pushed the thir-unsafeck-union-field branch from 14834a6 to 62bbb23 Compare May 14, 2021 13:43
@syvb
Copy link
Contributor Author

syvb commented May 14, 2021

The THIR checker now rejects code like this while the MIR checker doesn't. I'm pretty sure that this is a bug with the MIR checker though.

union Foo { bar: i8 }
let Foo { bar: _ } = Foo { bar: 5 };

union Foo2 { bar: () }
let Foo2 { bar: () } = Foo2 { bar: () };

None of those actually do anything though, so it doesn't seem like this would have any real-world implications.

(playground)

@RalfJung
Copy link
Member

RalfJung commented May 14, 2021

I'm on the fence about whether those should require unsafe or not -- as I said, given that matching against () is quite clearly not actually reading the union field, making this safe seems reasonable to me.

Making them unsafe would be a breaking change, so the question is also if that change is worth it. My inclination here is "no" -- either way is a good option, and so in the absence of a preference either way we should err on the side of not doing a breaking change.

What is the argument for why you consider this a bug in the MIR checker?

@syvb
Copy link
Contributor Author

syvb commented May 14, 2021

@RalfJung The MIR rules result in really odd behavior like this in safe code:

#[derive(Copy, Clone)]
#[repr(u8)]
enum OneVal {
    One = 1,
}

union Foo {
    bar: u8,
    oneval: OneVal
}

fn main() {
    match (Foo { bar: 42 }) {
        Foo { oneval: OneVal::One } => {
            println!("reached!"); // always prints
        },
    }
}

The value of Foo isn't 1 and doesn't represent OneVal::One when the first and only branch is reached, but the first arm always matches. (I'm not sure if this is UB) Even weirder, this code does require an unsafe block, even though x is being bound to a constant (OneVal::One):

match (Foo { bar: 42 }) {
    Foo { oneval: x @ OneVal::One } => { //~ ERROR access to union field is unsafe
        println!("reached!");
    },
}

And if we add another field to the enum suddenly even exhaustive matching against it using or patterns is unsafe:

#[derive(Copy, Clone)]
#[repr(u8)]
pub enum TwoVal {
    One = 1,
    Two = 2,
}

union Foo {
    bar: u8,
    oneval: TwoVal
}

fn main() {
    match (Foo { bar: 42 }) {
        Foo { oneval: TwoVal::One | TwoVal::Two } => { //~ ERROR access to union field is unsafe
            println!("reached!");
        },
    }
}

@LeSeulArtichaut
Copy link
Contributor

(This is a debate that looks somewhat similar to #80059)

@syvb
Copy link
Contributor Author

syvb commented May 14, 2021

That weird behavior is actually a regression. Rust 1.19 (first stable release with unions) rejects the first snippet while Rust 1.22 accepts it.

@RalfJung
Copy link
Member

I agree the OneVal behavior is surprising and possibly a bug, since OneVal has size 1.

But that does not imply that the () behavior is a bug. If you remove the repr(u8), the behavior you see is what I would expect.

@syvb
Copy link
Contributor Author

syvb commented May 14, 2021

I've loosened up the safety rules a little bit to allow binding union fields to wildcards:

#[derive(Copy, Clone)]
struct Pie {
    slices: u8,
    size: u8,
}

union Foo {
    bar: i8,
    baz: Pie
}

// safe with THIR and MIR checker:
match u {
    Foo { baz: _ } => {},
}

// unsafe to THIR, safe to MIR:
match u {
    Foo { baz: Pie { .. } } => {},
}
match u {
    Foo { baz: Pie { slices: _, size: _ } } => {},
}

@syvb syvb force-pushed the thir-unsafeck-union-field branch from 42b10ea to 129958e Compare May 14, 2021 22:36
@bors
Copy link
Contributor

bors commented May 16, 2021

☔ The latest upstream changes (presumably #85259) made this pull request unmergeable. Please resolve the merge conflicts.

@syvb syvb force-pushed the thir-unsafeck-union-field branch from 129958e to e6c63c8 Compare May 16, 2021 14:06
@syvb
Copy link
Contributor Author

syvb commented May 17, 2021

I've updated the way it checks unsafety to match the MIR rules as closely as possible. Specifically, the rule it enforces is "Irrefutable pattern matching against unions without any bindings is safe". This makes the THIR checker almost entirely match the MIR checker. The only difference I know of is

union Foo { bar: i8, pizza: Pizza }
#[derive(Copy, Clone)]
struct Pizza {
    topping: Option<PizzaTopping>
}
#[derive(Copy, Clone)]
enum PizzaTopping { Cheese, Pineapple }

let foo = Foo { bar: 5 };
match foo { Foo {
    pizza: Pizza {
        topping: Some(PizzaTopping::Cheese) | Some(PizzaTopping::Pineapple) | None
    }
} => {} };

Which the THIR checker accepts but MIR rejects. This seems to be a quirk of the way the MIR desugars nested or expressions.

@RalfJung
Copy link
Member

RalfJung commented May 17, 2021

"Irrefutable pattern matching against unions without any bindings is safe"

Uh... is that really what the old checker did? "Irrefutable" is a very subtle notion, in particular when considering unions.
The patterns we were talking about, like (), aren't just irrefutable -- they were literally NOPs without even considering the "set of possible values", since two zero-sized values are always equal! Anything that actually matches on a non-zero-sized enum should IMO definitely be unsafe.

@syvb
Copy link
Contributor Author

syvb commented May 17, 2021

@RalfJung As far as I can tell that's what's happening. Here's another example:

union Foo { bar: i8 }
let foo = Foo { bar: 5 };
match foo { Foo {
    bar: i8::MIN..=i8::MAX, // safe since this always matches
} => {} };

(playground)

Perhaps it would be better to change to rejecting patterns like this and only make an exception for ZSTs?

@RalfJung
Copy link
Member

RalfJung commented May 17, 2021

Hm, interesting.
I'm honestly not sure any more what would be better.^^ Looks like T-lang needs to make a decision here about which patterns on union fields should be considered safe and which need unsafe.

Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Jun 19, 2021
…rtichaut

Add pattern walking support to THIR walker

Suggested in rust-lang#85263 (comment), this splits off the support for pattern walking in THIR from rust-lang#85263. This has no observable effect on THIR unsafety checking, since it is not currently possible to trigger unsafety from the THIR checker using the additional patterns or constants that are now walked. THIR patterns are walked in source code order.

r? `@LeSeulArtichaut`
bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 19, 2021
…ichaut

Add pattern walking support to THIR walker

Suggested in rust-lang#85263 (comment), this splits off the support for pattern walking in THIR from rust-lang#85263. This has no observable effect on THIR unsafety checking, since it is not currently possible to trigger unsafety from the THIR checker using the additional patterns or constants that are now walked. THIR patterns are walked in source code order.

r? `@LeSeulArtichaut`
@bors
Copy link
Contributor

bors commented Jun 19, 2021

☔ The latest upstream changes (presumably #86378) made this pull request unmergeable. Please resolve the merge conflicts.

@oli-obk
Copy link
Contributor

oli-obk commented Jul 8, 2021

r? @oli-obk

@rust-highfive rust-highfive assigned oli-obk and unassigned nikomatsakis Jul 8, 2021
Copy link
Contributor

@oli-obk oli-obk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just found a small edge case that is probably an unreachable pattern.

The union "drop field" assignment unsafety can be solved in a follow up PR, they are marked with FIXMEs after all.

compiler/rustc_mir_build/src/check_unsafety.rs Outdated Show resolved Hide resolved
@syvb syvb force-pushed the thir-unsafeck-union-field branch from 7ab11d2 to df3e003 Compare July 9, 2021 17:51
@oli-obk
Copy link
Contributor

oli-obk commented Jul 9, 2021

@bors r+

@bors
Copy link
Contributor

bors commented Jul 9, 2021

📌 Commit b86ed4a has been approved by oli-obk

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). labels Jul 9, 2021
@bors
Copy link
Contributor

bors commented Jul 9, 2021

⌛ Testing commit b86ed4a with merge 240ff4c...

@bors
Copy link
Contributor

bors commented Jul 9, 2021

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing 240ff4c to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jul 9, 2021
@bors bors merged commit 240ff4c into rust-lang:master Jul 9, 2021
@rustbot rustbot added this to the 1.55.0 milestone Jul 9, 2021
@LeSeulArtichaut LeSeulArtichaut removed I-nominated T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Jul 27, 2021
@LeSeulArtichaut
Copy link
Contributor

Moved the discussion about pattern matching on unions to #87520.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants