-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
What about: use-after-move and (maybe) use-after-drop #188
Comments
I was actually betting on having a But that doesn't appear to be happening, so this would at most be a correctness check. As for reads becoming |
Oh, interesting. I suppose we could treat a
Not sure what you mean. The actual read is performed by methods in We'd have to preserve on an |
If we make it just act like a write for Stacked Borrows, but not actually change memory, that would work with
|
We could run a second visitor after the evaluation of a statement whose sole purpose is to find We can't use |
It would be hard to guarantee that evaluating the statement did not end up changing what those For partial moves, how would that ever have the effect of invalidating pointers? The neighboring fields have to stay in place after all! |
To mirror what I said on Zulip: I would like to see an example of an optimization that this enables. Since opening this issue, I've come around to realize that the examples where I thought it would help, are actually not valid. |
I moved this to the UCG repo because really this is a question of what the semantics should be, not how to implement it in Miri. |
Jim Blandy asked via email if similar things to what has been proposed for |
…r=m-ou-se Do not create dangling &T in Weak<T>::drop Since at this point all strong pointers have been dropped, the wrapped `T` has also been dropped. As such, creating a `&T` to the dropped place is negligent at best (language UB at worst). Since we have `Layout::for_value_raw` now, use that instead of `Layout::for_value` to avoid creating the `&T`. This does have implications for custom (potentially thin) DSTs, though much less severe than those discussed in rust-lang#80407. Specifically, one of two things has to be true: - It has to be possible to use a `*const T` to a dropped (potentially custom, potentially thin) unsized tailed object to determine the layout (size/align) of the object. This is what is currently implemented (though with `&T` instead of `&T`). The validity of reading some location after it has been dropped is an open question IIUC (rust-lang/unsafe-code-guidelines#188) (except when the whole type is `Copy`, per `drop_in_place`'s docs). In this design, custom DSTs would get a `*mut T` and use that to return layout, and must be able to do so while in the "zombie" (post-drop, pre-free) state. - `RcBox`/`ArcInner` compute and store layout eagerly, so that they don't have to ask the type for its layout after dropping it. Importantly, this is already true today, as you can construct `Rc<DST>`, create a `Weak<DST>`, and drop the `Rc` before the `Weak`. This PR is a strict improvement over the status quo, and the above question about potentially thin DSTs will need to be resolved by any custom DST proposal.
Regarding use-after-drop: as @CAD97 pointed out, |
See rust-lang/rust#71117 (comment) for a possible reason why we might need moves to deinit their source. |
On the other hand, it looks like we actually need to be able to read enum discriminants after partial moves out of an enum (rust-lang/rust#91029, rust-lang/rust#89764 (comment)). That is a problem if that move de-inited the part of the enum where the discriminant is stored -- can that happen with niche optimizations? |
An example would probably be Option<Vec>. Vec is a ptr/len/capacity but the ptr is always a non-null so that makes a niche where the |
Well, the motivation for 'de-init on move' came up in rust-lang/rust#71117 (comment), so the question is if that would ever be in conflict with something like |
Ah, i think that part of it is a little too theoretical for me. (though "some kind of aliasing rules" like you suggest at the end of your comment is often how I've imagined it to work) |
With regards to destructive move, that's the semantics of the surface language, so personally I would expect it to also happen at the MIR level. Is there a reason not to implement it, other than implementation-level difficulties? With regards to
which implies that the memory is left initialized, even for In fact, here is an example from |
Yes. It makes MIR a lot harder to analyze, since the order in which operands are evaluated suddenly matters. Also IMO this is the wrong question: when it comes to adding UB, we should always be asking whether there is a good reason to have this be UB, and make it not UB otherwise. |
Whether it is technically UB or not doesn't matter. The basic rules of Rust are that moves are destructive, so the old location must never be used again. Even if it's not UB, reading the moved-from place would still be a major logical error, because that moved object can have all kinds of external resources attached or reasons to be non-cloneable. I expect tools like Miri to help in finding double drops, regardless whether it is UB in the technical sense which could affect generated code.
That may be true, but since moves must be destructive, you should still encode the invalidation of old places in some way, If the question is "do we invalidate on move, or do we encode it is a separate operation", then I have no opinion which implementation strategy is better, but if the question is "do we keep moved-from places initialized and usable", then it violates the expected semantics of moves and drops. |
This repository, and Miri, are all about whether code is "technically" UB or not. That is literally the only question that matters here.
You are presupposing that we have established this as consensus. We have not. That's why this issue is open. |
Looks like this discussion mostly revolves around Not sure if it's helpful, but I have this weird idea that a drop (both |
@zeegomo I don't see why that means you need any amount of UB - using a value of an unknown type after it has been dropped is clearly unsound, there's nothing that making it UB gets you. |
Not sure what you mean by unknown type. Have I misunderstood something? |
The difference between library UB and language UB. Using a value after it's been dropped can obviously cause UB. But that's library UB, the same as using Library UB is when you've misused an Library UB doesn't mean that the program execution is necessarily undefined, though. As a simple example, consider that a possible implementation of any unsafe By saying that reading memory after it's been dropped is UB, you're saying that the example program of let p: *mut Box<i32> = Box::into_raw(Box::new(Box::new(0)));
drop_in_place(p);
let x: *mut i32 = *(p as *mut *mut i32);
dbg!(x); should be UB and that there is no incorrect compilation of this program. The semantically very similar let p: *mut Box<i32> = Box::into_raw(Box::new(Box::new(0)));
drop(ptr::copy(p));
let x: *mut i32 = *(p as *mut *mut i32);
dbg!(x); is unambiguously a defined program. The extra fun version is that by saying reading the memory is UB, that's an even stronger requirement than saying the memory is uninitialized. Abstract bytes now have a 258th state (still ignoring provenance) where the mere act of reading them even as Personally, my current leaning is that running drop glue on a place (via I'm also of the personal opinion that a move should be properly destructive and uninitialize the memory being moved from. (Probably in separate typed copy and uninitialize steps at the opsem level, but imo ideally in a way such that the two places are allowed to have the same address if they're both stack allocated objects.) |
I might have used the wrong terminology, but that's not what I meant to say. The destructor might leave some fixed byte string in place of the value, but that fixed value has no special meaning besides being a byte string (i.e. is the destructor allowed to leave the value in a state that violates type guarantees?). Using that value again as a value of any type T requires great care and possibly cause UB, as would |
This is the case for any unsafe function which can obtain a pointer to the allocation. It is not language UB to write |
I recently run into this being considered struct B();
impl !Send for B {}
fn do_sth(_: &B) {}
async fn test() {
let b = B();
do_sth(&b);
drop(b);
yield_now().await;
} which lead me to rust-lang/rust#94067. I saw a PR rust-lang/rust#110420 which was rejected because @JakobDegen points out that it's undecided whether use-after-move is allowed (this issue). In the case above, because |
I think there is a somewhat separate question here. Even if Note that this is not even primarily a question of motivation. Having storage disappear earlier is hard, I haven't even seen a proposal that actually achieves this. We'd need to emit suitable MIR statements indicating that struct B(i32);
fn main() {
let mut b = B(0);
let ptr = std::ptr::addr_of_mut!(b);
drop(b);
unsafe { ptr.write(B(1)); }
} |
That sounds to me like tightening up the way we handle |
Yeah that could be one way to tackle the |
I created a new issue specifically for "what is the semantics of |
Visiting for backlog bonanza. This may need deduplication with some other open issues, but leaving open as the canonical one |
Above and elsewhere, I said things like
I actually today realized that while this is true for locals, it is not true for places inside a mod other {
/// Some private memory to store stuff in.
static mut S: *mut i32 = 0 as *mut i32;
pub fn lib1(x: & &mut i32) { unsafe {
S = (x as *const &mut i32).cast::<*mut i32>().read();
} }
pub fn lib2() { unsafe {
*S = 1337;
} }
}
fn main() {
let x = &mut 0;
other::lib1(&x);
*x = 42; // a write to x -- invalidates other pointers?
other::lib2();
assert_eq!(*x, 1337); // oops, the value changed
} Giving away So at least for And with Tree Borrows, even for I also realized I didn't reply to this part by @arielb1
If a local has storage then it must have a distinct address, even if that storage is uninitialized. The way to achieve potential equal addresses is to remove its storage. We already have a way to express that: So basically, replace the Now, you bring up rustc performance. Given that we have explicit |
Stop considering moved-out locals when computing auto traits for generators Addresses rust-lang#94067 (but does not fix it since drop-tracking-mir is unstable) This PR, unlike rust-lang#110420 or rust-lang#112050, does not attempt to reduce the number of live locals across suspension points. It however ignores moved-out locals for trait purposes. So this PR solves the non-send issue, but not the size issue. Suggested by `@RalfJung` in [rust-lang/unsafe-code-guidelines#188](rust-lang/unsafe-code-guidelines#188 (comment)) cc `@b-naber` who's working on this from a different perspective. r? `@cjgillot`
Stop considering moved-out locals when computing auto traits for generators (rebased) This PR revives rust-lang#112279. I wanted to reopen this to gauge `@cjgillot's` thoughts about this, since it's been a while since `-Zdrop-tracking-mir` has landed. If this PR looks OK, I can flesh it out and write up an FCP report -- I think this is a T-types FCP, since this has to do with the types that are considered live for auto traits, and doesn't otherwise affect the layout of coroutines. Open questions: * **"Why do we require storage to be live if the locals is not initialized?"** (rust-lang#112279 (comment)) * I agree that we should eventually fix the storage analysis for coroutines, but this seems like a problem that can be fixed after we fix witnesses for *the purposes of traits* here. * As far as I could tell, fixing the problem of moved locals for *storage* would require insertion of additional `StorageDead` statements, or would require further entangling the type system with the operational semantics in ways that T-opsem seemed uncomfortable with [when I asked](https://rust-lang.zulipchat.com/#narrow/stream/189540-t-compiler.2Fwg-mir-opt/topic/Inserting.20.60StorageDrop.60.20after.20unconditional.20moves). cc `@RalfJung` * Relevant: rust-lang/unsafe-code-guidelines#188 Fixes rust-lang#128095 Fixes rust-lang#94067 r? `@cjgillot`
Miri currently considers
Copy
andMove
the same operation. There are some proposals in particular by @eddyb to make them not the same: it might be beneficial for optimizations to be able to rely on the data not being read again after aMove
. This would correspond to replacing the data byUndef
. It seems reasonable to do similar things for drop -- though that requires a precise definition of what is actually considered a drop (does that include just theDrop
terminator or also calls todrop_in_place
?).(Miri concern: One reason why I am a bit hesitant about this is that this makes read have a side-effect, so we'd have to make all read methods in Miri take the interpreter context by
&mut self
. Reads already have a side-effect in Stacked Borrows but that is fairly local and we just use aRefCell
there.)@eddyb what would be such optimizations that benefit from this, where
StorageDead
is not happening so we need to rely onMove
?Potential blockers that would prevent deinit-on-move:
Reasons to do deinit-on-move:
Testcases:
The text was updated successfully, but these errors were encountered: