RFC: core::mem::replace_with for temporarily moving out of ownership. #1736
This common pattern deserves a place in the standard library.
cc @Sgeo @makoConstruct
# Detailed design
[design]: #detailed-design

A complete implementation can be found [here](https://github.com/rust-lang/rust/pull/36186). This implementation is pretty simple, with only one trick which is crucial to safety: handling unwinding.
Should we just include the code here? The implementation is short, and including it makes the RFC self-contained.
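For readers following along, here is a minimal sketch of what an abort-on-unwind implementation could look like. It is not taken from the linked PR: the signature, the name `replace_with`, and the `AbortOnDrop` guard are assumptions made for illustration. The idea is that the value is read out of the reference, and if the closure unwinds before a new value is written back, a guard's destructor aborts the process so the now-invalid slot is never observed or dropped twice.

```rust
use std::{mem, process, ptr};

/// Hypothetical guard (illustration only): aborts the process if it is
/// dropped, which only happens when the closure below unwinds.
struct AbortOnDrop;

impl Drop for AbortOnDrop {
    fn drop(&mut self) {
        process::abort();
    }
}

/// Sketch of a `replace_with`-style function (assumed signature).
/// Temporarily moves the value out of `dest`, passes it to `f`,
/// and writes the result back.
pub fn replace_with<T, F: FnOnce(T) -> T>(dest: &mut T, f: F) {
    // Armed before the value is moved out; runs only on unwinding.
    let guard = AbortOnDrop;
    unsafe {
        // `dest` is logically uninitialized from here until `ptr::write`.
        let old = ptr::read(dest);
        let new = f(old);
        ptr::write(dest, new);
    }
    // Normal path: disarm the guard without running its destructor.
    mem::forget(guard);
}
```

On the non-panicking path this behaves like an in-place map, for example `replace_with(&mut v, |mut old| { old.push(3); old })` on a `Vec<i32>`.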
So, from what I'm reading, panicking inside of the closure results in implementation-defined behavior? That seems... bad. I'd rather just define it now; abort is my preference.
@ubsan Undefined != Unspecified. Check the comments from the code. Unwinding will make it abort, but the behavior is unspecified, and might thus change in the future (just like the dropping order of a vector's elements and arithmetic overflows). Edit: @ubsan edited his comment. Let me just reply to the new edition of it: Well, the point is to keep all possibilities open, and frankly, there might be even more in the future. One thing we could define is that it mirrors the behavior of destructors in destructors.
Speaking of keeping all possibilities open, is there any reason this couldn't be implemented to simply mem::forget the old value to avoid double-drop without aborting or catching the panic? I'm not sure that's what I'd prefer, but it's what I intuitively expected before I saw the abort() in your PR.
@lxrec |
@ticki 1) not a male. 2) I am pretty sure I wrote "implementation defined behavior" in my original comment. I don't like implementation defined behavior. And goddamnit, check the gender of the people you're talking about. This is getting ridiculous.

When you say "unspecified", that means "undefined behavior", by the way. A compiler could do anything, unless you actually give it options of what to do. What, exactly, in your RFC is stopping a compiler from writing to null in the case of a panic?
The fact that it is safe denotes that it is not breaking invariants (and thus leading to UB).
@ticki I've seen a lot of broken safe code :3 My opinion is, obviously, that we should define it. However, I think if we don't, we should give specific options to take, because that is what unspecified is to me. An implementation chooses one of a specific set of options, and doesn't have to document it (unlike implementation defined, where an implementation should document it).
Well, safe code (while it could be broken) can always be assumed to work right. Safe code shouldn't break any invariants.
There are already a lot of unspecified things in the standard library. Vec drop order comes to mind. Overflow is another case.
@ticki Overflows are defined to be two's complement for signed integers, and reset to zero for unsigned integers OR panic. Vec drop order (as well as all drop orders, except for scope) is unspecified, but implementations are given (implicit) options to choose from: forwards or backwards.
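As a side note on the overflow point, the standard integer methods make each behavior explicit. A small sketch, where the function name `overflow_demo` is hypothetical and exists only to group the calls:

```rust
/// Hypothetical helper (illustration only): returns the results of the
/// explicit overflow-handling methods on the integer types.
fn overflow_demo() -> (i32, u8, Option<u8>, (u8, bool)) {
    (
        // Two's-complement wraparound for a signed integer.
        i32::MAX.wrapping_add(1),
        // Wraparound for an unsigned integer.
        u8::MAX.wrapping_add(1),
        // Checked arithmetic reports overflow as `None`.
        u8::MAX.checked_add(1),
        // Wrapped result plus a flag saying whether overflow occurred.
        u8::MAX.overflowing_add(1),
    )
}
```

The plain `+` operator is the case under discussion: it either panics or wraps depending on whether overflow checks are enabled for the build.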
Please link. RFC 560 says otherwise.
Well, not really. In theory, any order could be used. In the end, it is unspecified, and having panics unspecified here is neither unsafe nor dangerous. The type system still expresses an important invariant, i.e. that panicking is safe. Whether it aborts, overwrites the value, or anything else is certainly important as well, but the price paid for specifying it is stagnation.
And, again, unspecified behavior is completely unrelated to undefined behavior.
C++17 §1.3.26[defns.unspecified]
Could the RFC just specify some options to choose from (i.e. abort = alternative 1 / catch_unwind = alternative 2 / |
Eating your laundry is definitely a guarantee. Haven't you read the Rust website?
@ticki the version of RFC 560 you linked is a pre-acceptance draft. The accepted version is here and doesn't leave things unspecified. Anyway I would agree that a safe function being documented to do something unspecified does not imply that it could do something unsound. In this case I haven't seen any suggestions as to what the code could do besides aborting, so it might stop the argument if we just defined that (I would like to leave the option open to print something if |
[motivation]: #motivation

A common pattern, especially for low-level programmers, is to acquire ownership
of a value behind a reference without immediately giving it a replacement value.
Can you include a simple example of this pattern and the problem it causes?
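One way to illustrate the pattern, as a sketch rather than the RFC author's own example: a state transition on an enum behind `&mut`. The `Connection` type and its `Placeholder` variant are hypothetical. Because a value cannot be moved out of a mutable reference directly, today's safe code has to park a dummy value in the slot with `std::mem::replace` while the real value is being transformed.

```rust
use std::mem;

/// Hypothetical type for illustration.
#[derive(Debug, PartialEq)]
enum Connection {
    Idle(String),
    Active(String),
    /// Exists only so there is something to leave behind temporarily.
    Placeholder,
}

/// Turns `Idle(host)` into `Active(host)` in place.
fn activate(conn: &mut Connection) {
    // Moving out of `*conn` by value is rejected by the borrow checker,
    // so a placeholder is swapped in first.
    let old = mem::replace(conn, Connection::Placeholder);
    *conn = match old {
        Connection::Idle(host) => Connection::Active(host),
        other => other,
    };
}
```

The problem is the extra variant: it has no meaning in the domain, yet every `match` on `Connection` must account for it, and it stays in the slot if the transformation panics part-way through.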
This is probably obvious, but it's early in the morning and I've only had 2 cups of coffee (clearly not enough). Can you say more about the connection to the behavior of unwinding in a dtor? (Do you mean specifically double panic? Or just any dtor in a dtor?)
I'm generally 👍 on offering this API, I agree it is a very useful primitive. I am wary of the fact that it can abort, which I think is unacceptable for some applications -- though this is currently a bigger problem in Rust, of course, given the "panic during unwinding" behavior (not to mention OOM behavior, stack overflow behavior, and a few other such things). But I can imagine many cases, particularly unsafe code, where it's evident that everything will be fine from inspection.

EDIT: It might be a good idea to be careful with the documentation to appropriately convey the caution that is required here =)
It's a typo. I meant "panic in destructors".
Let me rephrase. Are there some particular semantics for panicking in destructors (or during unwinding, or both) that would make it possible for you to do something other than abort here?
Could this be specialised such that if |
@plietar The point is exactly that there is no substituted value. Indeed, you can use simply
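For contrast, when a type does have a cheap substitute value, the safe route needs no abort at all. A minimal sketch, assuming a `Default` bound; the helper name `replace_with_default` is hypothetical:

```rust
use std::mem;

/// Hypothetical helper (illustration only): like a `replace_with`, but
/// leaves `T::default()` in the slot while `f` runs, so a panic in `f`
/// simply leaves the default value behind.
fn replace_with_default<T: Default, F: FnOnce(T) -> T>(dest: &mut T, f: F) {
    // `mem::take` swaps in `T::default()` and returns the old value.
    let old = mem::take(dest);
    *dest = f(old);
}
```

This only covers types where constructing a default is possible and cheap, which is exactly the case the proposed function is not about.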
This came up again on reddit. This is a really common thing for beginners to reach for
As the reason for the FCP was that having an aborting function was too big a footgun for the standard library, here is an idea that I think has not been suggested yet. Maybe it would make sense to introduce this function as an The drawback of this idea being that when using the closure in-place it is automatically unsafe, which isn't great… but it's probably still better than having everyone re-implement it on their own? |
An unsafe version in the standard library is really not much gain at all compared to someone just using a crate for this.
No, it's not fine to mark a safe function as |
I think this should be introduced into the |
I have a use case that is (slightly) different and more straightforward than others mentioned here, so I figured I'd put it here. I have a function that takes ownership of an To do this, as far as I'm aware, what I currently have to do is this:

    let e: Box<Expr> = ...; // I get my Box<Expr> from somewhere
    let converted: Box<Expr> = Box::new(convert(*e)); // deref, then create new heap allocation

which seems quite inefficient, as I'm taking something out of the box, only to map it to some other value, and put it back into a new This RFC would let me do so very easily:

    let mut e: Box<Expr> = ...; // I get my Box<Expr> from somewhere
    let converted: Box<Expr> = std::mem::replace_with(&mut e, convert); // no reallocation
@ashpil you can safely mutate the box in-place like this:

    struct Expr {}
    fn convert(e: Expr) -> Expr { e }
    fn work_on_box(mut e: Box<Expr>) {
        *e = convert(*e);
    }
My library Rust + |
FWIW, I consider https://crates.io/crates/take_mut to be sound, and therefore consider your crate to be unsound. (Haven't looked at your code, this is based on your statement about the assumptions your library makes.) Providing this facility in the Rust standard library would help clarify what the safety requirements of mutable references are. We have to make a choice one way or the other, as an ecosystem -- it doesn't help to punt on this, that just risks making crate composition unsound, which is not a desirable end state. Adding take_mut to the stdlib would not be a breaking change -- it does not break any promise that has been stably made in the past.
Is there an explanation somewhere of why partial-borrow + take_mut is unsound? The semantic model of RustBelt proves soundness of take_mut, so I wonder where your crate does something that conflicts with RustBelt.
Is officially deeming third-party crates like |
It's not really an op.sem question, this is a question of the exact contract of the type system (the "safety invariant" of mutable references). We don't have a team explicitly chartered with that (we're not there yet), but I'd imagine it would involve UCG and t-types. FWIW the opinion I stated above was my own, not necessarily reflecting any team consensus.
Using
I went to look at RustBelt, and skim-read the technical appendix; I don't have the time needed to grapple with this in a formal way. But, inferring from what you say, and my reading didn't contradict this, I don't think RustBelt encodes the language limitation that Whether one regards that as an unsoundness in
Yes, quite so. Frankly, I won't be surprised if this is resolved in favour of I think, though, that there may be other libraries that rely on ZST references in a way that relies on the Rust programmer's inability to obtain @vi:
Yes. You can
Surely. At least UCG/opsem must be able to do that by implication, via the semantic rules; and making explicit statements about particular crates seems more practically helpful, and also more socially appropriate: after all, if UCG/opsem are declaring a crate unsound by implication it would be much better to explicitly acknowledge that.
It is definitely interesting to know that take_mut even is part of one of these "semantic conflicts". This is not something I expected, so thanks for bringing this up.
If you care about the address where some data lives, then the standard library has a way of expressing that -- it's pinning. I know it is not very ergonomic, but mutable references are (to me) explicitly designed to reflect movable data. The "natural" way to define a safety invariant for mutable references does allow
So if, hypothetically, we had support for field projections on user-defined types (something that does come up fairly regularly), then this use case would disappear?
I think if your |
You're welcome. Thanks for engaging :-).
But that's not really what's going on. It's not that I care where the data lives. (The original Really, I'm using Or to put it another way, what I'm doing is a stunt.
Quite. I think stunts like this are only feasible with ZSTs. The putative other libraries that I am imagining may also do something similar, would also have to use ZSTs. Blocking
tl;dr: Probably, yes, and the result would be good. There are two things that Even with this stunt, the field projections aren't perfect. With current Rust, I have to arrange for my field ZSTs, post-projection, to The other is that an
This doesn't work if you wrap the reference up in a newtype. There doesn't seem to be a way to tell the compiler that a newtype is "borrowably Copy" and get access to the same borrowck feature. For But in practice this is less useful than it seems because one must usually "downgrade" a partial borrow to a subset, since the method often needs strictly less access, so a call to .as_mut() or something is needed anyway. Another advantage of the newtype around
That seems like an approach which would work, indeed. It could also be appropriate for people who are using ZST address marker types for other applications.
Oh, I just realised. That wouldn't work for |
You said yourself that the relevant invariant is that "every &mut Partial has the same address as an original non-ZST T". That's why, when someone moves the Partial to somewhere else via Literally the only assumption |
Does |