-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify safety of layout_for_ptr #117991
base: master
Are you sure you want to change the base?
Clarify safety of layout_for_ptr #117991
Conversation
r? @thomcc (rustbot has picked a reviewer for you, use r? to override) |
|
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as outdated.
This comment was marked as outdated.
library/core/src/alloc/layout.rs
Outdated
/// equivalent to [`size_of::<T>()`][size_of]. The pointer is unused. | ||
/// - If the unsized tail of `T` is a [slice], then the size of the *entire | ||
/// value* (statically sized prefix plus dynamic tail) must fit in `isize`. | ||
/// The pointer may be dangling or null; only the slice length is used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"or unaligned or whatever" -- do we really want to list all that?
I'd prefer if we said something like "The data part of the pointer is unused", except we don't have standard terminology for "data part" AFAIK (I mean the part that's not the metadata).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Saying something along the lines of "no read is performed" provides the property which is most directly relevant (and indirectly implies that the address is immaterial). So I'll suggest using
The pointer does not need to be dereferenceable, as only the slice length is used.
In all honesty, I'd personally prefer exposing unsafe fn(<T as Pointee>::Metadata) -> Layout
instead of unsafe fn(*const T) -> Layout
, as it directly encodes this property, and I don't see read-performing size_of
as particularly useful; both because of *const UnsafeCell<ThinCStr>
not being able to provide it and because I've found explicit conversion between &CStr
and Thin<&CStr>
(a pointer wrapper stored as just NonNull<()>
) to not cause issue in practice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's fair. But then we won't have anything on stable until Metadata
is stable...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it to
The pointer does not need to be valid for access, as only the pointer metadata is used.
This sticks to language already defined in the docs.
library/core/src/alloc/layout.rs
Outdated
/// wide pointer metadata is used. | ||
/// - For any other unsized tail kind (for example, unstable [extern types]), | ||
/// it is currently *not permitted* to call this function. Unknown unsized | ||
/// tail kinds may impose arbitrary requirements unknowable to current code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not permitted means... UB?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "not allowed" wording comes from the existing docs (which I also wrote). The gist is "the safety condition depends on the pointee type and if it's not one of these documented kinds you cannot satisfy safety conditions which you don't know" but that's a bit of a mouthful. I also would prefer not to say "if size_of_val(&*ptr)
is safe, size_of_val_raw(&*ptr)
is safe," because that's not necessarily true, especially if we take the option of making ?Sized
mean ?DynSized
and have size_of_val
panic for types where size can't be determined, instead of the more complicated path of adding support for types more unsized than ?Sized
1.
So I suppose yes, it would be (library) undefined behavior.
Footnotes
-
Which actually works mostly fine for any type which has a default
Sized
bound and supports the?Sized
bound, but quickly gets (maybe impossibly) messy for ones that don't have a defaultSized
bound applied, such asSelf
in trait default method bodies and generic types' associated type projections (when the associated type is relaxed further than?Sized
, which must happen2 becauseDeref::Target
needs to be fully unbound for anything to work). ↩ -
Perhaps if we develop a solution for migrating to GAT-ified traits, that same solution could be used for only sometimes implying a
MetaSized
bound on the associated type. My implementation attempt didn't inspire hope. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For all I can tell "not permitted" means "precondition: False" means unreachable_unchecked()
. So it seems more honest to just say that it's UB, possibly followed by a remark that on current stable Rust, no such type exists.
FWIW in Miri this will panic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I switched it to say it's UB, but didn't add any note that current stable doesn't have such tails. That the example is unstable and it does talk about "unknown unsized tail kinds" does imply that it's about future compatibility, though.
I'd love to see some progress here. :) @CAD97 what is this blocked on? Figuring out who can approve it? |
Finding someone to approve it, but also me forgetting that there's a silly doc link error that needs to be fixed (which is why it's still labeled waiting on author). I'll address that and your comments today or tomorrow (ping me if I haven't in the next 48 hours, thanks). |
@rustbot ready Finally got around to fixing the broken link and addressing the nits. Rebase was necessary to be able to check doc build locally. |
/// [extern types]: ../../unstable-book/language-features/extern-types.html | ||
/// | ||
/// It is important to note that the last point means that it would be *unsound* | ||
/// to implement `size_of_val` as an unconditional call to `size_of_val_raw`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not seem desirable. Shouldn't we guarantee that if the argument would be sound to cast to a reference, then it may also be passed to size_of_val_raw
and behavior is equivalent to size_of_val
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do agree it's unfortunate, but the reasoning for this is hedging with respect to future unsized tail kinds. It's certainly sufficient for any MetaSized
type, and Rust can't yet manipulate properly !MetaSized
types, but exotic size kinds can cause issues.
My go-to example is UnsafeCell<ThinCStr>
. Getting the size of a value requires reading the value, which is a potential data race depending on whatever the external synchronization is.
This is, to be honest, "I don't want to deal with it" flavored UB. Even if size_of_val
panics or is a post-monomorphization error, there does need to be an option which actually gets the layout (if it's knowable).
The lower floor is that if you can make a Box
holding it, size_of_val_raw
on a uniquely owned allocation with a valid (but potentially dropped) value is allowed (e.g. when dropping the last Weak<T>
and deallocating). Beyond that is as far as I'm aware still undecided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think size_of_val_raw should keep exposing the same underlying intrinsic as size_of_val. For these hypothetical future custom DST, whatever they do to keep size_of_val sound should also apply to size_of_val_raw. They will have to introduce new mechanisms anyway, so those should be used to deal with the extra constraints / provide new even-less-safe operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems prudent to me not to make any guarantees about those hypothetical types. If we call it UB now, we can still change that to some defined behavior later, and even now the implementation could use the same intrinsic and possibly work, or explode. Am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already made the guarantee that the size of every &T
can be computed in safe code -- by providing size_of_val
. I don't see what we'd lose by saying that the same can be done for every *const T
that satisfies all the requirements for an &T
. Not making that guarantee, OTOH, sounds like it would create a lot of potential for confusion and uncertainty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I've slowly come around to agreeing that guaranteeing that {if size_of_val(&*p)
is valid, size_of_val_raw(p)
is also valid and produces the same result} is reasonable. It is the "raw" version of the same operation, after all. The "future unsized tail kind" clause would then defer to this general case and call out that this requires the pointer to be valid for reads that may occur, unlike the other unsize kinds.
What may have convinced me the most was a realization that the "more raw" version should probably live in the ptr
module and not the mem
module anyway, or be a method of raw pointers like it is of the DynMetadata
type. Having three versions of the same functionality isn't great (especially with Layout
versions existing as well as the split size/align), but I don't think that's bad enough to justify
What's mostly being said in guaranteeing that imo is a soft & loose commitment to one of:
- future unsized tailed types can always be passed to
size_of_val
(i.e. they satisfySelf: ?edition2015#Sized
), causing erroneous behavior (complains to sanitizers) if they don't provide a way to get size from&self
, then- panicking1 or
- causing a postmono error if known at mono time with a panicking shim in the vtable; or that
- future unsized tailed types cannot be passed to
size_of_val
(i.e. they do not satisfySelf: ?edition2015#Sized
) unless they provide a way to get size from&self
(e.g.Self: DynSized
).
Unfortunately, the desire to avoid even implying any decision there is a primary part of why feature(layout_for_ptr)
has been stuck in unstable limbo and why this PR proposes to weaken it to the minimally useful baseline (i.e. explicitly not saying anything about unsized tail kinds that do not yet exist).
"It's just size_of_val
with these relaxations for 'MetaSized
' types" does also feel conservative, but I've yet to find a way to word that without implying that the functionality works for unsize kinds that haven't been mentioned. And due to all that, size_of_val::<&UnsafeCell<ThinCStr>>
haunts me.
Footnotes
-
Returning wrong results is very scary. Consider a version of
mem::swap
or eventake_mut
that works forT: ?Sized
by using a dynamically allocated heap buffer instead of a statically sized stack slot, checking the runtime queried size/align/metadata to ensure that it all lines up. Or even justfeature(unsized_locals)
. These turnextern type
claiming a layout of(0, 1)
into a nicely hidden soundness footgun. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we make size_of_val
panic / mono-time-error for such hypothetical future types, we can do the same for size_of_val_raw
. If we find a way to statically prevent them from being passed to size_of_val
, we can likewise do the same for size_of_val_raw
. I think that's what you are saying?
Procedural note: should wait on #126152 to merge and rebase to explicitly include that guarantee. |
☔ The latest upstream changes (presumably #128142) made this pull request unmergeable. Please resolve the merge conflicts. |
Is this procedurally clear now? |
ping from triage - can you post your status on this PR? This PR has not received an update in a few months. |
There is one functional change in the language update: we no longer document the behavior when given an external type, letting it fall into the "unknown" unsized tail kinds which are forbidden from use. This is in preparation for hopefully stabilizing this functionality in some form; the behavior of external types is still up in the air (recent temperature being to attempt adding a new default bound to current editions in order to forbid querying the layout external types entirely).