Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify safety of layout_for_ptr #117991

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

CAD97
Copy link
Contributor

@CAD97 CAD97 commented Nov 16, 2023

There is one functional change in the language update: we no longer document the behavior when given an external type, letting it fall into the "unknown" unsized tail kinds which are forbidden from use. This is in preparation for hopefully stabilizing this functionality in some form; the behavior of external types is still up in the air (recent temperature being to attempt adding a new default bound to current editions in order to forbid querying the layout external types entirely).

@rustbot
Copy link
Collaborator

rustbot commented Nov 16, 2023

r? @thomcc

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Nov 16, 2023
@CAD97
Copy link
Contributor Author

CAD97 commented Nov 16, 2023

@rust-log-analyzer

This comment has been minimized.

library/core/src/mem/mod.rs Show resolved Hide resolved
library/core/src/mem/mod.rs Outdated Show resolved Hide resolved
@rust-log-analyzer

This comment has been minimized.

@thomcc

This comment was marked as resolved.

@rustbot rustbot assigned cuviper and unassigned thomcc Feb 1, 2024
@cuviper

This comment was marked as outdated.

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 11, 2024
/// equivalent to [`size_of::<T>()`][size_of]. The pointer is unused.
/// - If the unsized tail of `T` is a [slice], then the size of the *entire
/// value* (statically sized prefix plus dynamic tail) must fit in `isize`.
/// The pointer may be dangling or null; only the slice length is used.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"or unaligned or whatever" -- do we really want to list all that?

I'd prefer if we said something like "The data part of the pointer is unused", except we don't have standard terminology for "data part" AFAIK (I mean the part that's not the metadata).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Saying something along the lines of "no read is performed" provides the property which is most directly relevant (and indirectly implies that the address is immaterial). So I'll suggest using

The pointer does not need to be dereferenceable, as only the slice length is used.

In all honesty, I'd personally prefer exposing unsafe fn(<T as Pointee>::Metadata) -> Layout instead of unsafe fn(*const T) -> Layout, as it directly encodes this property, and I don't see read-performing size_of as particularly useful; both because of *const UnsafeCell<ThinCStr> not being able to provide it and because I've found explicit conversion between &CStr and Thin<&CStr> (a pointer wrapper stored as just NonNull<()>) to not cause issue in practice.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's fair. But then we won't have anything on stable until Metadata is stable...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to

The pointer does not need to be valid for access, as only the pointer metadata is used.

This sticks to language already defined in the docs.

/// wide pointer metadata is used.
/// - For any other unsized tail kind (for example, unstable [extern types]),
/// it is currently *not permitted* to call this function. Unknown unsized
/// tail kinds may impose arbitrary requirements unknowable to current code.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not permitted means... UB?

Copy link
Contributor Author

@CAD97 CAD97 May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "not allowed" wording comes from the existing docs (which I also wrote). The gist is "the safety condition depends on the pointee type and if it's not one of these documented kinds you cannot satisfy safety conditions which you don't know" but that's a bit of a mouthful. I also would prefer not to say "if size_of_val(&*ptr) is safe, size_of_val_raw(&*ptr) is safe," because that's not necessarily true, especially if we take the option of making ?Sized mean ?DynSized and have size_of_val panic for types where size can't be determined, instead of the more complicated path of adding support for types more unsized than ?Sized1.

So I suppose yes, it would be (library) undefined behavior.

Footnotes

  1. Which actually works mostly fine for any type which has a default Sized bound and supports the ?Sized bound, but quickly gets (maybe impossibly) messy for ones that don't have a default Sized bound applied, such as Self in trait default method bodies and generic types' associated type projections (when the associated type is relaxed further than ?Sized, which must happen2 because Deref::Target needs to be fully unbound for anything to work).

  2. Perhaps if we develop a solution for migrating to GAT-ified traits, that same solution could be used for only sometimes implying a MetaSized bound on the associated type. My implementation attempt didn't inspire hope.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all I can tell "not permitted" means "precondition: False" means unreachable_unchecked(). So it seems more honest to just say that it's UB, possibly followed by a remark that on current stable Rust, no such type exists.

FWIW in Miri this will panic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I switched it to say it's UB, but didn't add any note that current stable doesn't have such tails. That the example is unstable and it does talk about "unknown unsized tail kinds" does imply that it's about future compatibility, though.

@RalfJung
Copy link
Member

I'd love to see some progress here. :)

@CAD97 what is this blocked on? Figuring out who can approve it?

@CAD97
Copy link
Contributor Author

CAD97 commented May 14, 2024

Finding someone to approve it, but also me forgetting that there's a silly doc link error that needs to be fixed (which is why it's still labeled waiting on author). I'll address that and your comments today or tomorrow (ping me if I haven't in the next 48 hours, thanks).

@CAD97 CAD97 force-pushed the layout-for-ptr branch from 3e4b6f5 to 5c63cec Compare June 4, 2024 20:46
@CAD97
Copy link
Contributor Author

CAD97 commented Jun 4, 2024

@rustbot ready

Finally got around to fixing the broken link and addressing the nits. Rebase was necessary to be able to check doc build locally.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 4, 2024
/// [extern types]: ../../unstable-book/language-features/extern-types.html
///
/// It is important to note that the last point means that it would be *unsound*
/// to implement `size_of_val` as an unconditional call to `size_of_val_raw`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not seem desirable. Shouldn't we guarantee that if the argument would be sound to cast to a reference, then it may also be passed to size_of_val_raw and behavior is equivalent to size_of_val?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do agree it's unfortunate, but the reasoning for this is hedging with respect to future unsized tail kinds. It's certainly sufficient for any MetaSized type, and Rust can't yet manipulate properly !MetaSized types, but exotic size kinds can cause issues.

My go-to example is UnsafeCell<ThinCStr>. Getting the size of a value requires reading the value, which is a potential data race depending on whatever the external synchronization is.

This is, to be honest, "I don't want to deal with it" flavored UB. Even if size_of_val panics or is a post-monomorphization error, there does need to be an option which actually gets the layout (if it's knowable).

The lower floor is that if you can make a Box holding it, size_of_val_raw on a uniquely owned allocation with a valid (but potentially dropped) value is allowed (e.g. when dropping the last Weak<T> and deallocating). Beyond that is as far as I'm aware still undecided.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think size_of_val_raw should keep exposing the same underlying intrinsic as size_of_val. For these hypothetical future custom DST, whatever they do to keep size_of_val sound should also apply to size_of_val_raw. They will have to introduce new mechanisms anyway, so those should be used to deal with the extra constraints / provide new even-less-safe operations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems prudent to me not to make any guarantees about those hypothetical types. If we call it UB now, we can still change that to some defined behavior later, and even now the implementation could use the same intrinsic and possibly work, or explode. Am I missing something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already made the guarantee that the size of every &T can be computed in safe code -- by providing size_of_val. I don't see what we'd lose by saying that the same can be done for every *const T that satisfies all the requirements for an &T. Not making that guarantee, OTOH, sounds like it would create a lot of potential for confusion and uncertainty.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've slowly come around to agreeing that guaranteeing that {if size_of_val(&*p) is valid, size_of_val_raw(p) is also valid and produces the same result} is reasonable. It is the "raw" version of the same operation, after all. The "future unsized tail kind" clause would then defer to this general case and call out that this requires the pointer to be valid for reads that may occur, unlike the other unsize kinds.

What may have convinced me the most was a realization that the "more raw" version should probably live in the ptr module and not the mem module anyway, or be a method of raw pointers like it is of the DynMetadata type. Having three versions of the same functionality isn't great (especially with Layout versions existing as well as the split size/align), but I don't think that's bad enough to justify

What's mostly being said in guaranteeing that imo is a soft & loose commitment to one of:

  • future unsized tailed types can always be passed to size_of_val (i.e. they satisfy Self: ?edition2015#Sized), causing erroneous behavior (complains to sanitizers) if they don't provide a way to get size from &self, then
    • panicking1 or
    • causing a postmono error if known at mono time with a panicking shim in the vtable; or that
  • future unsized tailed types cannot be passed to size_of_val (i.e. they do not satisfy Self: ?edition2015#Sized) unless they provide a way to get size from &self (e.g. Self: DynSized).

Unfortunately, the desire to avoid even implying any decision there is a primary part of why feature(layout_for_ptr) has been stuck in unstable limbo and why this PR proposes to weaken it to the minimally useful baseline (i.e. explicitly not saying anything about unsized tail kinds that do not yet exist).

"It's just size_of_val with these relaxations for 'MetaSized' types" does also feel conservative, but I've yet to find a way to word that without implying that the functionality works for unsize kinds that haven't been mentioned. And due to all that, size_of_val::<&UnsafeCell<ThinCStr>> haunts me.

Footnotes

  1. Returning wrong results is very scary. Consider a version of mem::swap or even take_mut that works for T: ?Sized by using a dynamically allocated heap buffer instead of a statically sized stack slot, checking the runtime queried size/align/metadata to ensure that it all lines up. Or even just feature(unsized_locals). These turn extern type claiming a layout of (0, 1) into a nicely hidden soundness footgun.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we make size_of_val panic / mono-time-error for such hypothetical future types, we can do the same for size_of_val_raw. If we find a way to statically prevent them from being passed to size_of_val, we can likewise do the same for size_of_val_raw. I think that's what you are saying?

@CAD97
Copy link
Contributor Author

CAD97 commented Jul 10, 2024

Procedural note: should wait on #126152 to merge and rebase to explicitly include that guarantee.

@bors
Copy link
Contributor

bors commented Jul 24, 2024

☔ The latest upstream changes (presumably #128142) made this pull request unmergeable. Please resolve the merge conflicts.

@cuviper
Copy link
Member

cuviper commented Aug 27, 2024

Is this procedurally clear now? r=me if so, after rebase. (and perhaps credit other reviewers too!)

@RalfJung
Copy link
Member

Well, first of all it needs a rebase and incorporating the guarantee from #126152.

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 27, 2024
@alex-semenyuk alex-semenyuk added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 26, 2024
@JohnCSimon
Copy link
Member

@CAD97

ping from triage - can you post your status on this PR? This PR has not received an update in a few months.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.