Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weak refcounted pointers can dangle #80407

Closed
wants to merge 1 commit into from

Conversation

CAD97
Copy link
Contributor

@CAD97 CAD97 commented Dec 27, 2020

... so fix internal comments to reflect as such.

This is a followup to the changes in #74160. cc @rust-lang/lang: we did FCP on said PR (allowing Weak::into_raw/from_raw) with unfortunately incomplete information. Thankfully, this doesn't mean anything yet, so long as we're careful. Explanation below.

cc specifically @RalfJung, @mahkoh, from rust-lang/rfcs#3041, where this was brought to my attention.

@bors rollup=always
(comments only change)


The important example:

let ptr: Weak<[u8]> = Weak::<[u8; 32]>::new();
Weak::from_raw(Weak::into_raw(ptr));

Unsize coercions work on Weak, so we've been able to have dangling Weak to unsized T since Weak::new was stable.

This is where #74160 comes in. Now we have Weak::as_raw, Weak::into_raw, and Weak::from_raw. These mean that it's possible to go from "whatever pointer Weak<T> stores" to *const T and back, even in the case that T: ?Sized (as of #74160). This offsetting currently requires knowing the alignment.

When #74160 was merged, it did so without realizing that unsized dangling Weak could be constructed, and we said it stabilized the use of align_of_val_raw for "potentially dangling pointers to sized T and valid pointers to unsized T". We unfortunately ignored "valid pointers to dropped unsized T" and "dangling pointers to unsized T unsized with valid metadata."

The simple assumption from this API being stable now is that any future work around custom (potentially thin) DSTs will require all DSTs to have alignment determinable from the pointer metadata (and thus align_of_val_raw can be safe). However, with careful tweaks to the implementation of ArcInner/RcBox[1], we can avoid this requirement.

The issue at hand is that converting between Weak<T> and *const T requires knowing the alignment of T. There are three ways to sidestep this requirement:

  • Don't offset dangling weak in the three methods from Allow Weak::as_ptr and friends for unsized T #74160 (or otherwise detect and use another sentinel). This would add a branch to these methods, but would also be potentially useful (as we could give out a null pointer for dangling weak).
  • Always store Weak<T> as a pointer-to-T, rather than pointer-to-ArcInner/RcBox[1]. This pessimizes the non-dangling case, as it will always have to align_of_val and backwards offset to touch the reference count.
  • Drop #[repr(C)] layout and store with front padding instead of internal padding, PLUS store a reference to the reference count (rather than the allocation head). This only pessimizes the final deallocation (as it has to offset backwards to get the allocation head to deallocate), but it greatly complicates the implementation, just so that the offset between the Weak<T> pointer and *const T pointer is a consistent two words.
  • And of course, the fourth, just say that this is allowed, and require any thin DSTs to have statically known alignment. (There is, though, at least one "dynamically aligned type" usecase I can think of: trait objects with thin pointer / inline vtable pointer.)

So, unless I'm mistaken, nothing about the stable Arc/Rc API strictly requires that align_of_val_raw is possible. It's just that the choice is between align_of_val_raw or making Arc/Rc even more complicated to avoid calling align_of_val_raw.

(Side note: some version of this analysis should probably go in lang-notes somewhere as notes on restrictions to any future design of custom (potentially thin) dynamically sized types.)

[1]: off-topic, but we really should probably unify those names, probably to ArcInner/RcInner

... so fix internal comments to reflect as such
@rust-highfive
Copy link
Collaborator

r? @joshtriplett

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 27, 2020
/// - if `T` is unsized, the pointer must have appropriate pointer metadata
/// acquired from the real instance that you are getting this offset for.
/// - if `T` is unsized, the pointer must have been acquired from an unsize
/// coercion (but may be invalid, such as from Weak::new).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not "the same safety requirements as align_of_val_raw", though. align_of_val_raw quite deliberately only talks slices, trait objects, and extern types -- for all other unsiezd tails, "it is not allowed to call this function".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is super awkward to specify cleanly without copying over the entire docs from align_of_val_raw. When I made this adjustment, I figured that "comes from an unsize coercion" would be a tighter bound than align_of_val_raw spells out, since the only types of unsize coercion are for slices and trait objects. "Comes from unsize coercion" is in fact a simpler spelling of the requirement for trait objects (valid vptr). For slices, it's exactly equivalent iff it's impossible to name a fixed-size array type that has a layout close to the size of isize::MAX (such that RcInner<T> is too big). I think that the error "values of the type rc::RcBox<[u128; N]> are too big for the current architecture" is enough for that, though I do think relying on said error for soundness is suboptimal.

(Also, we could potentially relax align_of_val_raw to limit the magnitude of alignment (which it probably should anyway) rather than size.)

Copy link
Member

@RalfJung RalfJung Dec 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I don't like "comes from an unsize coercion" is that it is open-ended -- it is true for the unsizing coercions that currently exist, but that set might change in the future. The align_of_val_raw docs are quite deliberately stated in a way that is not open-ended in this way.

That's also why "comes from an unsize coercion" is not tighter than the bound on align_of_val_raw.

@CAD97
Copy link
Contributor Author

CAD97 commented Dec 27, 2020

Side note: if I counted right, #74160 is on beta right now (going stable on Jan 1), so if we're super worried about this, we could technically still back it out. I think it's unnecessary, but I don't make the call, and it's the safe option.

@joshtriplett
Copy link
Member

@CAD97 Thank you for bringing this up. Even though it might not be necessary, I don't want anyone having to deal with this in a hurry over the holidays. Let's go over the long-term implications with you in more detail at a @rust-lang/lang design meeting in January (the 2021-01-06 meeting is still available), and make the decision then.

bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 28, 2020
…Simulacrum

de-stabilize unsized raw ptr methods for Weak

`@Mark-Simulacrum` this is the patch re: rust-lang#80407.

I couldn't figure out the branch it needs to go on though, stable is still the old stable but beta already the new beta...?
@camelid camelid added T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Dec 28, 2020
m-ou-se added a commit to m-ou-se/rust that referenced this pull request Dec 30, 2020
…r=m-ou-se

Do not create dangling &T in Weak<T>::drop

Since at this point all strong pointers have been dropped, the wrapped `T` has also been dropped. As such, creating a `&T` to the dropped place is negligent at best (language UB at worst). Since we have `Layout::for_value_raw` now, use that instead of `Layout::for_value` to avoid creating the `&T`.

This does have implications for custom (potentially thin) DSTs, though much less severe than those discussed in rust-lang#80407. Specifically, one of two things has to be true:

- It has to be possible to use a `*const T` to a dropped (potentially custom, potentially thin) unsized tailed object to determine the layout (size/align) of the object. This is what is currently implemented (though with `&T` instead of `&T`). The validity of reading some location after it has been dropped is an open question IIUC (rust-lang/unsafe-code-guidelines#188) (except when the whole type is `Copy`, per `drop_in_place`'s docs).
  In this design, custom DSTs would get a `*mut T` and use that to return layout, and must be able to do so while in the "zombie" (post-drop, pre-free) state.
- `RcBox`/`ArcInner` compute and store layout eagerly, so that they don't have to ask the type for its layout after dropping it.

Importantly, this is already true today, as you can construct `Rc<DST>`, create a `Weak<DST>`, and drop the `Rc` before the `Weak`. This PR is a strict improvement over the status quo, and the above question about potentially thin DSTs will need to be resolved by any custom DST proposal.
@CAD97
Copy link
Contributor Author

CAD97 commented Jan 6, 2021

Closing in favor of #80764, per T-lang consensus in today's meeting.

@CAD97 CAD97 closed this Jan 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants