Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add KindaSortaDangling internal type to make Yoke safe under strong protection #3735

Merged
merged 11 commits into from
Jul 26, 2023
35 changes: 19 additions & 16 deletions components/datetime/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -204,35 +204,38 @@ mod tests {

#[test]
fn check_sizes() {
check_size_of!(5800 | 4592, DateFormatter);
check_size_of!(6792 | 5456, DateTimeFormatter);
check_size_of!(7904 | 6480, ZonedDateTimeFormatter);
check_size_of!(1496 | 1320, TimeFormatter);
check_size_of!(5800 | 4632, DateFormatter);
Copy link
Member

@sffc sffc Jul 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I bet the compiler was previously able to "merge" the T with whatever other fields were present on the struct, but now with the MaybeUninit it is forced to keep the payload as its own chunk of memory. A bit sad.

EDIT: Nevermind, that doesn't make sense, because it's still possible to get a &T, so the compiler doesn't have the liberty to fiddle with the layout.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah it only makes sense if there's an enum wrapping it, and the only enum I see is the thing that swaps between yokes and staticrefs, and I'm not quite sure how that could have a size impact.

Copy link

@danielhenrymantilla danielhenrymantilla Jul 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like if the payload of two variants have a common part, and if one of the payloads separately has enough of a niche for Rust to be able to stuff the enum's discriminant inside, then it does so, effectively reducing the size of the whole enum. In which case the niche removal of MaybeUninit does effectively impact the overall size.

use ::core::mem::{MaybeUninit as MU, size_of};

enum Enum<T> {
    A(T, u8),
    B(   u8),
}

assert_eq!(2, size_of::<Enum<    bool  >>());
assert_eq!(3, size_of::<Enum< MU<bool> >>());

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aha! thanks for investigating!

Though the DataPayload enum here doesn't have variants with common parts before it gets to the yoke

check_size_of!(6792 | 5504, DateTimeFormatter);
check_size_of!(7904 | 6528, ZonedDateTimeFormatter);
check_size_of!(1496 | 1344, TimeFormatter);
check_size_of!(1112 | 1024, TimeZoneFormatter);
check_size_of!(5752 | 4544, TypedDateFormatter::<Gregorian>);
check_size_of!(6744 | 5408, TypedDateTimeFormatter::<Gregorian>);
check_size_of!(5752 | 4584, TypedDateFormatter::<Gregorian>);
check_size_of!(6744 | 5456, TypedDateTimeFormatter::<Gregorian>);

check_size_of!(88, DateTimeError);
check_size_of!(200, FormattedDateTime);
check_size_of!(16, FormattedTimeZone::<CustomTimeZone>);
check_size_of!(184, FormattedZonedDateTime);
check_size_of!(13, DateTimeFormatterOptions);

if cfg!(feature = "experimental") {
check_size_of!(13, DateTimeFormatterOptions);
}

type DP<M> = DataPayload<M>;
check_size_of!(208, DP::<PatternPluralsFromPatternsV1Marker>);
check_size_of!(1032 | 904, DP::<TimeSymbolsV1Marker>);
check_size_of!(216, DP::<PatternPluralsFromPatternsV1Marker>);
check_size_of!(1032 | 912, DP::<TimeSymbolsV1Marker>);
check_size_of!(40, DP::<GenericPatternV1Marker>);
check_size_of!(208, DP::<PatternPluralsFromPatternsV1Marker>);
check_size_of!(5064 | 3904, DP::<ErasedDateSymbolsV1Marker>);
check_size_of!(16, DP::<WeekDataV1Marker>);
check_size_of!(288 | 232, DP::<TimeZoneFormatsV1Marker>);
check_size_of!(64 | 56, DP::<ExemplarCitiesV1Marker>);
check_size_of!(216, DP::<PatternPluralsFromPatternsV1Marker>);
check_size_of!(5064 | 3912, DP::<ErasedDateSymbolsV1Marker>);
check_size_of!(24, DP::<WeekDataV1Marker>);
check_size_of!(232 | 240, DP::<TimeZoneFormatsV1Marker>);
check_size_of!(64, DP::<ExemplarCitiesV1Marker>);
check_size_of!(120 | 112, DP::<MetazoneGenericNamesLongV1Marker>);
check_size_of!(120 | 112, DP::<MetazoneGenericNamesShortV1Marker>);
check_size_of!(216 | 208, DP::<MetazoneSpecificNamesLongV1Marker>);
check_size_of!(216 | 208, DP::<MetazoneSpecificNamesShortV1Marker>);
check_size_of!(168, PluralRules);
check_size_of!(256 | 208, FixedDecimalFormatter);
check_size_of!(176, PluralRules);
check_size_of!(256 | 216, FixedDecimalFormatter);
check_size_of!(1024 | 936, TimeZoneDataPayloads);
check_size_of!(3, TimeZoneFormatterUnit);
}
Expand Down
83 changes: 83 additions & 0 deletions utils/yoke/src/kinda_sorta_dangling.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@
// This file is part of ICU4X. For terms of use, please see the file
// called LICENSE at the top level of the ICU4X source tree
// (online at: https://github.com/unicode-org/icu4x/blob/main/LICENSE ).

use core::mem::{ManuallyDrop, MaybeUninit};
use core::ops::{Deref, DerefMut};

/// This type is intended to be similar to the type `MaybeDangling<T>`
/// proposed in [RFC 3336].
///
/// The effect of this is that in Rust's safety model, types inside here are not
/// expected to have any memory dependent validity properties (`dereferenceable`, `noalias`).
///
/// See [#3696] for a testcase where `Yoke` fails this under miri's field-retagging mode.
///
/// This has `T: 'static` since we don't need anything
/// else and we don't want to have to think (more) about variance.
Manishearth marked this conversation as resolved.
Show resolved Hide resolved
///
/// After [RFC 3336] lands we can use `MaybeDangling` instead.
///
/// [RFC 3336]: https://github.com/rust-lang/rfcs/pull/3336
Manishearth marked this conversation as resolved.
Show resolved Hide resolved
/// [#3696]: https://github.com/unicode-org/icu4x/issues/3696
#[repr(transparent)]
pub(crate) struct KindaSortaDangling<T: 'static> {
Manishearth marked this conversation as resolved.
Show resolved Hide resolved
/// Safety invariant: This is always an initialized T, never uninit or other
/// invalid bit patterns. Its drop glue will execute during Drop::drop rather than
/// during the drop glue for KindaSortaDangling, which is
Manishearth marked this conversation as resolved.
Show resolved Hide resolved
dangle: MaybeUninit<T>,
}

impl<T: 'static> KindaSortaDangling<T> {
#[inline]
pub(crate) const fn new(dangle: T) -> Self {
KindaSortaDangling {
dangle: MaybeUninit::new(dangle),
}
}
#[inline]
pub(crate) fn into_inner(self) -> T {
// Self has a destructor, we want to avoid having it be called
let manual = ManuallyDrop::new(self);
// Safety:
// We can call assume_init_read() due to the library invariant on this type,
// however since it is a read() we must be careful about data duplication.
// The only code using `self` after this is the drop glue, which we have disabled via
// the ManuallyDrop.
unsafe { manual.dangle.assume_init_read() }
}
}

impl<T: 'static> Deref for KindaSortaDangling<T> {
type Target = T;
#[inline]
fn deref(&self) -> &T {
// Safety: Safe due to the safety invariant on `dangle`;
// we can always assume initialized
unsafe { self.dangle.assume_init_ref() }
}
}

impl<T: 'static> DerefMut for KindaSortaDangling<T> {
#[inline]
fn deref_mut(&mut self) -> &mut T {
// Safety: Safe due to the safety invariant on `dangle`;
// we can always assume initialized
unsafe { self.dangle.assume_init_mut() }
}
}

impl<T: 'static> Drop for KindaSortaDangling<T> {
#[inline]
fn drop(&mut self) {
unsafe {
// Safety: We are reading and dropping a valid initialized T.
//
// As read() is a duplication operation we must be careful that the original value isn't
// used afterwards. It won't be because this is drop and the only
// code that will run after this is `self`'s drop glue, and that drop glue is empty
// because MaybeUninit has no drop.
let _drop = self.dangle.assume_init_read();
Manishearth marked this conversation as resolved.
Show resolved Hide resolved
}
}
}
1 change: 1 addition & 0 deletions utils/yoke/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@ extern crate alloc;
pub mod either;
#[cfg(feature = "alloc")]
pub mod erased;
mod kinda_sorta_dangling;
mod macro_impls;
pub mod trait_hack;
mod yoke;
Expand Down
52 changes: 32 additions & 20 deletions utils/yoke/src/yoke.rs
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
use crate::either::EitherCart;
#[cfg(feature = "alloc")]
use crate::erased::{ErasedArcCart, ErasedBoxCart, ErasedRcCart};
use crate::kinda_sorta_dangling::KindaSortaDangling;
use crate::trait_hack::YokeTraitHack;
use crate::Yokeable;
use core::marker::PhantomData;
Expand Down Expand Up @@ -77,7 +78,7 @@ use alloc::sync::Arc;
pub struct Yoke<Y: for<'a> Yokeable<'a>, C> {
// must be the first field for drop order
// this will have a 'static lifetime parameter, that parameter is a lie
yokeable: Y,
yokeable: KindaSortaDangling<Y>,
cart: C,
Manishearth marked this conversation as resolved.
Show resolved Hide resolved
}

Expand Down Expand Up @@ -154,7 +155,7 @@ where
{
let deserialized = f(cart.deref());
Self {
yokeable: unsafe { Y::make(deserialized) },
yokeable: KindaSortaDangling::new(unsafe { Y::make(deserialized) }),
cart,
}
}
Expand All @@ -170,7 +171,7 @@ where
{
let deserialized = f(cart.deref())?;
Ok(Self {
yokeable: unsafe { Y::make(deserialized) },
yokeable: KindaSortaDangling::new(unsafe { Y::make(deserialized) }),
cart,
})
}
Expand Down Expand Up @@ -443,7 +444,10 @@ impl<Y: for<'a> Yokeable<'a>> Yoke<Y, ()> {
/// assert_eq!(yoke.get(), "hello");
/// ```
pub fn new_always_owned(yokeable: Y) -> Self {
Self { yokeable, cart: () }
Self {
yokeable: KindaSortaDangling::new(yokeable),
cart: (),
}
}

/// Obtain the yokeable out of a `Yoke<Y, ()>`
Expand All @@ -452,7 +456,7 @@ impl<Y: for<'a> Yokeable<'a>> Yoke<Y, ()> {
/// fine for `Yoke<Y, ()>` since there are no actual internal
/// references
pub fn into_yokeable(self) -> Y {
self.yokeable
self.yokeable.into_inner()
}
}

Expand Down Expand Up @@ -483,7 +487,7 @@ impl<Y: for<'a> Yokeable<'a>, C: StableDeref> Yoke<Y, Option<C>> {
/// ```
pub const fn new_owned(yokeable: Y) -> Self {
Self {
yokeable,
yokeable: KindaSortaDangling::new(yokeable),
cart: None,
}
}
Expand All @@ -495,7 +499,7 @@ impl<Y: for<'a> Yokeable<'a>, C: StableDeref> Yoke<Y, Option<C>> {
pub fn try_into_yokeable(self) -> Result<Y, Self> {
match self.cart {
Some(_) => Err(self),
None => Ok(self.yokeable),
None => Ok(self.yokeable.into_inner()),
}
}
}
Expand Down Expand Up @@ -541,7 +545,7 @@ where
// We have an &T not a T, and we can clone YokeTraitHack<T>
let this_hack = YokeTraitHack(this).into_ref();
Yoke {
yokeable: unsafe { Y::make(this_hack.clone().0) },
yokeable: KindaSortaDangling::new(unsafe { Y::make(this_hack.clone().0) }),
cart: self.cart.clone(),
}
}
Expand Down Expand Up @@ -655,9 +659,9 @@ impl<Y: for<'a> Yokeable<'a>, C> Yoke<Y, C> {
PhantomData<&'a ()>,
) -> <P as Yokeable<'a>>::Output,
{
let p = f(self.yokeable.transform_owned(), PhantomData);
let p = f(self.yokeable.into_inner().transform_owned(), PhantomData);
Yoke {
yokeable: unsafe { P::make(p) },
yokeable: KindaSortaDangling::new(unsafe { P::make(p) }),
cart: self.cart,
}
}
Expand All @@ -678,7 +682,7 @@ impl<Y: for<'a> Yokeable<'a>, C> Yoke<Y, C> {
{
let p = f(self.get(), PhantomData);
Yoke {
yokeable: unsafe { P::make(p) },
yokeable: KindaSortaDangling::new(unsafe { P::make(p) }),
cart: self.cart.clone(),
}
}
Expand Down Expand Up @@ -753,9 +757,9 @@ impl<Y: for<'a> Yokeable<'a>, C> Yoke<Y, C> {
PhantomData<&'a ()>,
) -> Result<<P as Yokeable<'a>>::Output, E>,
{
let p = f(self.yokeable.transform_owned(), PhantomData)?;
let p = f(self.yokeable.into_inner().transform_owned(), PhantomData)?;
Ok(Yoke {
yokeable: unsafe { P::make(p) },
yokeable: KindaSortaDangling::new(unsafe { P::make(p) }),
cart: self.cart,
})
}
Expand All @@ -776,7 +780,7 @@ impl<Y: for<'a> Yokeable<'a>, C> Yoke<Y, C> {
{
let p = f(self.get(), PhantomData)?;
Ok(Yoke {
yokeable: unsafe { P::make(p) },
yokeable: KindaSortaDangling::new(unsafe { P::make(p) }),
cart: self.cart.clone(),
})
}
Expand All @@ -797,9 +801,13 @@ impl<Y: for<'a> Yokeable<'a>, C> Yoke<Y, C> {
where
P: for<'a> Yokeable<'a>,
{
let p = f(self.yokeable.transform_owned(), capture, PhantomData);
let p = f(
self.yokeable.into_inner().transform_owned(),
capture,
PhantomData,
);
Yoke {
yokeable: unsafe { P::make(p) },
yokeable: KindaSortaDangling::new(unsafe { P::make(p) }),
cart: self.cart,
}
}
Expand All @@ -824,7 +832,7 @@ impl<Y: for<'a> Yokeable<'a>, C> Yoke<Y, C> {
{
let p = f(self.get(), capture, PhantomData);
Yoke {
yokeable: unsafe { P::make(p) },
yokeable: KindaSortaDangling::new(unsafe { P::make(p) }),
cart: self.cart.clone(),
}
}
Expand All @@ -847,9 +855,13 @@ impl<Y: for<'a> Yokeable<'a>, C> Yoke<Y, C> {
where
P: for<'a> Yokeable<'a>,
{
let p = f(self.yokeable.transform_owned(), capture, PhantomData)?;
let p = f(
self.yokeable.into_inner().transform_owned(),
capture,
PhantomData,
)?;
Ok(Yoke {
yokeable: unsafe { P::make(p) },
yokeable: KindaSortaDangling::new(unsafe { P::make(p) }),
cart: self.cart,
})
}
Expand All @@ -875,7 +887,7 @@ impl<Y: for<'a> Yokeable<'a>, C> Yoke<Y, C> {
{
let p = f(self.get(), capture, PhantomData)?;
Ok(Yoke {
yokeable: unsafe { P::make(p) },
yokeable: KindaSortaDangling::new(unsafe { P::make(p) }),
cart: self.cart.clone(),
})
}
Expand Down
15 changes: 15 additions & 0 deletions utils/yoke/tests/miri_retag.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
// This file is part of ICU4X. For terms of use, please see the file
// called LICENSE at the top level of the ICU4X source tree
// (online at: https://github.com/unicode-org/icu4x/blob/main/LICENSE ).

use yoke::Yoke;

// Test for strong protection, should pass under miri with -Zretag-fields
Manishearth marked this conversation as resolved.
Show resolved Hide resolved
// See https://github.com/unicode-org/icu4x/issues/3696

fn example(_: Yoke<&'static [u8], Vec<u8>>) {}

#[test]
fn run_test() {
example(Yoke::attach_to_cart(vec![0, 1, 2], |data| data));
}
Loading