Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement a stop-gap “MaybeUninit” using ManuallyDrop #97

Closed
wants to merge 12 commits into from

Conversation

bluss
Copy link
Owner

@bluss bluss commented Mar 25, 2018

This is a cautious version of MaybeUninit, since we don't have one in
libstd, based on ManuallyDrop.

This moves ArrayVec to a nice, no size overhead implementation by
default. It is understood this is not a perfect solution! We do the best
we can with stable Rust, and will adopt std's MaybeUninit or equivalent
as soon as it becomes available.

ArrayString can already in this PR use what's been designated as the
sound solution, since its backing array is Copy. See MaybeUninitCopy in the PR.

I think we can enable the future stable MaybeUninit equivalent
conditionally, automatically, and don't need user visible API changes for that.

This removes nodrop as a dependency, finally.

Look here for previous discussion: #76 (comment)

Closes #86
Fixes #62

@Thomasdezeeuw
Copy link
Contributor

I really like the design of RFC #1892, which defines a new type MaybeUninit:

union MaybeUninit<T> {
    uninit: (),
    value: T,
}

I think that would be a great fit for ArrayVec use case, something like this:

pub struct ArrayVec<A: Array, T> {
    data: A<Item = MaybeUninit<T>>,
    len: A::Index,
}

@RalfJung
Copy link
Contributor

Hm, notice that for ManuallyDrop<T> we will likely want to decide that it always has to contain at least a "bit-valid" T. That's required for unsized ManuallyDrop in combination with thin dyn types. So, the ManuallyDrop here is likely not correct.

Is it possible to have a union of a ManuallyDrop<T> and a ()? That would be safe, I think.

@bluss
Copy link
Owner Author

bluss commented Jul 21, 2018

Thanks for the pointers!

@bluss
Copy link
Owner Author

bluss commented Nov 25, 2018

@RalfJung I'll do the best I can with stable Rust here and insert MaybeUninit as soon as it is stable (with a version check). A union is not possible since we want to have non-Copy T:

error[E0658]: unions with non-`Copy` fields are unstable (see issue #32836)
  --> src/maybe_uninit.rs:15:1
   |
15 | / union Test<T> {
16 | |     empty: (),
17 | |     value: ManuallyDrop<T>,
18 | | }
   | |_^
   |
   = help: add #![feature(untagged_unions)] to the crate attributes to enable

@bluss bluss changed the title Implement a careful “MaybeUninit” using ManuallyDrop Implement a stop-gap “MaybeUninit” using ManuallyDrop Nov 25, 2018
@bluss
Copy link
Owner Author

bluss commented Nov 25, 2018

@RalfJung It's long ago now that I discussed this with @arielb1, and we came into discussion about the difference between "moved from" (like ptr::read or ManuallyDrop::drop would leave the interior. [Put in nuance if those are different]) and wholly uninitialized.

Current ManuallyDrop::drop basically says it leaves the interior uninitialized. So in that sense this type can already contain this.

@bluss
Copy link
Owner Author

bluss commented Nov 25, 2018

another problem from ArrayString: it is Copy, and we'd like to be able to Copy it while the array is partially or wholly uninitialized.

This is a cautious version of MaybeUninit, since we don't have one in
libstd, based on ManuallyDrop.

This moves ArrayVec to a nice, no size overhead implementation by
default. We use Rust version sniffing (build script) to automatically
use this for new enough Rust versions.

This doesn't kill nodrop unfortunately, it still remains as the
fallback.
The nodrop/fallback code is as it was, it uses the Array trait accessors
as_ptr/as_ptr_mut; but we really want to do this properly with a raw
pointer cast according to discussion with arielb.

The idea is that we don't take a reference to a partially uninitialized
array value -- we just extract a raw pointer to the first array element
instead.
We have ManuallyDrop as the fallback since Rust 1.20, and don't need
nodrop anymore.
…ring

This is the "real" union solution, and ArrayString can use it since its
backing array is Copy. Unfortunately, we'll have to use the Copy bound
on the type, making it "viral".
This isn't of much use at the moment, but future Rust features could
mean we can use it.
Raw pointer taking should go through the MaybeUninit wrappers around the
arrays anyway, when it is partially uninitialized, which it often is.

The remaining .as_ptr() and .as_slice() methods on Array is only used
on a fully initialized array in ArrayString::from_byte_string
impl<T> MaybeUninit<T> {
/// Create a new MaybeUninit with uninitialized interior
pub unsafe fn uninitialized() -> Self {
Self::from(uninitialized())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid this is not correct -- it is as incorrect as just plain mem::uninitialized.

Using terminology from my blog post, the issue with mem::uninitialized is that it returns data that violates the validity invariant. While ManuallyDrop can handle data that is already dropped, it does not have any effect on the validity invariant: ManuallyDrop<bool> may only be 0x0 or 0x1, and hence

let x: MaybeUninit<bool> = MaybeUninit::uninitialized();

with your type is UB.

For non-Copy types, there is currently no way to do this. We should really work towards solving the remaining open questions in rust-lang/rust#53491!

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW; I know this is not correct, this is the current best effort with stable Rust like the PR says.

The current state of arrayvec as it is released, can't be said to be any better than this, I'm sure?

Copy link
Owner Author

@bluss bluss Dec 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RalfJung With MaybeUninit, what's the importance of using MaybeUninit<[T; 10]> vs [MaybeUninit<T>; 10] ? The latter I don't even have clear picture of how to initialize, when we start with T values that are not Copy.

If I'd follow the now-removed array_vec.rs in librustc_data_structures, it uses mem::uninitialized() to initialize a whole value of [MaybeUninit<T>; 10]. Which takes us close to be back at the starting point? Or is that fine?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current state of arrayvec as it is released, can't be said to be any better than this, I'm sure?

I do not know that state. This is pretty much equivalent to using mem::uninitialized directly.

it uses mem::uninitialized() to initialize a whole value of [MaybeUninit; 10].

oO I had no idea.^^ Where can I find its sources?

Time that we get mem::uninitialized deprecated so that these things are rejected by -D warnings.

But given that MaybeUninit is a union, it is actually okay to not initialize it. The problem with mem::uninitialized is that it almost always violates the validity invariant of the type it returns; but for MaybeUninit, the validity invariant accepts literally any data (including uninitialized data), and hence there is no problem with that particular usage of mem::uninitialized.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at ArrayVec when it got removed in rust-lang/rust@687cc29, it uses mem::uninitialized on ManuallyDrop, which is not okay.

Copy link
Contributor

@RalfJung RalfJung Dec 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With MaybeUninit, what's the importance of using MaybeUninit<[T; 10]> vs [MaybeUninit; 10] ?

The two types are the same, in the sense that you may transmute between them. However, their different type has effects on e.g. into_inner, which for MaybeUninit<[T; 10]> asserts that all elements are initialized.

Copy link
Owner Author

@bluss bluss Dec 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok! By the way I'm following the stacked borrows discussion and unsafe code guidelines now. It will have a big impact on ndarray, I believe. Just like the rest of the ecosystem we are gradually getting a clearer picture of what we can do in unsafe code and can dot our i's and cross the t's here.

I would go with using MaybeUninit<[T; 10]> and offsetting into it, because the other way around leads to code with more transmutes when getting initialized [T; 10] wrapped into the ArrayVec. We of course never use anything like into_inner except when we know it is fully initialized.

Given this, it sounds like it is ok to use mem::uninitialized with [MaybeUninit<T>; 10], even if the array is the outer type. That's the only answer to how to set a value to such a field that I can see (or equivalent union trick).

@eddyb
Copy link

eddyb commented Dec 2, 2018

A union is not possible since we want to have non-Copy T:

I thought we were going to allow ManuallyDrop<T>, alongside any Copy types, as the property we're interested in isn't really Copy, but "needs no drop".

Did we not stabilize that? If so, why?

@RalfJung
Copy link
Contributor

RalfJung commented Dec 2, 2018

Did we not stabilize that? If so, why?

We accepted the RFC. It hasn't been implemented yet, though.

@bluss
Copy link
Owner Author

bluss commented Dec 3, 2018

This is confirmed broken, it's something that has changed in practice since this was first prototyped using Rust 1.20.

Now that ManuallyDrop<T> switched to struct, its interior is no longer exempt from being used enum layout optimization, and this makes an arrayvec based on ManuallyDrop noticeably unsound, and smallvec has the same problem.

Testing the current PR:

#[test]
fn test_still_works_with_option_arrayvec() {
    type RefArray = ArrayVec<[&'static i32; 2]>;
    let array = Some(RefArray::new());
    assert!(array.is_some());
    println!("{:?}", array);
}

The test fails in debug mode, it passes in release mode but the output is "None", which is incorrect.

See also servo/rust-smallvec/issues/126 for smallvec

This seems like a trivial test, but since it can fail, it shows us that
we don't have a sound implementation yet.
@RalfJung
Copy link
Contributor

RalfJung commented Dec 3, 2018

The fact that ManuallyDrop was exempt from layout optimizations was an implementation limitation, it was never a guarantee.

@bluss
Copy link
Owner Author

bluss commented Dec 3, 2018

@RalfJung I get this. It's a regression in a popular crate (smallvec) so it needs to be taken seriously as a practical problem at some level.

@RalfJung
Copy link
Contributor

RalfJung commented Dec 3, 2018

Agreed, it is a problem. However, the relevant change happened several months ago and got released on stable with 1.29, 10 weeks ago.

I think the main problem here is people relying on things that just "happen to be the case". I am not sure how to prevent this, other than repeating all the time "don't do that", and working on stabilizing APIs that let people better express their intent.

@bluss
Copy link
Owner Author

bluss commented Dec 3, 2018

@RalfJung It's great that things are becoming clearer on the unsafe code guidelines front, and Rust as a whole is really maturing. This crate is clearly a quite old and big hack, but it has kept afloat pragmatically since it fills a need, just like smallvec.

I think it's good to approach it with pragmatism, what's the best solution right now and how do we bring these up on “dry land”. It's looking like we are close now.

I don't quite agree with your "happen to be the case", this PR was prepared after long discussions for example with arielb and we made some good guesses about where ManuallyDrop would go at the time. And we were wrong!

And the original crate was written long before anyone could answer what was and wasn't allowed, it was just unclear.

@RalfJung
Copy link
Contributor

RalfJung commented Dec 3, 2018

IOW, we should work hard towards stabilizing MaybeUninit :D

And the original crate was written long before anyone could answer what was and wasn't allowed, it was just unclear.

Relying on unclear things is exactly what I meant. It's dangerous.

But I also appreciate that so many things are unclear still that it can be hard to avoid.

@bluss
Copy link
Owner Author

bluss commented Dec 3, 2018

@RalfJung 😄 rust-lang/rust#56466 The old sins are disappearing

@bluss
Copy link
Owner Author

bluss commented Dec 15, 2018

See PR #114 for another pragmatic mitigation—fighting the fire where we can—conditional use of unions and a good solution (only on nightly!)

@bluss
Copy link
Owner Author

bluss commented Dec 18, 2018

This is replaced by #114 , #115, #116 (which only affects nightly users so far). We'll do the best we can with stable, as soon as we are able.

@bluss bluss closed this Dec 18, 2018
@bluss bluss deleted the maybe-uninit branch December 18, 2018 21:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants