Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NLL] Dangly paths for box #52782

Merged
merged 6 commits into from
Aug 2, 2018

Conversation

pnkfelix
Copy link
Member

@pnkfelix pnkfelix commented Jul 27, 2018

Special-case Box in rustc_mir::borrow_check.

Since we know dropping a box will not access any &mut or & references, it is safe to model its destructor as only touching the contents owned by the box.


There are three main things going on here:

  1. The first main thing, this PR is fixing a bug in NLL where rustc previously would issue a diagnostic error in a case like this:
fn foo(x: Box<&mut i32>) -> &mut i32 { &mut **x } 

such code was accepted by the AST-borrowck in the past, but NLL was rejecting it with the following message (playground)

error[E0597]: `**x` does not live long enough
 --> src/main.rs:3:40
  |
3 | fn foo(x: Box<&mut i32>) -> &mut i32 { &mut **x }
  |                                        ^^^^^^^^ - `**x` dropped here while still borrowed
  |                                        |
  |                                        borrowed value does not live long enough
  |
note: borrowed value must be valid for the anonymous lifetime #1 defined on the function body at 3:1...
 --> src/main.rs:3:1
  |
3 | fn foo(x: Box<&mut i32>) -> &mut i32 { &mut **x }
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: aborting due to previous error
  1. The second main thing: The reason such code was previously rejected was because NLL (MIR-borrowck) incorporates a fix for issue Drop and leaking &mut references #31567, where it models a destructor's execution as potentially accessing any borrows held by the thing being destructed. The tests with Scribble model this, showing that the compiler now catches such unsoundness.

However, that fix for issue #31567 is too strong, in that NLL (MIR-borrowck) includes Box as one of the types with a destructor that potentially accesses any borrows held by the box. This thus was the cause of the main remaining discrepancy between AST-borrowck and MIR-borrowck, as documented in issue #45696, specifically in the last example of this comment, which I have adapted into the fn foo shown above.

We did close issue #45696 back in December of 2017, but AFAICT that example was not fixed by PR #46268. (And we did not include a test, etc etc.)

This PR fixes that case, by trying to model the so-called DerefPure semantics of Box<T> when we traverse the type of the input to visit_terminator_drop.

  1. The third main thing is that during a review of the first draft of this PR, @matthewjasper pointed out that the new traversal of Box<T> could cause the compiler to infinite loop. I have adjusted the PR to avoid this (by tracking what types we have previously seen), and added a much needed test of this somewhat odd scenario. (Its an odd scenario because the particular case only arises for things like struct A(Box<A>);, something which cannot be constructed in practice.)

Fix #45696.

@rust-highfive
Copy link
Collaborator

r? @estebank

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 27, 2018
@pnkfelix
Copy link
Member Author

r? @nikomatsakis

@pnkfelix
Copy link
Member Author

pnkfelix commented Jul 27, 2018

Btw this PR illustrates a second breaking change to the static semantics associated with the move to NLL (the first breaking change being #27282), and thus provides extra justification for the migration mode (#46908) we are deploying with the 2018 edition.

let ty = erased_drop_place_ty.boxed_ty();
debug!("visit_terminator_drop drop-box-content deref_place: {:?} ty: {:?}",
deref_place, ty);
self.visit_terminator_drop(loc, term, flow_state, &deref_place, ty, span);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like unconditional recursion when dropping struct A(Box<A>, ...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note you didn’t put an Option<Box<A>> there

Doesn’t the compiler already reject such infinite types? Let me check

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm okay this may indeed be a problem for me. You can define such types. You won’t be able to make an instance of one, but I imagine one could make a let a: Option<A> = None; and then the compiler would possibly infinite loop while trying to deal with the drop of the Option<A> ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only way to get a local with such a type is as a function argument or unitialized variable for now. Option<A> is currently dropped as a single unit, since it's a lot harder to reborrow an enum field, which is the only case this matters as far as I can tell.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that there are already existing bugs where infinitely sized types can cause the compiler, often specifically dropck, to hang.

Here are a couple (that are still open/unresolved) that I found by skimming over the results when I searched for "recursive" in our issues list: #44933, and #52852

(Plus there is #4287, which I might argue that example falls into in the long term, since our users might like getting a nice warning about the problem.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, it turns out that the infinite loop here does allocate memory on each iteration, so we will eventually exhaust our available memory and error out due to that. I don't know whether people think that is better or worse than just diverging execution... :)

Despite the existence of bugs like #44933 and #52852, I will go ahead and try to fix this PR to detect the recursion...

Copy link
Member Author

@pnkfelix pnkfelix Jul 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I had a slew of comments that falsely claimed that rustc under AST-borrowck also diverged on the case in question; but I had gotten confused because I was also checking the behavior of #52852, so I thought the infinite loops I was seeing there also applied to this struct A(Box<A>); case; I have now deleted those erroneous comments since they were just injecting a lot of noise on the PR.)

I will definitely fix this PR. This is a case that used to work and my PR causes it to break in a pretty bad way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(fixed, and added corresponding test.)

@pnkfelix
Copy link
Member Author

I'm going to assign this review to @nagisa since @nikomatsakis is on PTO.

@nagisa : do feel free to reassign to someone else if you do not feel comfortable with it. I could just let someone from the WG-compiler-nll working group look at it (indeed it seems like @matthewjasper has done at least enough of a review to point out something deep that I overlooked).

@pnkfelix
Copy link
Member Author

r? @nagisa

@rust-highfive rust-highfive assigned nagisa and unassigned nikomatsakis Jul 30, 2018

// indirect non-regular recursion with indirect ownership via box.
struct G { field: (F, F) }
struct H { field: Box<E> }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two lines seem to be incorrect, there is no actual recursion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, there is, in a sense that you cannot construct either E or F due to them being recursive. It is just that they are not mutually recursive, but perhaps ought to be?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a typo. I meant for G to refer to H and H to refer to G.

(I had gone through a couple iterations of this and I added cases above... maybe I'll put these into individual inner modules to try to prevent this kind of mistake.)

Copy link
Member

@nagisa nagisa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked over most of the code, but couldn’t finish looking over the tests. I’ll delegate to @eddyb to r+ this, as I believe they looked over the tests and/or semantics.

@@ -830,6 +835,37 @@ impl InitializationRequiringAction {
}
}

// (This is a simple linked-list threaded up the stack of recursive calls in visit_terminator_drop)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this between parens, and why is this not a proper doc?

mir.visit_terminator_drop(loc, term, flow_state, &place, field_ty, span);
debug!("visit_terminator_drop drop_field place: {:?} field_ty: {:?}", place, field_ty);
let seen = SeenTy(erased_drop_place_ty, prev_seen);
mir.visit_terminator_drop(loc, term, flow_state, &place, field_ty, span, Some(&seen));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense and seems good.

// such a deref.
//
// (We use a shallow write because a Deep one would
// touch `&mut` references in `T`.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, although I’m straining to have this not fly over my head entirely :)


// indirect non-regular recursion with indirect ownership via box.
struct G { field: (F, F) }
struct H { field: Box<E> }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, there is, in a sense that you cannot construct either E or F due to them being recursive. It is just that they are not mutually recursive, but perhaps ought to be?

@@ -830,6 +835,37 @@ impl InitializationRequiringAction {
}
}

/// A simple linked-list threaded up the stack of recursive calls in `visit_terminator_drop`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @nikomatsakis this is starting to quickly become the rustc team's favorite hack!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, in the commit message I even wrote:

Note: A similar style stack-only linked-list definition can be found
in `rustc_mir::borrow_check::places_conflict`. It might be good at
some point in the future to unify the two types and put the resulting
definition into `librustc_data_structures/`.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -830,6 +835,37 @@ impl InitializationRequiringAction {
}
}

/// A simple linked-list threaded up the stack of recursive calls in `visit_terminator_drop`.
#[derive(Debug)]
struct SeenTy<'a, 'gcx: 'a>(ty::Ty<'gcx>, Option<&'a SeenTy<'a, 'gcx>>);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Ty not ty::Ty.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I guess we've hit the number of occurrences where we can add Ty to the use at the top.

@@ -830,6 +835,37 @@ impl InitializationRequiringAction {
}
}

/// A simple linked-list threaded up the stack of recursive calls in `visit_terminator_drop`.
#[derive(Debug)]
struct SeenTy<'a, 'gcx: 'a>(ty::Ty<'gcx>, Option<&'a SeenTy<'a, 'gcx>>);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can rearrange this to have Option around a pair of Ty and &SeenTy, bypassing the problems needing Option around it later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I will try that.

// references it holds (which a Deep Write would touch).
ty::TyAdt(def, _) if def.is_box() => {
// First, we model the loss of the reference itself
// via a shallow write.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"the loss of the reference itself"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is, the dropping of the representation of the box-pointer

How would you prefer to phrase this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean the allocation? Or the library-side newtype around the pointer? I would not call either a "reference" unless you mean the &mut Box<T> that a Drop::drop impl would get.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean the library-side newtype around the pointer. I tried rephrasing as "StorageDead of the box-pointer representation" in my updated comments.

// Third, we model dropping any content owned by the
// box by recurring on box contents. This catches
// cases like `Box<Box<ScribbleWhenDropped<&mut T>>>`,
// while still restricting Write to *owned* content.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this last? Doesn't dropping the contents happen before deallocating the box?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want the drop of the box contents to happen first, that should be easy to do. For some reason I thought the presentation was more "natural" from the view point of someone reading the code.

  • I didn't think the order in which we did these three steps would actually matter ...
  • (but now you've got me wondering if it could ... I'll do some experiments...)

I do remember playing with moving around the order to see if that got rid of the deltas to dropck.nll.stderr and dropck-object-cycle.nll.stderr. (But it didn't.)

So, anyway, I'll try putting it first as part of other changes I'm making in response to your feedback.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One argument for why the order shouldn't matter here: Given a_box: Box<T>, In theory someone should be allowed to move the contents out of *a_box, and have those contents get dropped at a time that is disconnected from the (shallow) drop of a_box.

Of course that would be totally broken if the drop of a_box did attempt to dereference and drop the contents at *a_box. But that's why the shallow/deep distinction is important here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(but now you've got me wondering if it could ... I'll do some experiments...)

i played around a little to try to see if I could create a struct with a recursive self-reference inside a box and somehow throw in a destructor that would allow us to in some way observe the access order here. But perhaps unsurprisingly I was unable to make such a thing due to dropck rules...

// testsuite complained when I tried leaving it out.)
self.access_place(
ContextKind::Drop.new(loc),
(drop_place, span),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, why do we need to do anything to the dereference, as opposed to the Box itself?
If you implemented Box in a library, you would recurse in an ADT and see a raw pointer... and not go further.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and that's the whole problem.

If you implement MyBox as a library, then we see that that there is an impl Drop for MyBox and then fail to accept code like:

fn foo(x: MyBox<&mut i32>) -> &mut i32 { &mut **x } 

If you want to be able to implement MyBox as a library, then we will need to generalize the functionality described here via some means (be it DerefPure or #[dangly_path] or something else), and then have MyBox use that generalized functionality.

This PR is taking the expedient path of treating Box as a special case where we know its destructor is special.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But is the shallow *x access even necessary? Doesn't owned *x + shallow x suffice?
Also, isn't this similar to unsafe impl<#[may_dangle] T> Drop for Box<T>?

Copy link
Member Author

@pnkfelix pnkfelix Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I think I understand your question now.

The shallow *x access is definitely necessary. If you leave that out, you miss cases like a drop of a_box: Box<i32>, where the drop of the box needs to invalidate borrows of &*a_box and &mut *a_box

Notably, the shallow x access does not invalidate such borrows. (And the step that recurs on the box contents doesn't help there either, because when the original type was Box<i32> then we're just looking at i32...)

Because all the shallow x access touches is the fields of the box representation:

pub struct Box<T: ?Sized>(Unique<T>);

i.e.:

rust/src/libcore/ptr.rs

Lines 2694 to 2702 in f898179

pub struct Unique<T: ?Sized> {
pointer: NonZero<*const T>,
// NOTE: this marker has no consequences for variance, but is necessary
// for dropck to understand that we logically own a `T`.
//
// For details, see:
// https://github.com/rust-lang/rfcs/blob/master/text/0769-sound-generic-drop.md#phantom-data
_marker: PhantomData<T>,
}

Copy link
Member Author

@pnkfelix pnkfelix Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One might be able to argue that the special case handling here should be attached to Unique<T> rather than Box<T>. I haven't thought terribly hard about that option.

(But that would be much harder to implement in the short term. Right now in MIR-borrowck we do at least * track Box and derefs of Box as special things. If we had to somehow use UniquePtr instead then that would involve revisiting our code for representing the move-paths.)

Copy link
Member Author

@pnkfelix pnkfelix Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the recursive call to visit_terminator_drop will bottom out here at this code:

_ => {
// We have now refined the type of the value being
// dropped (potentially) to just the type of a
// subfield; so check whether that field's type still
// "needs drop". If so, we assume that the destructor
// may access any data it likes (i.e., a Deep Write).
if erased_drop_place_ty.needs_drop(gcx, self.param_env) {
self.access_place(
ContextKind::Drop.new(loc),
(drop_place, span),
(Deep, Write(WriteKind::StorageDeadOrDrop)),
LocalMutationIsAllowed::Yes,
flow_state,
);
}

I.e. visit_terminator_drop is just meant to model the actions of destructors. If you hand it the place *a_box, that is just an i32. Dropping that has no effect.

So my shallow write to *a_box is an attempt to model the effect of the Box<T> destructor.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eddyb asked:

Also, isn't this similar to unsafe impl<#[may_dangle] T> Drop for Box?

Yes it is. That is in part why I called these things "dangly paths" in nikomatsakis/nll-rfc#40

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so the "owned drop" part of *a_box does not invalidate borrows to *a_box for at least some types, that helps a bit.

I wish the "shallow write" done to*a_box was more of a StorageDead effect (instead of just being called a "write"), because it's really destroying the backing storage, without necessarily writing to it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Further review/discussion led to me pointing out that the effect encoded here is inded tagged as Write(WriteKind::StorageDeadOrDrop))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also PR #54310

@pnkfelix
Copy link
Member Author

pnkfelix commented Jul 31, 2018

Note: One entirely reasonable path would be to not land this PR in time for 2008 Edition Preview 2 (EP2).

That is, we could try to get a guess as to how much code in the wild is relying on this, by seeing how many users report that they got what they see as a bogus warning from the NLL code in a scenario like this one.

I mention this mainly because while reading over nikomatsakis/nll-rfc#40 I was reminded that we never really "decided" that we were going to support this.

@@ -980,10 +980,27 @@ impl<'cx, 'gcx, 'tcx> MirBorrowckCtxt<'cx, 'gcx, 'tcx> {
// dropping a box does not touch data behind any
// references it holds (which a Deep Write would touch).
ty::TyAdt(def, _) if def.is_box() => {
// There are three main steps modelled here for
// dropping `a_box: Box<T`>`:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stray backtick after T.

// When/if we add a `&own T` type, step 1 would be
// like running the destructor of the `&own T` (and
// the owner of backing storage referenced by the
// `&own T` would be responsible for steps 2 and 3).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the lowest (unsafe?) level, yes, step 2 would technically be in Drop for Box<T>.
However, from the point of view of the code using DerefOwn + Drop, step 2 would be part of dropping the &own T.

That is, dropping &own T invalidates it as backing storage for the code using it, so there would potentially be a gap in between dropping the &own T you got from DerefOwn and calling free (via Drop for Box<T>), in which you can't use the allocation even though it still exists.

This is quite distinct from moving out of the &own T, which would allow moving back in, or assigning to the T inside, which would drop the T but not invalidate the &own T.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm so you’re interpreting step 2 as solely invalidating the backing storage, while I was interpreting it as both invalidating the backing storage and deallocating the backing storage

Do I have that right? Where does the deallocation happen in your mental model, as part of step 3?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, rereading my comment in the code, maybe I was interpreting step 2 as solely deallocating the backing storage, and I didn’t have the invalidation in any explicit step.

While you are pointing out that the invalidation happens, in the code here, as part of step 2

Copy link
Member

@eddyb eddyb Aug 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While you are pointing out that the invalidation happens, in the code here, as part of step 2

Yes, in terms of borrowck, you can only observe that "backing storage" invalidation (see below).

Do I have that right? Where does the deallocation happen in your mental model, as part of step 3?

Yes, it's in step 3, and it's not something the borrowck can even model directly, since it doesn't understand the dynamic allocation protocol (the deallocation is "just" an FFI call with a raw pointer).

Copy link
Member

@eddyb eddyb Aug 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talking to @RalfJung, I reached a point where I'm wondering whether Drop(x: T) being a noop when !needs_drop(T) is a bad idea, since it makes some sense that it'd have an effect similar to Operand::Move(x) (i.e. invalidation of the source value).

I thought something like StorageDead(*a_box) was needed, but Move(*a_box) should do and Drop(*x) should maybe imply it as well.

For the &own T model, that means that Drop(x: &own T) only needs to invalidate the value at *x, not the underlying memory. However, we should have the same restrictions overall, as that memory is no longer reachable, since x had the unique path to it (which the Drop(x) should have invalidated), and invalidating the value at *x should prevent any other borrows from surviving.

In less words, we only need StorageLive / StorageDead for locals specifically because they're not behind indirection we can create/destroy in ways that would enable/disable access to the value through a single unique path (with subordinate borrows).

OTOH we don't generate Drop at all sometimes, as an optimization, so we have to rely on StorageDead instead for invalidating at least some of the locals.

// FIXME: Could we just get by with just steps 1 and 2
// above? (This last step seems like a good thing to
// do just as a matter of principle; but none of our
// testsuite complained when I tried leaving it out.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's relevant if part of liballoc did &a_box.0 (for whatever reason) - without this last step, a_box could be dropped without that borrow being invalidated, right? You could probably use a #![no_std] test to check for this, by definition your own Box lang item.

This should address issue 45696.

Since we know dropping a box will not access any `&mut` or `&`
references, it is safe to model its destructor as only touching the
contents *owned* by the box.

Note: At some point we may want to generalize this machinery to other
reference and collection types that are "pure" in the same sense as
box. If we add a `&move` reference type, it would probably also fall
into this branch of code. But for the short term, we will be
conservative and restrict this change to `Box<T>` alone.

The code works by recursively descending a deref of the `Box`. We
prevent `visit_terminator_drop` infinite-loop (which can arise in a
very obscure scenario) via a linked-list of seen types.

Note: A similar style stack-only linked-list definition can be found
in `rustc_mir::borrow_check::places_conflict`. It might be good at
some point in the future to unify the two types and put the resulting
definition into `librustc_data_structures/`.

----

One final note: Review feedback led to significant simplification of
logic here.

During review, eddyb RalfJung and I uncovered the heart of why I
needed a so-called "step 2" aka the Shallow Write to the Deref of the
box. It was because the `visit_terminator_drop`, in its base case,
will not emit any write at all (shallow or deep) to a place unless
that place has a need_drop.

So I was encoding a Shallow Write by hand for a `Box<T>`, as a
separate step from recursively descending through `*a_box` (which was
at the time known as "step 1"; it is now the *only* step, apart from
the change to the base case for `visit_terminator_drop` that this
commit now has encoded).

eddyb aruged that *something* should be emitting some sort of write in
the base case here (even a shallow one), of the dropped place, since
by analogy we also emit a write when you *move* a place. That led
to the revision here in this commit.

 * (Its possible that this desired write should be attached in some
   manner to StorageDead instead of Drop. But in this PR, I tried to
   leave the StorageDead logic alone and focus my attention solely on
   how Drop(x) is modelled in MIR-borrowck.)
(Presumably the place that borrow_check ends up reporting for the
error about is no longer the root `Local` itself, and thus the note
diagnostic here stops firing.)
…ve cases.

After talking about the PR with eddyb, I decided it was best to try to
have some test cases that simplify the problem down to its core, so
that people trying to understand what the issue is here will see those
core examples first.
@pnkfelix pnkfelix force-pushed the issue-45696-dangly-paths-for-box branch from be0f888 to c02c00b Compare August 1, 2018 15:45
@eddyb
Copy link
Member

eddyb commented Aug 2, 2018

@bors r+

I hope @nikomatsakis agrees and we can revert it otherwise.

@bors
Copy link
Contributor

bors commented Aug 2, 2018

📌 Commit c02c00b has been approved by eddyb

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 2, 2018
// cases cannot actually arise, it is sound for us to just
// skip them during drop. If the developer uses unsafe
// code to construct them, they should not be surprised by
// weird drop behavior in their resulting code.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, can we not fall-back-to-error instead of fall-back-to-accept? Seems like a disaster waiting to happen.^^

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot fallback to error, at least not to hard error. That could break hypothetical code.

We could issue a future compatibility warning.

Or we could try to emit drop code that error dynamically ... maybe that is what you are asking for?

(When I wrote "skip them during drop", I meant during this static analysis. I believe the actual dynamic behavior here will be to ... infinite loop ... i think... )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This is the test case showing the kind of case that concerns me. It is currently marked as run-pass; we could make a variant of it that actually conjures up one of this infinitely regressed things, but I'm pretty sure that the unsafe code guidelines group would tag such a test as having undefined behavior, no?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure that the unsafe code guidelines group would tag such a test as having undefined behavior,

Yeah, I agree.

When I wrote "skip them during drop", I meant during this static analysis. I believe the actual dynamic behavior here will be to ... infinite loop ... i think...

Probably it'll fill up the stack. ;)
What does it mean for the static analysis to skip this? Does it not consider this a use, or so? That seems problematic.^^

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It considers the first one it sees a use, but not the second one encountered via the recursive descent.

But to even get to that code during dynamic execution you’d have to already built the invalid thing and be witnessing UB.

Maybe I should make a more complete illustrative test case for us to discuss, with the aforementioned construction...?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It considers the first one it sees a use, but not the second one encountered via the recursive descent.

I see. Yes, without knowing any of the details, this does make some sense.

cramertj added a commit to cramertj/rust that referenced this pull request Aug 2, 2018
…or-box, r=eddyb

[NLL] Dangly paths for box

Special-case `Box` in `rustc_mir::borrow_check`.

Since we know dropping a box will not access any `&mut` or `&` references, it is safe to model its destructor as only touching the contents *owned* by the box.

----

There are three main things going on here:

1. The first main thing, this PR is fixing a bug in NLL where `rustc` previously would issue a diagnostic error in a case like this:
```rust
fn foo(x: Box<&mut i32>) -> &mut i32 { &mut **x }
```

such code was accepted by the AST-borrowck in the past, but NLL was rejecting it with the following message ([playground](https://play.rust-lang.org/?gist=13c5560f73bfb16d6dab3ceaad44c0f8&version=nightly&mode=release&edition=2015))
```
error[E0597]: `**x` does not live long enough
 --> src/main.rs:3:40
  |
3 | fn foo(x: Box<&mut i32>) -> &mut i32 { &mut **x }
  |                                        ^^^^^^^^ - `**x` dropped here while still borrowed
  |                                        |
  |                                        borrowed value does not live long enough
  |
note: borrowed value must be valid for the anonymous lifetime rust-lang#1 defined on the function body at 3:1...
 --> src/main.rs:3:1
  |
3 | fn foo(x: Box<&mut i32>) -> &mut i32 { &mut **x }
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: aborting due to previous error
```

2. The second main thing: The reason such code was previously rejected was because NLL (MIR-borrowck) incorporates a fix for issue rust-lang#31567, where it models a destructor's execution as potentially accessing any borrows held by the thing being destructed. The tests with `Scribble` model this, showing that the compiler now catches such unsoundness.

However, that fix for issue rust-lang#31567 is too strong, in that NLL (MIR-borrowck) includes `Box` as one of the types with a destructor that potentially accesses any borrows held by the box. This thus was the cause of the main remaining discrepancy between AST-borrowck and MIR-borrowck, as documented in issue rust-lang#45696, specifically in [the last example of this comment](rust-lang#45696 (comment)), which I have adapted into the `fn foo` shown above.

We did close issue rust-lang#45696 back in December of 2017, but AFAICT that example was not fixed by PR rust-lang#46268. (And we did not include a test, etc etc.)

This PR fixes that case, by trying to model the so-called `DerefPure` semantics of `Box<T>` when we traverse the type of the input to `visit_terminator_drop`.

3. The third main thing is that during a review of the first draft of this PR, @matthewjasper pointed out that the new traversal of `Box<T>` could cause the compiler to infinite loop. I have adjusted the PR to avoid this (by tracking what types we have previously seen), and added a much needed test of this somewhat odd scenario. (Its an odd scenario because the particular case only arises for things like `struct A(Box<A>);`, something which cannot be constructed in practice.)

Fix rust-lang#45696.
cramertj added a commit to cramertj/rust that referenced this pull request Aug 2, 2018
…or-box, r=eddyb

[NLL] Dangly paths for box

Special-case `Box` in `rustc_mir::borrow_check`.

Since we know dropping a box will not access any `&mut` or `&` references, it is safe to model its destructor as only touching the contents *owned* by the box.

----

There are three main things going on here:

1. The first main thing, this PR is fixing a bug in NLL where `rustc` previously would issue a diagnostic error in a case like this:
```rust
fn foo(x: Box<&mut i32>) -> &mut i32 { &mut **x }
```

such code was accepted by the AST-borrowck in the past, but NLL was rejecting it with the following message ([playground](https://play.rust-lang.org/?gist=13c5560f73bfb16d6dab3ceaad44c0f8&version=nightly&mode=release&edition=2015))
```
error[E0597]: `**x` does not live long enough
 --> src/main.rs:3:40
  |
3 | fn foo(x: Box<&mut i32>) -> &mut i32 { &mut **x }
  |                                        ^^^^^^^^ - `**x` dropped here while still borrowed
  |                                        |
  |                                        borrowed value does not live long enough
  |
note: borrowed value must be valid for the anonymous lifetime rust-lang#1 defined on the function body at 3:1...
 --> src/main.rs:3:1
  |
3 | fn foo(x: Box<&mut i32>) -> &mut i32 { &mut **x }
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: aborting due to previous error
```

2. The second main thing: The reason such code was previously rejected was because NLL (MIR-borrowck) incorporates a fix for issue rust-lang#31567, where it models a destructor's execution as potentially accessing any borrows held by the thing being destructed. The tests with `Scribble` model this, showing that the compiler now catches such unsoundness.

However, that fix for issue rust-lang#31567 is too strong, in that NLL (MIR-borrowck) includes `Box` as one of the types with a destructor that potentially accesses any borrows held by the box. This thus was the cause of the main remaining discrepancy between AST-borrowck and MIR-borrowck, as documented in issue rust-lang#45696, specifically in [the last example of this comment](rust-lang#45696 (comment)), which I have adapted into the `fn foo` shown above.

We did close issue rust-lang#45696 back in December of 2017, but AFAICT that example was not fixed by PR rust-lang#46268. (And we did not include a test, etc etc.)

This PR fixes that case, by trying to model the so-called `DerefPure` semantics of `Box<T>` when we traverse the type of the input to `visit_terminator_drop`.

3. The third main thing is that during a review of the first draft of this PR, @matthewjasper pointed out that the new traversal of `Box<T>` could cause the compiler to infinite loop. I have adjusted the PR to avoid this (by tracking what types we have previously seen), and added a much needed test of this somewhat odd scenario. (Its an odd scenario because the particular case only arises for things like `struct A(Box<A>);`, something which cannot be constructed in practice.)

Fix rust-lang#45696.
@bors
Copy link
Contributor

bors commented Aug 2, 2018

⌛ Testing commit c02c00b with merge 40cb447...

bors added a commit that referenced this pull request Aug 2, 2018
…ddyb

[NLL] Dangly paths for box

Special-case `Box` in `rustc_mir::borrow_check`.

Since we know dropping a box will not access any `&mut` or `&` references, it is safe to model its destructor as only touching the contents *owned* by the box.

----

There are three main things going on here:

1. The first main thing, this PR is fixing a bug in NLL where `rustc` previously would issue a diagnostic error in a case like this:
```rust
fn foo(x: Box<&mut i32>) -> &mut i32 { &mut **x }
```

such code was accepted by the AST-borrowck in the past, but NLL was rejecting it with the following message ([playground](https://play.rust-lang.org/?gist=13c5560f73bfb16d6dab3ceaad44c0f8&version=nightly&mode=release&edition=2015))
```
error[E0597]: `**x` does not live long enough
 --> src/main.rs:3:40
  |
3 | fn foo(x: Box<&mut i32>) -> &mut i32 { &mut **x }
  |                                        ^^^^^^^^ - `**x` dropped here while still borrowed
  |                                        |
  |                                        borrowed value does not live long enough
  |
note: borrowed value must be valid for the anonymous lifetime #1 defined on the function body at 3:1...
 --> src/main.rs:3:1
  |
3 | fn foo(x: Box<&mut i32>) -> &mut i32 { &mut **x }
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: aborting due to previous error
```

2. The second main thing: The reason such code was previously rejected was because NLL (MIR-borrowck) incorporates a fix for issue #31567, where it models a destructor's execution as potentially accessing any borrows held by the thing being destructed. The tests with `Scribble` model this, showing that the compiler now catches such unsoundness.

However, that fix for issue #31567 is too strong, in that NLL (MIR-borrowck) includes `Box` as one of the types with a destructor that potentially accesses any borrows held by the box. This thus was the cause of the main remaining discrepancy between AST-borrowck and MIR-borrowck, as documented in issue #45696, specifically in [the last example of this comment](#45696 (comment)), which I have adapted into the `fn foo` shown above.

We did close issue #45696 back in December of 2017, but AFAICT that example was not fixed by PR #46268. (And we did not include a test, etc etc.)

This PR fixes that case, by trying to model the so-called `DerefPure` semantics of `Box<T>` when we traverse the type of the input to `visit_terminator_drop`.

3. The third main thing is that during a review of the first draft of this PR, @matthewjasper pointed out that the new traversal of `Box<T>` could cause the compiler to infinite loop. I have adjusted the PR to avoid this (by tracking what types we have previously seen), and added a much needed test of this somewhat odd scenario. (Its an odd scenario because the particular case only arises for things like `struct A(Box<A>);`, something which cannot be constructed in practice.)

Fix #45696.
@bors
Copy link
Contributor

bors commented Aug 2, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: eddyb
Pushing 40cb447 to master...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MIR-borrowck: behavior around Box<T> is different from AST borrowck
10 participants