Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reading from the return place is fine #71005

Merged
merged 10 commits into from
Apr 23, 2020
Merged

Reading from the return place is fine #71005

merged 10 commits into from
Apr 23, 2020

Conversation

jonas-schievink
Copy link
Contributor

Const eval thinks that reading from local _0 is UB, but it isn't. _0 is just a normal local like any other, and codegen handles it that way too. The only special thing is that the Return terminator will read from it.

I've hit these errors while working on an NRVO pass that can merge other locals with _0 in #71003.

r? @oli-obk

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 10, 2020
@eddyb
Copy link
Member

eddyb commented Apr 10, 2020

cc @rust-lang/wg-const-eval

@RalfJung
Copy link
Member

RalfJung commented Apr 10, 2020

Ah, this is an ooooold assertion in Miri, and I wondered about it multiple times.

But I am also a bit worried that there might be some things Miri is doing that actually rely on this being UB...

For once, in a function call _res = f(...), Miri currently directly makes _res the return place of f. There is no extra copy or so. I am a bit worried that if the return place can be used like any other local, this might cause weird aliasing-related issues. I honestly do not know what a MIR specification should even say here about how return places are used on function calls -- should there be a temporary and the final result is copied after the call returned, or should there be direct access? Function arguments get copied, for example.

Also, if the return type is a ZST, is it okay for the return place to be "dangling" or even NULL? (Miri currently avoids actually allocating memory for the return place in some situations where we know the return type to be ().)

@eddyb
Copy link
Member

eddyb commented Apr 10, 2020

@RalfJung My understand is that something like _7 = call foo(move _7) is UB.
And I suppose _7 = call foo(copy _7) wouldn't be.

That is, a moved place passed as an argument and the return place are not allowed to alias, because/so that they may be passed by reference to the callee.

miri can do the same thing:

  1. (optional) enforce that the places passed down don't overlap (not sure if Stacked Borrows wants to check e.g. references to the return place that may be in some argument)
  2. pass both move arguments and the return place "by reference"

@RalfJung
Copy link
Member

That is, a moved place passed as an argument and the return place are not allowed to alias, because/so that they may be passed by reference to the callee.

But what about

let _ptr = &raw mut _7;
_7 = foo(ptr);

What exactly is foo allowed to do with ptr, how is that supposed to be checked by Miri, and for the allowed cases what is the correct behavior? Ideally there should be no crazy special cases for this in Miri.

(All of this will be a nightmare to test as I assume we cannot even generate such MIR.)

(optional) enforce that the places passed down don't overlap (not sure if Stacked Borrows wants to check e.g. references to the return place that may be in some argument)

That sounds like a special hack needs to be added to a bunch of places. So far, Miri has exactly one explicit check for whether things overlap, and that is to validate copy_nonoverlaping. The "mutable references may not overlap" is descriptive, not normative -- it arises as an emergent property from Stacked Borrows.

I feel something is very deeply wrong with any proposal that requires explicit overlap checks.

The easiest answer, I think, is to say that _tmp = foo(...) allocates some memory (just like we do for any other local of foo, _0 is currently the only local that we do not do this for), so that the return place definitely does not alias with anything. (We have no syntax for &_0, right?) Then, once foo returned, we copy the content of the return place to _tmp. That is easy to implement and even gets rid of a special hack that we currently need.

However, I do not know whether this accurately models our LLVM codegen.

@bjorn3
Copy link
Member

bjorn3 commented Apr 11, 2020

However, I do not know whether this accurately models our LLVM codegen.

If the return value can't be returned in one or more registers, the a pointer to the return place is passed to the function as argument and then any writes to _0 will directly write to the place the return value of the call should be stored.

Function prologue:

if local == mir::RETURN_PLACE && fx.fn_abi.ret.is_indirect() {
debug!("alloc: {:?} (return place) -> place", local);
let llretptr = bx.get_param(0);
return LocalRef::Place(PlaceRef::new_sized(llretptr, layout));
}

Call terminator codegen:

if fn_ret.is_indirect() {
if dest.align < dest.layout.align.abi {
// Currently, MIR code generation does not create calls
// that store directly to fields of packed structs (in
// fact, the calls it creates write only to temps).
//
// If someone changes that, please update this code path
// to create a temporary.
span_bug!(self.mir.span, "can't directly store to unaligned value");
}
llargs.push(dest.llval);
ReturnDest::Nothing
} else {

Return terminator codegen:

PassMode::Ignore | PassMode::Indirect(..) => {
bx.ret_void();
return;
}

@eddyb
Copy link
Member

eddyb commented Apr 11, 2020

We have no syntax for &_0, right?

_0 is just another local, so yeah, we do, and we want to.

The goal of NRVO is to optimize this to pass the return value pointer to fill:

// Something large.
struct Buf([u8; 128]);
fn fill(buf: &mut Buf) {...}
fn make() -> Buf {
    let mut buf = Buf::new();
    fill(&mut buf);
    buf
}

@eddyb
Copy link
Member

eddyb commented Apr 11, 2020

@RalfJung I think a good way to (conservatively) model our codegen is to &mut-borrow (well, &own, but I think you can get away without that) the return place and all of the arguments, at the callsite, and use dereferences of those borrows in the callee, as _0 and the arg locals.

@RalfJung
Copy link
Member

@eddyb

I think a good way to (conservatively) model our codegen is to &mut-borrow (well, &own, but I think you can get away without that) the return place and all of the arguments, at the callsite, and use dereferences of those borrows in the callee, as _0 and the arg locals.

Arguments are operands, not places, so I don't know what it means the &mut-borrow them.
But what I could certainly do is add some code that, at the beginning of a function call, performs a mutable re-borrow of the return place, with a protector. This should mean that any access with another pointer is insta-UB.

However, I am somewhat dissatisfied with how much of a special-case-hack this is -- typically we make reborrowing explicit in MIR, but here, that is not possible because w can only access the return place, but never access it as a reference.

@RalfJung
Copy link
Member

An alternative would be to actually do this "allocate a dedicated return place" dance, consistently with any other local. You said yourself _0 is just another local, but any solution you proposed still continues to special-case it (see also the extra case distinction for RETURN_PLACE that this PR still needs to make).

The question then is how to justify codegen. Basically codegen would have to ensure that the temporary that is used for the return place never had its address taken. Currently that is the case, but I am not sure if it always will be.

@eddyb
Copy link
Member

eddyb commented Apr 13, 2020

Arguments are operands, not places, so I don't know what it means the &mut-borrow them

TerminatorKind::Call arguably abuses Operand::Move(place) to pass the place (i.e. as if it it were doing borrows, specifically the hypothetical &own/&move).

Operand::Copy and Operand::Const behave as expected.

@RalfJung
Copy link
Member

TerminatorKind::Call arguably abuses Operand::Move(place) to pass the place (i.e. as if it it were doing borrows, specifically the hypothetical &own/&move).

Wait, are you saying that foo(move _2) is suppose to pass _2 by reference? That is not what Miri does, and frankly if that is the intended behavior then it should be represented differently. An Operand, by definition, can only be read from, it cannot be passed by reference.

@eddyb
Copy link
Member

eddyb commented Apr 13, 2020

Wait, are you saying that foo(move _2) is suppose to pass _2 by reference?

Yuuupp, although not necessarily in a way in which the caller can observe changes to _2.

I have been annoyed by Operand being used in cases when it's really Place (in the Operand::Move case, although sometimes even Operand::Copy might behave like that, just not for calls, there we make a temporary copy, I think?).

Remember #68364? These two are basically equivalent (assuming identity just does _0 = _1):

dest = move src;
dest = call identity(move src);

In that for non-Scalar/Vector/ScalarPair layouts, only one memcpy is performed and dest and src can't overlap.

@eddyb
Copy link
Member

eddyb commented Apr 13, 2020

Basically codegen would have to ensure that the temporary that is used for the return place never had its address taken.

For MIR optimizations like NRVO it's important that calls can write the output directly into some variable, that may be borrowed before/after the call (checking the borrows don't overlap the calls at optimization time, not at codegen time).

@RalfJung
Copy link
Member

RalfJung commented Apr 13, 2020

Yuuupp, although not necessarily in a way in which the caller can observe changes to _2.

Oookay... I think we need an issue to figure out the semantics of MIR function calls, because I think Miri does not currently do what you say the semantics are.

For MIR optimizations like NRVO it's important that calls can write the output directly into some variable, that may be borrowed before/after the call (checking the borrows don't overlap the calls at optimization time, not at codegen time).

Sure, but that's an optimization. There is no problem with this as long as it behaves observably entirely equivalent to reserving some dedicated memory for the return place, and copying later.

@RalfJung
Copy link
Member

RalfJung commented Apr 13, 2020

These two are basically equivalent (assuming identity just does _0 = _1):

I understand that's what they lower to in LLVM, but I am not convinced that they should actually be equivalent operations on the MIR level.

Now this makes me wonder if we could also use such a "mutable reborrow" trick for assignments to avoid an explicit overlap check? (However, layout.abi really shouldn't affect the rules here, that would be quite the layering violation.)

@eddyb
Copy link
Member

eddyb commented Apr 13, 2020

There is no problem with this as long as it behaves observably entirely equivalent to reserving some dedicated memory for the return place, and copying later.

Yes, I was talking specifically about the restriction you suggested, and that I quoted, being undesirable.

@RalfJung
Copy link
Member

Yes, I was talking specifically about the restriction you suggested, and that I quoted, being undesirable.

Complicating MIR semantics is also undesirable though, at least if we want to also use this to precisely capture Rust semantics.

I am beginning to wonder if we should have "simple MIR" and "optimized MIR" with two distinct semantics, where the former is designed for simplicity and used as the target of MIR building, and then we prove that for what MIR building creates, both semantics are equivalent, and then go on using optimized MIR semantics with awful things like sharing the return place between caller and callee.

@eddyb
Copy link
Member

eddyb commented Apr 13, 2020

However, layout.abi really shouldn't affect the rules here, that is an awful layering violation.

Yeah we can be conservative and treat overlaps are errors in miri even codegen wouldn't introduce UB.
So far all of the "data optimizations" I've worked on (SROA/NRVO) don't care about "scalar vs aggregate" at all, and always assumes there's no "immediate load/store", only bulk memory accesses.

@RalfJung
Copy link
Member

RalfJung commented Apr 14, 2020

Let us continue the semantic discussion in #71117.

@jonas-schievink for your PR, I think if we want to make _0 be more like a normal local then we should not add more RETURN_PLACE hacks like this here does, and instead we should try to get rid of the return_place field in Frame.

I left a note in the linked issue for how that could be done:

We can probably get rid of this special treatment if we are okay with losing the "immediate value" optimization for return places; then we could force_allocate the caller-provided return place and make the callee _0 an Indirect local. (This would entirely remove return_place from Frame, which is good I think.)

I think that would be preferable over the approach you are currently pursuing. Let me know if you need any help. There is a chance of this costing performance, so we should perf-test it -- but in terms of code simplification I think this would totally be worth it.

@jonas-schievink
Copy link
Contributor Author

Not sure if this is what you had in mind (it doesn't get rid of return_place), but I've pushed a commit that treats RETURN_PLACE more like a regular local, and copies it to the return_place when leaving a stack frame.

@rust-highfive

This comment has been minimized.

@RalfJung
Copy link
Member

Not sure if this is what you had in mind (it doesn't get rid of return_place), but I've pushed a commit that treats RETURN_PLACE more like a regular local, and copies it to the return_place when leaving a stack frame.

No that's not what I had in mind -- but it is an interesting idea. This now does both Stacked-Borrows-style exclusive reservation of the caller-provided return place, and a separate copy. The separate copy helps simplify Miri but reserving the caller-provided return place should mean that eliding the copy during lowering should always be sound... @eddyb what do you think?

@jonas-schievink
Copy link
Contributor Author

I am assuming you solved the const-prop trouble?

Those disappeared after a rebase, which I don't quite understand. It does seem to work fine now though, yeah.

@bors
Copy link
Contributor

bors commented Apr 18, 2020

☀️ Try build successful - checks-azure
Build commit: ab35b24df7cbdea1c9d064420530a62e86efacaf (ab35b24df7cbdea1c9d064420530a62e86efacaf)

@rust-timer
Copy link
Collaborator

Queued ab35b24df7cbdea1c9d064420530a62e86efacaf with parent 339a938, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit ab35b24df7cbdea1c9d064420530a62e86efacaf, comparison URL.

@jonas-schievink
Copy link
Contributor Author

Hmm, looks like ctfe-stress-4 regresses by up to 8%. This might be acceptable since it's a stress test, but it would be nice to improve that somehow.

However, since I expect this to be caused by the fact that we always allocate at least one local now, I'm not quite sure how (ideas welcome!).

@bors

This comment has been minimized.

@bors

This comment has been minimized.

@RalfJung
Copy link
Member

@jonas-schievink so here's one idea, or a direction at least. With this PR, we actually allocate the return place for every constant twice: once at

let ret = ecx.allocate(layout, MemoryKind::Stack);

and once in push_stack_frame. We should find a way to avoid that.

// Copy the return value to the caller's stack frame.
if let Some(return_place) = frame.return_place {
let op = self.access_local(&frame, mir::RETURN_PLACE, None)?;
self.copy_op_transmute(op, return_place)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we can pull this off, but maybe we can (for non-immediates) make the return place alias with the frame's RETURN_PLACE local. This would be doable by changing the LocalState of the RETURN_PLACE to Live(Operand::Indirect(return_place)) (well you still need to convert the https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/interpret/struct.PlaceTy.html to an MPlaceTy, but that should be doable (and where not, we have an immediate, which is cheap and we can just keep copying it here)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was wondering the same thing. We need to be careful then when popping the frame to not do copy_op (as that would overlap itself) but still do validation.

The main subtlety here is making sure this is a Miri-internal optimization that cannot be observed by the program. The fact that we reborrow the return place should exclude any observations by accessing the return place directly... so I think the only remaining possible observation is through address equality tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So... about this PR. I'd be fine merging it with the 8% regression on the stress tests (with an open issue for fixing it again). Then we can experiment with this local aliasing trick in a PR that does nothing but the optimization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be nice. I don't think I have the bandwidth for these deeper changes to const eval.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that something we should nominate for T-compiler decision?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @rust-lang/wg-const-eval can make this decision on our own, since it's just a stress test being regressed. Regressing the test is not an issue, we just have the test to make sure we don't do so accidentally

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, fine for me. @ecstatic-morse what do you think?

@jonas-schievink can you create an issue describing the perf regression and linking to this discussion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #71463

@oli-obk
Copy link
Contributor

oli-obk commented Apr 23, 2020

@bors r+

This PR will regress perf by around 8% on a ctfe stress test, because we now allocate return values twice and copy them over on function return. I consider this acceptable, and we have #71463 open to make sure we don't forget to fix the regression

@bors
Copy link
Contributor

bors commented Apr 23, 2020

📌 Commit 415fd0c has been approved by oli-obk

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 23, 2020
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 23, 2020
Rollup of 6 pull requests

Successful merges:

 - rust-lang#71005 (Reading from the return place is fine)
 - rust-lang#71198 (Const check/promotion cleanup and sanity assertion)
 - rust-lang#71396 (Improve E0308 error message wording again)
 - rust-lang#71452 (Remove outdated reference to interpreter snapshotting)
 - rust-lang#71454 (Inline some function docs in `core::ptr`)
 - rust-lang#71461 (Improve E0567 explanation)

Failed merges:

r? @ghost
@bors bors merged commit 61fbc6a into rust-lang:master Apr 23, 2020
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Apr 23, 2020
…chievink

add back Scalar::null_ptr

We were a bit overeager with removing this in rust-lang#71005, Miri uses this function quite a bit.

The important part is that the `Place::null` methods are gone. :)
r? @jonas-schievink @oli-obk

Fixes rust-lang#71474
@ecstatic-morse ecstatic-morse mentioned this pull request May 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants