Reading from the return place is fine #71005

jonas-schievink · 2020-04-10T20:14:36Z

Const eval thinks that reading from local _0 is UB, but it isn't. _0 is just a normal local like any other, and codegen handles it that way too. The only special thing is that the Return terminator will read from it.

I've hit these errors while working on an NRVO pass that can merge other locals with _0 in #71003.

r? @oli-obk

eddyb · 2020-04-10T20:21:27Z

cc @rust-lang/wg-const-eval

RalfJung · 2020-04-10T22:17:21Z

Ah, this is an ooooold assertion in Miri, and I wondered about it multiple times.

But I am also a bit worried that there might be some things Miri is doing that actually rely on this being UB...

For once, in a function call _res = f(...), Miri currently directly makes _res the return place of f. There is no extra copy or so. I am a bit worried that if the return place can be used like any other local, this might cause weird aliasing-related issues. I honestly do not know what a MIR specification should even say here about how return places are used on function calls -- should there be a temporary and the final result is copied after the call returned, or should there be direct access? Function arguments get copied, for example.

Also, if the return type is a ZST, is it okay for the return place to be "dangling" or even NULL? (Miri currently avoids actually allocating memory for the return place in some situations where we know the return type to be ().)

src/librustc_mir/interpret/operand.rs

eddyb · 2020-04-10T22:24:15Z

@RalfJung My understand is that something like _7 = call foo(move _7) is UB.
And I suppose _7 = call foo(copy _7) wouldn't be.

That is, a moved place passed as an argument and the return place are not allowed to alias, because/so that they may be passed by reference to the callee.

miri can do the same thing:

(optional) enforce that the places passed down don't overlap (not sure if Stacked Borrows wants to check e.g. references to the return place that may be in some argument)
pass both move arguments and the return place "by reference"

RalfJung · 2020-04-11T09:48:15Z

That is, a moved place passed as an argument and the return place are not allowed to alias, because/so that they may be passed by reference to the callee.

But what about

let _ptr = &raw mut _7;
_7 = foo(ptr);

What exactly is foo allowed to do with ptr, how is that supposed to be checked by Miri, and for the allowed cases what is the correct behavior? Ideally there should be no crazy special cases for this in Miri.

(All of this will be a nightmare to test as I assume we cannot even generate such MIR.)

(optional) enforce that the places passed down don't overlap (not sure if Stacked Borrows wants to check e.g. references to the return place that may be in some argument)

That sounds like a special hack needs to be added to a bunch of places. So far, Miri has exactly one explicit check for whether things overlap, and that is to validate copy_nonoverlaping. The "mutable references may not overlap" is descriptive, not normative -- it arises as an emergent property from Stacked Borrows.

I feel something is very deeply wrong with any proposal that requires explicit overlap checks.

The easiest answer, I think, is to say that _tmp = foo(...) allocates some memory (just like we do for any other local of foo, _0 is currently the only local that we do not do this for), so that the return place definitely does not alias with anything. (We have no syntax for &_0, right?) Then, once foo returned, we copy the content of the return place to _tmp. That is easy to implement and even gets rid of a special hack that we currently need.

However, I do not know whether this accurately models our LLVM codegen.

bjorn3 · 2020-04-11T12:43:32Z

However, I do not know whether this accurately models our LLVM codegen.

If the return value can't be returned in one or more registers, the a pointer to the return place is passed to the function as argument and then any writes to _0 will directly write to the place the return value of the call should be stored.

Function prologue:

rust/src/librustc_codegen_ssa/mir/mod.rs

Lines 204 to 208 in 1f3b659

    
           if local == mir::RETURN_PLACE && fx.fn_abi.ret.is_indirect() { 
        
               debug!("alloc: {:?} (return place) -> place", local); 
        
               let llretptr = bx.get_param(0); 
        
               return LocalRef::Place(PlaceRef::new_sized(llretptr, layout)); 
        
           }

Call terminator codegen:

rust/src/librustc_codegen_ssa/mir/block.rs

Lines 1171 to 1183 in 1f3b659

    
           if fn_ret.is_indirect() { 
        
               if dest.align < dest.layout.align.abi { 
        
                   // Currently, MIR code generation does not create calls 
        
                   // that store directly to fields of packed structs (in 
        
                   // fact, the calls it creates write only to temps). 
        
                   // 
        
                   // If someone changes that, please update this code path 
        
                   // to create a temporary. 
        
                   span_bug!(self.mir.span, "can't directly store to unaligned value"); 
        
               } 
        
               llargs.push(dest.llval); 
        
               ReturnDest::Nothing 
        
           } else {

Return terminator codegen:

rust/src/librustc_codegen_ssa/mir/block.rs

Lines 257 to 260 in 1f3b659

    
           PassMode::Ignore | PassMode::Indirect(..) => { 
        
               bx.ret_void(); 
        
               return; 
        
           }

eddyb · 2020-04-11T13:23:56Z

We have no syntax for &_0, right?

_0 is just another local, so yeah, we do, and we want to.

The goal of NRVO is to optimize this to pass the return value pointer to fill:

// Something large.
struct Buf([u8; 128]);
fn fill(buf: &mut Buf) {...}
fn make() -> Buf {
    let mut buf = Buf::new();
    fill(&mut buf);
    buf
}

eddyb · 2020-04-11T13:26:22Z

@RalfJung I think a good way to (conservatively) model our codegen is to &mut-borrow (well, &own, but I think you can get away without that) the return place and all of the arguments, at the callsite, and use dereferences of those borrows in the callee, as _0 and the arg locals.

RalfJung · 2020-04-13T09:47:00Z

@eddyb

I think a good way to (conservatively) model our codegen is to &mut-borrow (well, &own, but I think you can get away without that) the return place and all of the arguments, at the callsite, and use dereferences of those borrows in the callee, as _0 and the arg locals.

Arguments are operands, not places, so I don't know what it means the &mut-borrow them.
But what I could certainly do is add some code that, at the beginning of a function call, performs a mutable re-borrow of the return place, with a protector. This should mean that any access with another pointer is insta-UB.

However, I am somewhat dissatisfied with how much of a special-case-hack this is -- typically we make reborrowing explicit in MIR, but here, that is not possible because w can only access the return place, but never access it as a reference.

RalfJung · 2020-04-13T09:48:45Z

An alternative would be to actually do this "allocate a dedicated return place" dance, consistently with any other local. You said yourself _0 is just another local, but any solution you proposed still continues to special-case it (see also the extra case distinction for RETURN_PLACE that this PR still needs to make).

The question then is how to justify codegen. Basically codegen would have to ensure that the temporary that is used for the return place never had its address taken. Currently that is the case, but I am not sure if it always will be.

src/librustc_mir/interpret/operand.rs

eddyb · 2020-04-13T10:36:54Z

Arguments are operands, not places, so I don't know what it means the &mut-borrow them

TerminatorKind::Call arguably abuses Operand::Move(place) to pass the place (i.e. as if it it were doing borrows, specifically the hypothetical &own/&move).

Operand::Copy and Operand::Const behave as expected.

RalfJung · 2020-04-13T11:38:24Z

TerminatorKind::Call arguably abuses Operand::Move(place) to pass the place (i.e. as if it it were doing borrows, specifically the hypothetical &own/&move).

Wait, are you saying that foo(move _2) is suppose to pass _2 by reference? That is not what Miri does, and frankly if that is the intended behavior then it should be represented differently. An Operand, by definition, can only be read from, it cannot be passed by reference.

eddyb · 2020-04-13T12:13:25Z

Wait, are you saying that foo(move _2) is suppose to pass _2 by reference?

Yuuupp, although not necessarily in a way in which the caller can observe changes to _2.

I have been annoyed by Operand being used in cases when it's really Place (in the Operand::Move case, although sometimes even Operand::Copy might behave like that, just not for calls, there we make a temporary copy, I think?).

Remember #68364? These two are basically equivalent (assuming identity just does _0 = _1):

dest = move src;

dest = call identity(move src);

In that for non-Scalar/Vector/ScalarPair layouts, only one memcpy is performed and dest and src can't overlap.

eddyb · 2020-04-13T12:18:48Z

Basically codegen would have to ensure that the temporary that is used for the return place never had its address taken.

For MIR optimizations like NRVO it's important that calls can write the output directly into some variable, that may be borrowed before/after the call (checking the borrows don't overlap the calls at optimization time, not at codegen time).

RalfJung · 2020-04-13T12:31:19Z

Yuuupp, although not necessarily in a way in which the caller can observe changes to _2.

Oookay... I think we need an issue to figure out the semantics of MIR function calls, because I think Miri does not currently do what you say the semantics are.

For MIR optimizations like NRVO it's important that calls can write the output directly into some variable, that may be borrowed before/after the call (checking the borrows don't overlap the calls at optimization time, not at codegen time).

Sure, but that's an optimization. There is no problem with this as long as it behaves observably entirely equivalent to reserving some dedicated memory for the return place, and copying later.

RalfJung · 2020-04-13T12:34:24Z

These two are basically equivalent (assuming identity just does _0 = _1):

I understand that's what they lower to in LLVM, but I am not convinced that they should actually be equivalent operations on the MIR level.

Now this makes me wonder if we could also use such a "mutable reborrow" trick for assignments to avoid an explicit overlap check? (However, layout.abi really shouldn't affect the rules here, that would be quite the layering violation.)

eddyb · 2020-04-13T12:34:47Z

There is no problem with this as long as it behaves observably entirely equivalent to reserving some dedicated memory for the return place, and copying later.

Yes, I was talking specifically about the restriction you suggested, and that I quoted, being undesirable.

RalfJung · 2020-04-13T12:37:09Z

Yes, I was talking specifically about the restriction you suggested, and that I quoted, being undesirable.

Complicating MIR semantics is also undesirable though, at least if we want to also use this to precisely capture Rust semantics.

I am beginning to wonder if we should have "simple MIR" and "optimized MIR" with two distinct semantics, where the former is designed for simplicity and used as the target of MIR building, and then we prove that for what MIR building creates, both semantics are equivalent, and then go on using optimized MIR semantics with awful things like sharing the return place between caller and callee.

eddyb · 2020-04-13T12:38:18Z

However, layout.abi really shouldn't affect the rules here, that is an awful layering violation.

Yeah we can be conservative and treat overlaps are errors in miri even codegen wouldn't introduce UB.
So far all of the "data optimizations" I've worked on (SROA/NRVO) don't care about "scalar vs aggregate" at all, and always assumes there's no "immediate load/store", only bulk memory accesses.

RalfJung · 2020-04-14T09:07:43Z

Let us continue the semantic discussion in #71117.

@jonas-schievink for your PR, I think if we want to make _0 be more like a normal local then we should not add more RETURN_PLACE hacks like this here does, and instead we should try to get rid of the return_place field in Frame.

I left a note in the linked issue for how that could be done:

We can probably get rid of this special treatment if we are okay with losing the "immediate value" optimization for return places; then we could force_allocate the caller-provided return place and make the callee _0 an Indirect local. (This would entirely remove return_place from Frame, which is good I think.)

I think that would be preferable over the approach you are currently pursuing. Let me know if you need any help. There is a chance of this costing performance, so we should perf-test it -- but in terms of code simplification I think this would totally be worth it.

jonas-schievink · 2020-04-15T21:31:02Z

Not sure if this is what you had in mind (it doesn't get rid of return_place), but I've pushed a commit that treats RETURN_PLACE more like a regular local, and copies it to the return_place when leaving a stack frame.

RalfJung · 2020-04-15T23:00:14Z

Not sure if this is what you had in mind (it doesn't get rid of return_place), but I've pushed a commit that treats RETURN_PLACE more like a regular local, and copies it to the return_place when leaving a stack frame.

No that's not what I had in mind -- but it is an interesting idea. This now does both Stacked-Borrows-style exclusive reservation of the caller-provided return place, and a separate copy. The separate copy helps simplify Miri but reserving the caller-provided return place should mean that eliding the copy during lowering should always be sound... @eddyb what do you think?

src/librustc_mir/interpret/eval_context.rs

jonas-schievink · 2020-04-18T16:08:03Z

I am assuming you solved the const-prop trouble?

Those disappeared after a rebase, which I don't quite understand. It does seem to work fine now though, yeah.

bors · 2020-04-18T18:43:28Z

☀️ Try build successful - checks-azure
Build commit: ab35b24df7cbdea1c9d064420530a62e86efacaf (ab35b24df7cbdea1c9d064420530a62e86efacaf)

rust-timer · 2020-04-18T18:43:30Z

Queued ab35b24df7cbdea1c9d064420530a62e86efacaf with parent 339a938, future comparison URL.

rust-timer · 2020-04-18T23:02:19Z

Finished benchmarking try commit ab35b24df7cbdea1c9d064420530a62e86efacaf, comparison URL.

jonas-schievink · 2020-04-18T23:23:46Z

Hmm, looks like ctfe-stress-4 regresses by up to 8%. This might be acceptable since it's a stress test, but it would be nice to improve that somehow.

However, since I expect this to be caused by the fact that we always allocate at least one local now, I'm not quite sure how (ideas welcome!).

Copy its value to the `return_place` upon leaving a call frame

Mostly renamed allocations, but I'm not sure about the const prop tests

RalfJung · 2020-04-20T22:08:00Z

@jonas-schievink so here's one idea, or a direction at least. With this PR, we actually allocate the return place for every constant twice: once at

rust/src/librustc_mir/const_eval/eval_queries.rs

Line 33 in 8ce3f84

let ret = ecx.allocate(layout, MemoryKind::Stack);

and once in push_stack_frame. We should find a way to avoid that.

oli-obk · 2020-04-21T15:01:39Z

src/librustc_mir/interpret/eval_context.rs

+            // Copy the return value to the caller's stack frame.
+            if let Some(return_place) = frame.return_place {
+                let op = self.access_local(&frame, mir::RETURN_PLACE, None)?;
+                self.copy_op_transmute(op, return_place)?;


Not sure if we can pull this off, but maybe we can (for non-immediates) make the return place alias with the frame's RETURN_PLACE local. This would be doable by changing the LocalState of the RETURN_PLACE to Live(Operand::Indirect(return_place)) (well you still need to convert the https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/interpret/struct.PlaceTy.html to an MPlaceTy, but that should be doable (and where not, we have an immediate, which is cheap and we can just keep copying it here)

Yeah, I was wondering the same thing. We need to be careful then when popping the frame to not do copy_op (as that would overlap itself) but still do validation.

The main subtlety here is making sure this is a Miri-internal optimization that cannot be observed by the program. The fact that we reborrow the return place should exclude any observations by accessing the return place directly... so I think the only remaining possible observation is through address equality tests.

So... about this PR. I'd be fine merging it with the 8% regression on the stress tests (with an open issue for fixing it again). Then we can experiment with this local aliasing trick in a PR that does nothing but the optimization.

That would be nice. I don't think I have the bandwidth for these deeper changes to const eval.

Is that something we should nominate for T-compiler decision?

I think @rust-lang/wg-const-eval can make this decision on our own, since it's just a stress test being regressed. Regressing the test is not an issue, we just have the test to make sure we don't do so accidentally

Okay, fine for me. @ecstatic-morse what do you think?

@jonas-schievink can you create an issue describing the perf regression and linking to this discussion?

Opened #71463

oli-obk · 2020-04-23T12:29:42Z

@bors r+

This PR will regress perf by around 8% on a ctfe stress test, because we now allocate return values twice and copy them over on function return. I consider this acceptable, and we have #71463 open to make sure we don't forget to fix the regression

bors · 2020-04-23T12:29:43Z

📌 Commit 415fd0c has been approved by oli-obk

@ghost

Rollup of 6 pull requests Successful merges: - rust-lang#71005 (Reading from the return place is fine) - rust-lang#71198 (Const check/promotion cleanup and sanity assertion) - rust-lang#71396 (Improve E0308 error message wording again) - rust-lang#71452 (Remove outdated reference to interpreter snapshotting) - rust-lang#71454 (Inline some function docs in `core::ptr`) - rust-lang#71461 (Improve E0567 explanation) Failed merges: r? @ghost

@jonas-schievink

…chievink add back Scalar::null_ptr We were a bit overeager with removing this in rust-lang#71005, Miri uses this function quite a bit. The important part is that the `Place::null` methods are gone. :) r? @jonas-schievink @oli-obk Fixes rust-lang#71474

rust-highfive assigned oli-obk Apr 10, 2020

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 10, 2020

RalfJung reviewed Apr 10, 2020

View reviewed changes

src/librustc_mir/interpret/operand.rs Outdated Show resolved Hide resolved

eddyb mentioned this pull request Apr 11, 2020

[WIP] NRVO pass for acyclic CFGs #71003

Closed

RalfJung reviewed Apr 13, 2020

View reviewed changes

src/librustc_mir/interpret/operand.rs Outdated Show resolved Hide resolved

This was referenced Apr 13, 2020

Semantics of MIR assignments, around aliasing, ordering, and primitives. #68364

Open

Semantics of MIR function calls #71117

Open

This comment has been minimized.

Sign in to view

RalfJung reviewed Apr 15, 2020

View reviewed changes

src/librustc_mir/interpret/eval_context.rs Show resolved Hide resolved

This comment has been minimized.

Sign in to view

jonas-schievink added 10 commits April 20, 2020 21:18

Treat RETURN_PLACE as a normal Local

4eaf535

Copy its value to the `return_place` upon leaving a call frame

Remove null places

a417f96

Fix pop_stack_frame logic

34ed891

Dump return_place upon returning

0fda0fd

Update const prop

c5bfbb6

Fix codegen and mir-opt tests

a5c1851

Mostly renamed allocations, but I'm not sure about the const prop tests

Remove unnecessary block

f0ab469

Bless 32-bit test output

1ce6e6a

Use copy_op_transmute

9e6f38a

const prop: don't special case return place

415fd0c

oli-obk reviewed Apr 21, 2020

View reviewed changes

jonas-schievink mentioned this pull request Apr 23, 2020

Address CTFE perf regression introduced by #71005 #71463

Open

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 23, 2020

Dylan-DPC-zz mentioned this pull request Apr 23, 2020

Rollup of 6 pull requests #71467

Merged

bors merged commit 61fbc6a into rust-lang:master Apr 23, 2020

RalfJung mentioned this pull request Apr 23, 2020

add back Scalar::null_ptr #71479

Merged

ecstatic-morse mentioned this pull request May 15, 2020

Dumb NRVO #72205

Merged

Reading from the return place is fine #71005

Reading from the return place is fine #71005

Conversation

jonas-schievink commented Apr 10, 2020

eddyb commented Apr 10, 2020

RalfJung commented Apr 10, 2020 • edited Loading

eddyb commented Apr 10, 2020

RalfJung commented Apr 11, 2020

bjorn3 commented Apr 11, 2020

eddyb commented Apr 11, 2020

eddyb commented Apr 11, 2020

RalfJung commented Apr 13, 2020

RalfJung commented Apr 13, 2020

eddyb commented Apr 13, 2020

RalfJung commented Apr 13, 2020

eddyb commented Apr 13, 2020

eddyb commented Apr 13, 2020

RalfJung commented Apr 13, 2020 • edited Loading

RalfJung commented Apr 13, 2020 • edited Loading

eddyb commented Apr 13, 2020

RalfJung commented Apr 13, 2020

eddyb commented Apr 13, 2020

RalfJung commented Apr 14, 2020 • edited Loading

jonas-schievink commented Apr 15, 2020

This comment has been minimized.

RalfJung commented Apr 15, 2020

jonas-schievink commented Apr 18, 2020

bors commented Apr 18, 2020

rust-timer commented Apr 18, 2020

rust-timer commented Apr 18, 2020

jonas-schievink commented Apr 18, 2020

This comment has been minimized.

This comment has been minimized.

RalfJung commented Apr 20, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

oli-obk commented Apr 23, 2020

bors commented Apr 23, 2020

RalfJung commented Apr 10, 2020 •

edited

Loading

RalfJung commented Apr 13, 2020 •

edited

Loading

RalfJung commented Apr 13, 2020 •

edited

Loading

RalfJung commented Apr 14, 2020 •

edited

Loading