Partial future-proofing for Box<T, A> #50097

Merged: 5 commits, Apr 27, 2018

Conversation

@glandium
Contributor

glandium commented Apr 20, 2018

In some ways, this is similar to @eddyb's PR #47043 that went stale, but doesn't cover everything. Notably, this still leaves Box internalized as a pointer in places, so in practice only ZSTs can be added to the Box type with the changes here (the compiler ICEs otherwise).

The Box type is not changed here; that's left for the future because I want to test that further first, but this puts things in place in a way that hopefully will make things easier.

Currently, MIR just passes the raw Box to box_free(), which happens to
work because practically, it's the same thing. But that might not be
true in the future, with Box<T, A: Alloc>.

The MIR inline pass actually fixes up the argument while inlining
box_free, but this is not enabled by default and doesn't necessarily
happen (the inline threshold needs to be passed).

This change effectively moves what the MIR inline pass does to the
elaborate_drops pass, so that box_free() is passed the raw pointer
instead of the Box.

Because box_free is now passed a pointer instead of a Box, we can stop
relying on TypeChecker::check_box_free_inputs, because
TypeChecker::check_call_inputs should be enough, like for all other
function calls.

It seems it was not actually reached anyway in cases where it would
have made a difference (issue rust-lang#50071).
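
As a rough illustration of the drop sequence these commit messages describe (drop the contents, then hand box_free a raw pointer rather than the Box), here is a minimal sketch in ordinary Rust. The names box_free_sketch and drop_box_sketch are made up for this example; the real box_free is a lang item called from compiler-generated MIR, not from user code.

```rust
use std::alloc::{dealloc, Layout};
use std::ptr;

// Hypothetical stand-in for the `box_free` lang item: it receives a raw
// pointer, not the Box itself, and only releases the memory.
unsafe fn box_free_sketch<T>(ptr: *mut T) {
    let layout = Layout::new::<T>();
    // Zero-sized values never went through the allocator.
    if layout.size() != 0 {
        unsafe { dealloc(ptr as *mut u8, layout) };
    }
}

// Roughly the order of operations of a Box drop: run the destructor of the
// contents in place, then pass the raw pointer to the free function.
fn drop_box_sketch<T>(b: Box<T>) {
    let raw: *mut T = Box::into_raw(b);
    unsafe {
        ptr::drop_in_place(raw);
        box_free_sketch(raw);
    }
}

fn main() {
    drop_box_sketch(Box::new(String::from("hello")));
    drop_box_sketch(Box::new(())); // zero-sized: nothing to deallocate
}
```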
box_free currently takes a pointer. With the prospect of the Box type
definition changing in the future to include an allocator, box_free will
also need to be aware of this. In order to prepare for that future, we
allow box_free to take a form where its arguments are the fields of the
Box.

e.g. if Box is defined as `Box(A, B, C)`, then the box_free signature
becomes `box_free(a: A, b: B, c: C)`.

We however still allow the current form (taking a pointer), so that the
same compiler can handle both forms, which helps with bootstrap.
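
To make the "fields become arguments" form concrete, here is a small sketch with a two-field box. BoxSketch, AllocSketch, box_free_sketch and drop_box_sketch are all hypothetical names for illustration; in the compiler the field-by-field call is generated in MIR rather than written by hand.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::ptr::NonNull;

// Hypothetical zero-sized allocator handle.
struct AllocSketch;

// Hypothetical box defined as a tuple struct of two fields.
// It has no Drop impl: memory is only released through drop_box_sketch.
struct BoxSketch<T>(NonNull<T>, AllocSketch);

impl<T> BoxSketch<T> {
    fn new(value: T) -> Self {
        let layout = Layout::new::<T>();
        let ptr = if layout.size() == 0 {
            NonNull::dangling()
        } else {
            let raw = unsafe { alloc(layout) } as *mut T;
            match NonNull::new(raw) {
                Some(p) => p,
                None => handle_alloc_error(layout),
            }
        };
        unsafe { ptr.as_ptr().write(value) };
        BoxSketch(ptr, AllocSketch)
    }

    fn get(&self) -> &T {
        unsafe { self.0.as_ref() }
    }
}

// The free function's parameters mirror the box's fields one by one.
unsafe fn box_free_sketch<T>(ptr: NonNull<T>, _alloc: AllocSketch) {
    let layout = Layout::new::<T>();
    if layout.size() != 0 {
        unsafe { dealloc(ptr.as_ptr() as *mut u8, layout) };
    }
}

fn drop_box_sketch<T>(b: BoxSketch<T>) {
    // Split the box into its fields and pass each one as an argument.
    let BoxSketch(ptr, alloc) = b;
    unsafe {
        std::ptr::drop_in_place(ptr.as_ptr());
        box_free_sketch(ptr, alloc);
    }
}

fn main() {
    let b = BoxSketch::new(5u32);
    println!("{}", b.get());
    drop_box_sketch(b);
}
```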
…meters

A Box type with associated allocator would, on its own, be a backwards
incompatible change, because of the additional parameter, but if that
additional parameter has a default, then backwards compatibility with
the current definition of the type is preserved.

But the owned_box lang item currently doesn't allow such extra
parameters, so add support for this.
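
The backwards-compatibility argument rests on default type parameters. A minimal sketch, using made-up types (BoxSketch, Global, Arena) rather than the real Box or allocator API: code that spells the type with one parameter keeps compiling and refers to the same type as the two-parameter spelling with the default filled in.

```rust
use std::any::TypeId;

// Hypothetical allocator markers, not the real allocator API.
struct Global;
struct Arena;

// A box-like type whose extra parameter has a default.
#[allow(dead_code)]
struct BoxSketch<T, A = Global> {
    value: T,
    alloc: A,
}

// Written against the old, one-parameter spelling of the type.
fn unwrap_old_style(b: BoxSketch<u32>) -> u32 {
    b.value
}

// `BoxSketch<u32>` and `BoxSketch<u32, Global>` name the same type.
fn same_type() -> bool {
    TypeId::of::<BoxSketch<u32>>() == TypeId::of::<BoxSketch<u32, Global>>()
}

fn main() {
    let b = BoxSketch { value: 7, alloc: Global };
    assert_eq!(unwrap_old_style(b), 7);
    assert!(same_type());
    // A non-default allocator needs the second parameter spelled out.
    let _other: BoxSketch<u32, Arena> = BoxSketch { value: 1, alloc: Arena };
}
```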
@glandium
Contributor Author

@bors try

@bors
Contributor

bors commented Apr 20, 2018

@glandium: 🔑 Insufficient privileges: not in try users

@Mark-Simulacrum
Member

@bors try

@bors
Contributor

bors commented Apr 20, 2018

⌛ Trying commit 64f5233 with merge 349c1c0fa7d4522250306a50b4828a911650a414...

@bors
Contributor

bors commented Apr 20, 2018

☀️ Test successful - status-travis
State: approved= try=True

@pietroalbini pietroalbini added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 20, 2018
@pietroalbini
Member

Thanks for the PR! Highfive is currently not working, so I'm assigning a reviewer from the compiler team randomly.

@glandium
Contributor Author

FWIW, with these patches, it's also possible to use owned_box and box_free on the Box type provided in my allocator_api crate (with a few additions to the crate that I'll probably commit when this lands; and light testing indicates it works fine).

@glandium
Contributor Author

@eddyb since you were assigned to this, you may notice the code looks familiar. Interestingly, I wrote most of it before you pointed me in the direction of PR #47043, so this is reassuring to me that I independently came up with similar code, considering that was my first time in the compiler guts.

@TimNN
Contributor

TimNN commented Apr 24, 2018

Ping from triage @eddyb ! This PR needs your review.

let free_inputs = free_sig.inputs();
// If the box_free function takes a *mut T, transform the Box into
// such a pointer before calling box_free. Otherwise, pass it all
// the fields in the Box as individual arguments.
Member

I think supporting the *mut T form adds unnecessary clutter to the compiler.

Contributor Author

This is only temporary clutter, and really makes the transition easier. For example, this could land now in 1.27, and the liballoc changes could be done in 1.28 without having a cfg(stage0)-specific Box implementation. /That/ would be clutter.

Member

#[cfg(stage0)] box_free is tiny compared to this, since all it needs to do is call the real implementation, which it can do in much less code. (See my patch for more details)

Contributor Author

The other problem is that a #[cfg(stage0)] box_free with the old signature can't be compiled by the new code, which means if, say, version 1.27 has the new code, you can't bootstrap 1.27.1 (if there ends up being one) with 1.27, which, typically, linux distros would be doing.

Member

Ehm, we don't support that - or rather, you set a flag in config.toml that turns off stage0. cc @alexcrichton

Member

(we pretty much always use #[cfg(stage0)] like this)

As of now, Box only contains a Unique pointer, so this is the sole
argument to box_free. Consequently, we remove the code supporting
the previous box_free signature. We do, however, keep the old definition
for bootstrapping purposes.
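
A sketch of how a cfg-gated pair of definitions like the one discussed in the review thread could look. This is not the liballoc code: box_free_sketch, free_impl and to_arg are hypothetical names, NonNull stands in for the unstable Unique, and stage0 is only honoured here if the file is built with `--cfg stage0` (newer toolchains may warn about the unknown cfg name).

```rust
use std::alloc::{dealloc, Layout};
use std::ptr::NonNull;

// Shared implementation that both variants forward to (hypothetical helper).
unsafe fn free_impl<T>(ptr: NonNull<T>) {
    let layout = Layout::new::<T>();
    if layout.size() != 0 {
        unsafe { dealloc(ptr.as_ptr() as *mut u8, layout) };
    }
}

// New form: takes the Box's single field.
#[cfg(not(stage0))]
unsafe fn box_free_sketch<T>(ptr: NonNull<T>) {
    unsafe { free_impl(ptr) }
}

// Old form kept for bootstrapping: a thin shim over the same implementation.
#[cfg(stage0)]
unsafe fn box_free_sketch<T>(ptr: *mut T) {
    unsafe { free_impl(NonNull::new_unchecked(ptr)) }
}

// Converts a raw pointer into whatever argument the active variant expects.
#[cfg(not(stage0))]
fn to_arg<T>(raw: *mut T) -> NonNull<T> {
    NonNull::new(raw).expect("pointer from Box::into_raw is never null")
}

#[cfg(stage0)]
fn to_arg<T>(raw: *mut T) -> *mut T {
    raw
}

fn main() {
    // u64 has no destructor, so releasing the memory is all that is needed.
    let raw = Box::into_raw(Box::new(42u64));
    unsafe { box_free_sketch(to_arg(raw)) };
}
```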
@eddyb
Member

eddyb commented Apr 25, 2018

@glandium Please update the PR description as well.
LGTM but this is pretty close to my own old patch, so r? @nikomatsakis

@rust-highfive rust-highfive assigned nikomatsakis and unassigned eddyb Apr 25, 2018
@kennytm
Member

kennytm commented Apr 25, 2018

@glandium Are we supposed to crater this?

@glandium
Contributor Author

@kennytm there is no change to exposed APIs, backwards compatible or otherwise, so I don't think this needs one.

@nikomatsakis
Contributor

@bors r+

@bors
Contributor

bors commented Apr 25, 2018

📌 Commit bd8c177 has been approved by nikomatsakis

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 25, 2018
@bors
Contributor

bors commented Apr 27, 2018

⌛ Testing commit bd8c177 with merge 71d3dac...

bors added a commit that referenced this pull request Apr 27, 2018
Partial future-proofing for Box<T, A>

@bors
Contributor

bors commented Apr 27, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: nikomatsakis
Pushing 71d3dac to master...

@alexcrichton
Member

This PR had a negative effect of up to 10% on the wall time of some benchmarks. Has that been looked into, or is it known what's going on there?

@glandium
Contributor Author

Looking at the worst offender, issue-46449, a profile shows that the time spent in llvm is larger (which makes sense considering how essentially only --release builds have been affected). I isolated the regression to the very last commit in the series, and here's the catch: the MIR is strictly identical with and without that commit. So, somehow, changing the argument to box_free from a *mut T to a Unique<T> is the reason for llvm doing more work. Sadly, it's spending all that time for nothing, because the generated code is strictly identical. The differences in llvm-ir are, unsurprisingly, around box_free:

before:

; alloc::alloc::box_free
; Function Attrs: inlinehint nounwind uwtable
define internal fastcc void @_ZN5alloc5alloc8box_free17h74d9b2058bbfc89bE(%"std::io::error::Custom"* %ptr) unnamed_addr #3 {
start:
  %0 = bitcast %"std::io::error::Custom"* %ptr to i8*
  tail call void @__rust_dealloc(i8* %0, i64 24, i64 8) #9
  ret void
} 

; alloc::alloc::box_free
; Function Attrs: inlinehint nounwind uwtable
define internal fastcc void @_ZN5alloc5alloc8box_free17h95ac39da6ae7db02E({}* %ptr.0, {}* noalias nocapture nonnull readonly %ptr.1) unnamed_addr #3 {
start: 
  %0 = bitcast {}* %ptr.1 to i64*
  %1 = getelementptr inbounds i64, i64* %0, i64 1
  %2 = load i64, i64* %1, align 8, !invariant.load !6
  %3 = icmp eq i64 %2, 0
  br i1 %3, label %bb6, label %bb3
  
bb3:                                              ; preds = %start
  %4 = getelementptr inbounds i64, i64* %0, i64 2
  %5 = load i64, i64* %4, align 8, !invariant.load !6
  %6 = bitcast {}* %ptr.0 to i8*
  tail call void @__rust_dealloc(i8* %6, i64 %2, i64 %5) #9
  br label %bb6
  
bb6:                                              ; preds = %start, %bb3
  ret void
} 

after:

; alloc::alloc::box_free
; Function Attrs: inlinehint nounwind uwtable
define internal fastcc void @_ZN5alloc5alloc8box_free17hdda4f80f00ae9ce2E(i8* nonnull %ptr.0, i8* noalias nonnull readonly %ptr.1) unnamed_addr #3 {
start:
  %0 = getelementptr inbounds i8, i8* %ptr.1, i64 8
  %1 = bitcast i8* %0 to i64*
  %2 = load i64, i64* %1, align 8, !invariant.load !6
  %3 = icmp eq i64 %2, 0
  br i1 %3, label %bb7, label %bb4

bb4:                                              ; preds = %start
  %4 = getelementptr inbounds i8, i8* %ptr.1, i64 16
  %5 = bitcast i8* %4 to i64*
  %6 = load i64, i64* %5, align 8, !invariant.load !6
  tail call void @__rust_dealloc(i8* nonnull %ptr.0, i64 %2, i64 %6) #9
  br label %bb7

bb7:                                              ; preds = %start, %bb4
  ret void
}

; alloc::alloc::box_free
; Function Attrs: inlinehint nounwind uwtable
define internal fastcc void @_ZN5alloc5alloc8box_free17hf06ac9a184e5da38E(i64* nonnull %ptr) unnamed_addr #3 {
start:
  %0 = bitcast i64* %ptr to i8*
  tail call void @__rust_dealloc(i8* nonnull %0, i64 24, i64 8) #9
  ret void
}

and in callers, before:

cleanup.body.i.i.i.i.i:                           ; preds = %bb2.i.i.i.i
  %23 = landingpad { i8*, i32 }
          cleanup
  %24 = load {}*, {}** %8, align 8, !nonnull !6
  %25 = load {}*, {}** %10, align 8, !nonnull !6
; call alloc::alloc::box_free
  tail call fastcc void @_ZN5alloc5alloc8box_free17h95ac39da6ae7db02E({}* nonnull %24, {}* noalias nonnull readonly %25) #11
  %26 = load %"std::io::error::Custom"*, %"std::io::error::Custom"** %6, align 8, !nonnull !6
; call alloc::alloc::box_free
  tail call fastcc void @_ZN5alloc5alloc8box_free17h74d9b2058bbfc89bE(%"std::io::error::Custom"* nonnull %26) #11
  resume { i8*, i32 } %23

after:

cleanup.body.i.i.i.i.i:                           ; preds = %bb2.i.i.i.i
  %25 = landingpad { i8*, i32 }
          cleanup
  %26 = bitcast %"std::io::error::Custom"* %7 to i8**
  %27 = load i8*, i8** %26, align 8, !nonnull !6
  %28 = bitcast {}** %10 to i8**
  %29 = load i8*, i8** %28, align 8, !nonnull !6
; call alloc::alloc::box_free
  tail call fastcc void @_ZN5alloc5alloc8box_free17hdda4f80f00ae9ce2E(i8* nonnull %27, i8* noalias nonnull readonly %29) #11
  %30 = bitcast i32* %5 to i64**
  %31 = load i64*, i64** %30, align 8, !nonnull !6
; call alloc::alloc::box_free
  tail call fastcc void @_ZN5alloc5alloc8box_free17hf06ac9a184e5da38E(i64* nonnull %31) #11
  resume { i8*, i32 } %25

(that's from the output of --emit llvm-ir, which is not what rustc generates for llvm, that's after at least some llvm passes)

@eddyb what do you think? do you need more information?
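
For readers following the IR above: the one-argument box_free deallocates with constants (24, 8), while the two-argument one loads the size and alignment through its second pointer, which is presumably the vtable half of a trait-object pointer. The sketch below shows the same facts from the Rust side. Custom here is a made-up 24-byte, 8-aligned type (not std::io::error::Custom), NonNull stands in for the unstable Unique, and the two-word size of a trait-object pointer is what current rustc produces rather than something checked against a specification.

```rust
use std::error::Error;
use std::fmt;
use std::mem::{align_of_val, size_of, size_of_val};
use std::ptr::NonNull;

// Hypothetical error type: 24 bytes, 8-byte aligned.
#[allow(dead_code)]
#[derive(Debug)]
#[repr(align(8))]
struct Custom([u64; 3]);

impl fmt::Display for Custom {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "custom error")
    }
}

impl Error for Custom {}

// A non-null wrapper has the same size as the raw pointer it wraps, and the
// forbidden null value lets Option reuse it as its None representation.
fn nonnull_is_pointer_sized() -> bool {
    size_of::<NonNull<Custom>>() == size_of::<*mut Custom>()
        && size_of::<Option<NonNull<Custom>>>() == size_of::<*mut Custom>()
}

// A trait-object pointer carries a data pointer plus a vtable pointer.
fn fat_pointer_is_two_words() -> bool {
    size_of::<*mut dyn Error>() == 2 * size_of::<usize>()
}

// For a trait object, size and alignment are only known at run time.
fn dyn_layout(e: &dyn Error) -> (usize, usize) {
    (size_of_val(e), align_of_val(e))
}

fn main() {
    println!("NonNull is pointer sized: {}", nonnull_is_pointer_sized());
    println!("fat pointer is two words: {}", fat_pointer_is_two_words());
    println!("dyn layout: {:?}", dyn_layout(&Custom([0; 3])));
}
```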

@eddyb
Member

eddyb commented May 11, 2018

@glandium What happens if this use of tcx.mk_nil() is replaced with tcx.types.usize?

tcx.mk_imm_ref(tcx.types.re_static, tcx.mk_nil())

@glandium
Contributor Author

glandium commented May 11, 2018

@eddyb the ir changed, but perf didn't, it's still slower to compile.

@eddyb
Member

eddyb commented May 11, 2018

@glandium Wait, what's the difference on the IR then? It should be almost identical.

@glandium
Contributor Author

@eddyb

  • some {}* and i8* become i64*
  • some bitcasts go away
  • some dereferenceable(8) attributes are added
  • some nonnull attributes are removed

@eddyb
Member

eddyb commented May 12, 2018

@glandium No, I mean, the differences you listed in #50097 (comment).
The "before" and "after" should be much more similar given the patch in question.
Also, note that dereferenceable implies nonnull, so that's expected.

@glandium
Contributor Author

@eddyp unfortunately, they're not.

@eddyp
Contributor

eddyp commented May 12, 2018

@glandium I think you meant to @eddyb your reply

@glandium
Contributor Author

@eddyb (per irc request) https://gist.github.com/glandium/9baea047c63642fe8649d4d7ab690d13
The files prefixed with backout- are the output from a compiler with the last commit from this PR backed out.
The files with no prefix are the output from an unpatched compiler.
The files prefixed with patch- are the output from a compiler patched with the change from #50097 (comment)

Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
10 participants