Apparent 1000% regression in llvm phase of bootstrap #34831

nrc · 2016-07-15T04:26:56Z

Given that there has been no shouting, this could be due to something which only affects the perf server. Should still investigate.

cc @rust-lang/compiler

nagisa · 2016-07-17T21:23:13Z

Given #34891 (comment) #33890 may be the cause,

…=eddyb Run base::internalize_symbols() even for single-codegen-unit crates. The initial linkage-assignment (especially for closures) is a conservative one that makes some symbols more visible than they need to be. While this is not a correctness problem, it does force the LLVM inliner to be more conservative too, which results in poor performance. Once translation is based solely on MIR, it will be easier to also make the initial linkage assignment a better fitting one. Until then `internalize_symbols()` does a good job of preventing most performance regressions. This should solve the regressions reported in #34891 and maybe also those in #34831. As a side-effect, this will also solve most of the problematic cases described in #34793. Not reliably so, however. For that, we still need a solution like the one implement in #34830. cc @rust-lang/compiler

nikomatsakis · 2016-07-21T20:58:41Z

triage: P-high

nagisa · 2016-07-23T12:26:15Z

#34917 (might be a fix) landed 19 hours ago only.

alexcrichton · 2016-07-25T04:46:48Z

Copying over regression/I-slow tags from #34998, which I"m closing in favor of this.

@nikomatsakis as a P-high bug, do you know if someone would be suitable to assign to this?

nagisa · 2016-07-25T07:24:22Z

We still haven’t any measurements with nightly containing the PR I linked above. 2016-07-13 seems to be the last measurement done. @nrc is there anyway a manual run could be invoked?

pmarcelll · 2016-07-25T16:33:02Z

@nagisa I wanted to benchmark my small BF interpreter (it slowed down a bit after the 2016-06-24 nightly) but unfortunately there's no released nightly containing that PR, the bots are failing due to #35002. @alexcrichton is there a way to temporarily increase the timeout length on the bots? It would be a short-term solution and we would have a new nightly (hopefully).

alexcrichton · 2016-07-25T16:59:44Z

@pmarcelll yeah I'm trying to get a nightly out, not sure what's all been going on :(

brson · 2016-07-25T18:29:57Z

It would be a huge mistake to release another version of Rust without fixing this.

arielb1 · 2016-07-25T18:40:36Z

I could not see any regression with -Z orbit.

michaelwoerister · 2016-07-25T18:58:56Z

If the compile time regression is indeed related to the collector-driven trans changes, then my guess would be that the increased amount of IR pushes LLVM over the amount of memory fitting into the RAM of the perf.rust-lang machine or something similar. I don't think I've run into such a pronounced increase of bootstrapping time when building locally.

nikomatsakis · 2016-07-25T21:37:28Z

Has anyone tried to reproduce the regr? That is, comparing:

just before
just after
trunk

scottcarr · 2016-07-25T22:48:32Z

I don't reproduce the regression. I tried building hyper.0.5.0 which according to perf.rust-lang.org has a 350% slow down.

Without -Z orbit I get:

nightly-2016-06-23: 31.41s
nightly-2016-07-07: 31.61s
master (as of 2016-07-25): 29.40s

With -Z orbit I get:

nightly-2016-06-23: 33.40s
nightly-2016-07-07: 32.52s
master (as of 2016-07-25): 28.89s

nagisa · 2016-07-27T19:01:27Z

@nrc despite us having a 07-25 nightly build, the last perf run still seems to be at 07-13. Could one be triggered manually or something?

nrc · 2016-07-27T23:01:47Z

@nagisa frontend server needs a kick, then that should be updated. I've been fiddling with the backend (trying to get -Zorbit results) and so that is offline atm, should be fixed soon though.

nikomatsakis · 2016-07-28T00:08:01Z

Just FYI I think @edunham kicked the front end server.

On Jul 27, 2016 19:02, "Nick Cameron" notifications@github.com wrote:

@nagisa https://github.com/nagisa frontend server needs a kick, then
that should be updated. I've been fiddling with the backend (trying to get
-Zorbit results) and so that is offline atm, should be fixed soon though.

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#34831 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAJeZvgvVVzlTdYsyaNqDeWzgVskPpKTks5qZ-N5gaJpZM4JNEPW
.

nikomatsakis · 2016-07-28T13:59:59Z

So there is a new measurement now (week of July 23) and the 1000% drop does not seem to be improved. I was wondering about @michaelwoerister's hypothesis that memory usage might be to blame, so I loaded up this graph charting the memory usage. Indeed there is a huge spike on July 1 -- note that the perf drop was measured on July 2nd. Seems like a potential link!

nagisa · 2016-07-28T15:50:10Z

huge spike

Note that the graph axis are not numbered from 0, which is why the spike looks so big. The increase is a bit less than 5%, which is to be expected since we started book-keeping around a big amount of stuff.

nikomatsakis · 2016-07-28T15:58:52Z

Note that the graph axis are not numbered from 0, which is why the spike looks so big. The increase is a bit less than 5%, which is to be expected since we started book-keeping around a big amount of stuff.

Ha! Thanks. Oldest trick in the book, and I fell for it. ;)

nikomatsakis · 2016-07-28T16:46:30Z

OK, so, I just tried bootstrapping (make rustc-stage1 && time make all) both before/after the regresson locally and I did not find much of a difference. Before 33m, after 35m.

nikomatsakis · 2016-07-28T16:46:48Z

(Note that I was doing other things etc, but I figured a 1000% regression would be visible. :)

nikomatsakis · 2016-07-29T16:50:10Z

@nrc I wonder if it would make sense to re-run the benchmark results on the perf machine? I guess we've seen them repeat numerous times, but it seems like a phenomena that is just not being witnessed from other setups?

nrc · 2016-07-31T23:04:52Z

I've run the bootstrap on the perf machine a few times and it does really seem to take that long. I currently have an old commit running to see if something changed in the environment. Next step is to try a current commit but without the scripts in case something in the scripts is making it take super long.

eddyb · 2016-07-31T23:12:08Z

@nrc How many hours is "that long"?

nrc · 2016-08-01T04:05:04Z

@eddyb 48?

Aatch · 2016-08-01T05:30:45Z

@nrc sounds like it might be specific to the perf machine then. If full builds were regularly taking even a tenth of that to run, we'd know about it elsewhere.

nrc · 2016-08-01T19:44:03Z

@Aatch to be clear, that is a full run of the perf measurements, which is multiple bootstraps + 6 stage builds + 10 of each benchmark

eddyb · 2016-08-01T19:50:50Z

On IRC @nrc pointed out that building just stage2 jumped from 13m to 2h30m, so ~12x.

alexcrichton · 2016-08-04T16:22:02Z

It was pointed out on IRC that this was happening maybe because LLVM was being compiled in debug mode. With that in mind I now remember running into this problem on #34559 where the Android builder was randomly switching LLVM to debug mode, and the fix was to update CMake from 2.8 to something else.

As of a few days ago the in-tree LLVM requires CMake 3.4 to compile, so future contributors shouldn't run into this, but @nrc do you know what the cmake version is on the perf bot?

nrc · 2016-08-05T04:20:34Z

\o/ This was fixed with a complete nuke of the perf server and a cmake upgrade.

nrc added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jul 15, 2016

nagisa mentioned this issue Jul 17, 2016

Performance regression in nightly #34891

Closed

michaelwoerister mentioned this issue Jul 18, 2016

Run base::internalize_symbols() even for single-codegen-unit crates. #34899

Merged

rust-highfive added the P-high High priority label Jul 21, 2016

nagisa mentioned this issue Jul 24, 2016

Major compile time regression for regex crate #34998

Closed

alexcrichton added regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. I-slow Issue: Problems and improvements with respect to performance of generated code. labels Jul 25, 2016

Mark-Simulacrum mentioned this issue Aug 4, 2016

Port NodeJS backend into Rust. rust-lang/rustc-perf#22

Merged

brson added P-medium Medium priority A-infrastructure and removed P-high High priority regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. labels Aug 4, 2016

nrc closed this as completed Aug 5, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Apparent 1000% regression in llvm phase of bootstrap #34831

Apparent 1000% regression in llvm phase of bootstrap #34831

nrc commented Jul 15, 2016

nagisa commented Jul 17, 2016 •

edited

Loading

nikomatsakis commented Jul 21, 2016

nagisa commented Jul 23, 2016

alexcrichton commented Jul 25, 2016

nagisa commented Jul 25, 2016

pmarcelll commented Jul 25, 2016

alexcrichton commented Jul 25, 2016

brson commented Jul 25, 2016

arielb1 commented Jul 25, 2016

michaelwoerister commented Jul 25, 2016

nikomatsakis commented Jul 25, 2016

scottcarr commented Jul 25, 2016 •

edited

Loading

nagisa commented Jul 27, 2016

nrc commented Jul 27, 2016

nikomatsakis commented Jul 28, 2016

nikomatsakis commented Jul 28, 2016 •

edited

Loading

nagisa commented Jul 28, 2016

nikomatsakis commented Jul 28, 2016

nikomatsakis commented Jul 28, 2016

nikomatsakis commented Jul 28, 2016

nikomatsakis commented Jul 29, 2016

nrc commented Jul 31, 2016

eddyb commented Jul 31, 2016

nrc commented Aug 1, 2016

Aatch commented Aug 1, 2016

nrc commented Aug 1, 2016

eddyb commented Aug 1, 2016

alexcrichton commented Aug 4, 2016

nrc commented Aug 5, 2016

Apparent 1000% regression in llvm phase of bootstrap #34831

Apparent 1000% regression in llvm phase of bootstrap #34831

Comments

nrc commented Jul 15, 2016

nagisa commented Jul 17, 2016 • edited Loading

nikomatsakis commented Jul 21, 2016

nagisa commented Jul 23, 2016

alexcrichton commented Jul 25, 2016

nagisa commented Jul 25, 2016

pmarcelll commented Jul 25, 2016

alexcrichton commented Jul 25, 2016

brson commented Jul 25, 2016

arielb1 commented Jul 25, 2016

michaelwoerister commented Jul 25, 2016

nikomatsakis commented Jul 25, 2016

scottcarr commented Jul 25, 2016 • edited Loading

nagisa commented Jul 27, 2016

nrc commented Jul 27, 2016

nikomatsakis commented Jul 28, 2016

nikomatsakis commented Jul 28, 2016 • edited Loading

nagisa commented Jul 28, 2016

nikomatsakis commented Jul 28, 2016

nikomatsakis commented Jul 28, 2016

nikomatsakis commented Jul 28, 2016

nikomatsakis commented Jul 29, 2016

nrc commented Jul 31, 2016

eddyb commented Jul 31, 2016

nrc commented Aug 1, 2016

Aatch commented Aug 1, 2016

nrc commented Aug 1, 2016

eddyb commented Aug 1, 2016

alexcrichton commented Aug 4, 2016

nrc commented Aug 5, 2016

nagisa commented Jul 17, 2016 •

edited

Loading

scottcarr commented Jul 25, 2016 •

edited

Loading

nikomatsakis commented Jul 28, 2016 •

edited

Loading