Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apparent 1000% regression in llvm phase of bootstrap #34831

Closed
nrc opened this issue Jul 15, 2016 · 29 comments
Closed

Apparent 1000% regression in llvm phase of bootstrap #34831

nrc opened this issue Jul 15, 2016 · 29 comments
Labels
I-slow Issue: Problems and improvements with respect to performance of generated code. P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@nrc
Copy link
Member

nrc commented Jul 15, 2016

See perf

Given that there has been no shouting, this could be due to something which only affects the perf server. Should still investigate.

cc @rust-lang/compiler

@nrc nrc added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jul 15, 2016
@nagisa
Copy link
Member

nagisa commented Jul 17, 2016

Given #34891 (comment) #33890 may be the cause,

bors added a commit that referenced this issue Jul 18, 2016
…=eddyb

Run base::internalize_symbols() even for single-codegen-unit crates.

The initial linkage-assignment (especially for closures) is a conservative one that makes some symbols more visible than they need to be. While this is not a correctness problem, it does force the LLVM inliner to be more conservative too, which results in poor performance. Once translation is based solely on MIR, it will be easier to also make the initial linkage assignment a better fitting one. Until then `internalize_symbols()` does a good job of preventing most performance regressions.

This should solve the regressions reported in #34891 and maybe also those in #34831.

As a side-effect, this will also solve most of the problematic cases described in #34793. Not reliably so, however. For that, we still need a solution like the one implement in #34830.

cc @rust-lang/compiler
@nikomatsakis
Copy link
Contributor

triage: P-high

@rust-highfive rust-highfive added the P-high High priority label Jul 21, 2016
@nagisa
Copy link
Member

nagisa commented Jul 23, 2016

#34917 (might be a fix) landed 19 hours ago only.

@alexcrichton alexcrichton added regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. I-slow Issue: Problems and improvements with respect to performance of generated code. labels Jul 25, 2016
@alexcrichton
Copy link
Member

Copying over regression/I-slow tags from #34998, which I"m closing in favor of this.

@nikomatsakis as a P-high bug, do you know if someone would be suitable to assign to this?

@nagisa
Copy link
Member

nagisa commented Jul 25, 2016

We still haven’t any measurements with nightly containing the PR I linked above. 2016-07-13 seems to be the last measurement done. @nrc is there anyway a manual run could be invoked?

@pmarcelll
Copy link
Contributor

@nagisa I wanted to benchmark my small BF interpreter (it slowed down a bit after the 2016-06-24 nightly) but unfortunately there's no released nightly containing that PR, the bots are failing due to #35002. @alexcrichton is there a way to temporarily increase the timeout length on the bots? It would be a short-term solution and we would have a new nightly (hopefully).

@alexcrichton
Copy link
Member

@pmarcelll yeah I'm trying to get a nightly out, not sure what's all been going on :(

@brson
Copy link
Contributor

brson commented Jul 25, 2016

It would be a huge mistake to release another version of Rust without fixing this.

@arielb1
Copy link
Contributor

arielb1 commented Jul 25, 2016

I could not see any regression with -Z orbit.

@michaelwoerister
Copy link
Member

If the compile time regression is indeed related to the collector-driven trans changes, then my guess would be that the increased amount of IR pushes LLVM over the amount of memory fitting into the RAM of the perf.rust-lang machine or something similar. I don't think I've run into such a pronounced increase of bootstrapping time when building locally.

@nikomatsakis
Copy link
Contributor

Has anyone tried to reproduce the regr? That is, comparing:

  • just before
  • just after
  • trunk

@scottcarr
Copy link
Contributor

scottcarr commented Jul 25, 2016

I don't reproduce the regression. I tried building hyper.0.5.0 which according to perf.rust-lang.org has a 350% slow down.

Without -Z orbit I get:

nightly-2016-06-23: 31.41s
nightly-2016-07-07: 31.61s
master (as of 2016-07-25): 29.40s

With -Z orbit I get:

nightly-2016-06-23: 33.40s
nightly-2016-07-07: 32.52s
master (as of 2016-07-25): 28.89s

@nagisa
Copy link
Member

nagisa commented Jul 27, 2016

@nrc despite us having a 07-25 nightly build, the last perf run still seems to be at 07-13. Could one be triggered manually or something?

@nrc
Copy link
Member Author

nrc commented Jul 27, 2016

@nagisa frontend server needs a kick, then that should be updated. I've been fiddling with the backend (trying to get -Zorbit results) and so that is offline atm, should be fixed soon though.

@nikomatsakis
Copy link
Contributor

Just FYI I think @edunham kicked the front end server.

On Jul 27, 2016 19:02, "Nick Cameron" notifications@github.com wrote:

@nagisa https://github.com/nagisa frontend server needs a kick, then
that should be updated. I've been fiddling with the backend (trying to get
-Zorbit results) and so that is offline atm, should be fixed soon though.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#34831 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAJeZvgvVVzlTdYsyaNqDeWzgVskPpKTks5qZ-N5gaJpZM4JNEPW
.

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Jul 28, 2016

So there is a new measurement now (week of July 23) and the 1000% drop does not seem to be improved. I was wondering about @michaelwoerister's hypothesis that memory usage might be to blame, so I loaded up this graph charting the memory usage. Indeed there is a huge spike on July 1 -- note that the perf drop was measured on July 2nd. Seems like a potential link!

@nagisa
Copy link
Member

nagisa commented Jul 28, 2016

huge spike

Note that the graph axis are not numbered from 0, which is why the spike looks so big. The increase is a bit less than 5%, which is to be expected since we started book-keeping around a big amount of stuff.

@nikomatsakis
Copy link
Contributor

Note that the graph axis are not numbered from 0, which is why the spike looks so big. The increase is a bit less than 5%, which is to be expected since we started book-keeping around a big amount of stuff.

Ha! Thanks. Oldest trick in the book, and I fell for it. ;)

@nikomatsakis
Copy link
Contributor

OK, so, I just tried bootstrapping (make rustc-stage1 && time make all) both before/after the regresson locally and I did not find much of a difference. Before 33m, after 35m.

@nikomatsakis
Copy link
Contributor

(Note that I was doing other things etc, but I figured a 1000% regression would be visible. :)

@nikomatsakis
Copy link
Contributor

@nrc I wonder if it would make sense to re-run the benchmark results on the perf machine? I guess we've seen them repeat numerous times, but it seems like a phenomena that is just not being witnessed from other setups?

@nrc
Copy link
Member Author

nrc commented Jul 31, 2016

I've run the bootstrap on the perf machine a few times and it does really seem to take that long. I currently have an old commit running to see if something changed in the environment. Next step is to try a current commit but without the scripts in case something in the scripts is making it take super long.

@eddyb
Copy link
Member

eddyb commented Jul 31, 2016

@nrc How many hours is "that long"?

@nrc
Copy link
Member Author

nrc commented Aug 1, 2016

@eddyb 48?

@Aatch
Copy link
Contributor

Aatch commented Aug 1, 2016

@nrc sounds like it might be specific to the perf machine then. If full builds were regularly taking even a tenth of that to run, we'd know about it elsewhere.

@nrc
Copy link
Member Author

nrc commented Aug 1, 2016

@Aatch to be clear, that is a full run of the perf measurements, which is multiple bootstraps + 6 stage builds + 10 of each benchmark

@eddyb
Copy link
Member

eddyb commented Aug 1, 2016

On IRC @nrc pointed out that building just stage2 jumped from 13m to 2h30m, so ~12x.

@brson brson added P-medium Medium priority A-infrastructure and removed P-high High priority regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. labels Aug 4, 2016
@alexcrichton
Copy link
Member

It was pointed out on IRC that this was happening maybe because LLVM was being compiled in debug mode. With that in mind I now remember running into this problem on #34559 where the Android builder was randomly switching LLVM to debug mode, and the fix was to update CMake from 2.8 to something else.

As of a few days ago the in-tree LLVM requires CMake 3.4 to compile, so future contributors shouldn't run into this, but @nrc do you know what the cmake version is on the perf bot?

@nrc
Copy link
Member Author

nrc commented Aug 5, 2016

\o/ This was fixed with a complete nuke of the perf server and a cmake upgrade.

@nrc nrc closed this as completed Aug 5, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-slow Issue: Problems and improvements with respect to performance of generated code. P-medium Medium priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests