
Switch stdenv to GCC 4.8.0. #445

Merged

peti merged 2 commits into NixOS:stdenv-updates from switch-stdenv-to-gcc-4.8.0 on Apr 10, 2013

Conversation

@peti (Member) commented Apr 4, 2013

I would suggest that our next stdenv update goes to gcc 4.8.x directly, skipping 4.7.x.

@vcunat (Member) commented Apr 4, 2013

Well, after reading some references I believe it's a very good idea. Builds broken by 4.8 should be only a tiny fraction of what 4.7 broke (except for packages that use -Werror, but those are rare; a sketch of the typical workaround follows the links below).

http://gcc.gnu.org/gcc-4.7/porting_to.html
http://gcc.gnu.org/gcc-4.8/porting_to.html
https://lists.fedoraproject.org/pipermail/devel/2013-January/175876.html
http://openbenchmarking.org/result/1303255-FO-GCC48INTE29
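
The usual fix for -Werror breakage in a nixpkgs expression is to demote the new warnings back to non-errors. A minimal hypothetical sketch (the package name, URL, and hash are placeholders, and stdenv/fetchurl are assumed to be in scope; some build systems want --disable-werror or a small patch instead):

stdenv.mkDerivation {
  name = "example-1.0";  # placeholder for a package that trips over -Werror with gcc 4.8
  src = fetchurl {
    url = "http://example.org/example-1.0.tar.gz";  # hypothetical URL
    sha256 = "0000000000000000000000000000000000000000000000000000";  # placeholder hash
  };
  # The cc-wrapper appends this to every compiler invocation, turning
  # the new gcc 4.8 diagnostics back into plain warnings.
  NIX_CFLAGS_COMPILE = "-Wno-error";
}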

@peti (Member, Author) commented Apr 6, 2013

Check out the build errors in this branch at http://hydra.cryp.to:8080/jobset/nixpkgs/switch-stdenv-to-gcc-48/errors. GCC 4.8.0 does pretty well, IMHO. Packages using -Werror cause trouble, as expected, but those are easy to fix. Other build issues are nastier: aterm28, for example, segfaults in the test suite when built with GCC 4.8, and the build itself produces some warnings that look serious.

Anyway, you can get binaries for x86_64-linux from that branch by subscribing to its Hydra channel:

nix-channel --add http://hydra.cryp.to:8080/jobset/nixpkgs/switch-stdenv-to-gcc-48/channel/latest

peti closed this Apr 6, 2013
peti reopened this Apr 6, 2013
@peti (Member, Author) commented Apr 6, 2013

Hmpf, I didn't intend to close the request. Sorry, wrong button. Re-opening ...

@vcunat (Member) commented Apr 6, 2013

The gcj job doesn't work on stdenv-updates either. I believe all pre-gcc-4.7 versions are broken, due to the ppl update, I think. I tried just copying the expression to use gcj-4.7 by default, which built fine, but then I hit a problem in openjdk, where -lgcj couldn't find the library (I have no clue why).

I think we should:

  • switch other languages to 4.7 or 4.8 by default, and
  • resurrect older library versions, as I suppose some people will still be interested in pre-4.7 gcc versions (see the sketch below).
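
A rough sketch of what resurrecting older compilers could look like in all-packages.nix (attribute names and paths are illustrative, not the actual expressions):

gcc46 = callPackage ../development/compilers/gcc/4.6 { };
gcc47 = callPackage ../development/compilers/gcc/4.7 { };
gcc48 = callPackage ../development/compilers/gcc/4.8 { };

# Whatever "gcc" points at is what stdenv and most packages use by
# default; anything needing an older compiler can depend on gcc46 or
# gcc47 explicitly.
gcc = gcc48;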

@vcunat (Member) commented Apr 6, 2013

I also think we should use the profiled version of gcc. The only drawback is that, as the makefiles' comments note, this forces a single-threaded build of gcc -- for some really silly reason, the profiling output files could otherwise overwrite each other or the like... Nevertheless I think it's worth it; gcc will only get rebuilt rarely.

@peti (Member, Author) commented Apr 6, 2013

@vcunat, I agree that we should use a profiled version of gcc. I don't want to make that change in this particular pull request, though, because then I'll have to rebuild the entire stdenv one more time. Personally, I'd prefer to do that once we've committed to gcc 4.8.x as our new stdenv compiler of choice.

@vcunat (Member) commented Apr 6, 2013

@peti: Yes, I thought that might be the reason. I'm building a profiled stdenv with gcc48 right now, to at least try it out a bit.

@vcunat (Member) commented Apr 7, 2013

I tried to build and run some packages: firefox, evince, vlc, and lyx. No changes were needed and I encountered no errors (using just plain stdenv-updates with gcc48+PGO). Looks very good.

@viric (Member) commented Apr 7, 2013

On Sat, Apr 06, 2013 at 05:15:12AM -0700, Vladimír Čunát wrote:

> I also think we should use the profiled version of gcc. The only drawback is that with their makefiles this forces us to single-thread build of gcc [...]

I'd prefer to keep the profiled gcc to x86 only. Do you agree?

@vcunat (Member) commented Apr 7, 2013

Well, I've got no knowledge of PGO on other hardware, so I can't really decide that (and therefore I'm certainly not against it).

@viric (Member) commented Apr 7, 2013

On Sun, Apr 07, 2013 at 12:40:58PM -0700, Vladimír Čunát wrote:

> Well, I've got no knowledge about PGO on other HW, so I can't really decide that [...]

It just takes a long time to build on the slow hardware I'm talking about. :) It requires more memory, too.

peti added 2 commits April 8, 2013 01:24
The test t-lucnum_ui fails (on Linux/x86_64) when built with GCC 4.8.
Newer versions of GMP don't have that issue anymore.
@vcunat (Member) commented Apr 8, 2013

I see it's in stdenv-updates now. So, according to @viric, should we use something like this?

profiledCompiler = with stdenv; (isi686 || isx86_64);
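
For context, a simplified sketch of how such a flag is consumed inside the gcc expression (the real code in pkgs/development/compilers/gcc is more involved, so treat this as an assumption about its shape): GCC's own profiledbootstrap make target performs the profiled three-stage build.

# Select GCC's profiled three-stage build only when requested.
buildFlags = if profiledCompiler then "profiledbootstrap" else "bootstrap";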

BTW, once we stabilize stdenv, we'll need to widen the set of packages we compile. I'm sure there are still many build errors (currently not shown) left over from the gcc-4.7 step. IMHO we'll want to fix most of those before merging to master (usually easy via an update or a patch from some other distro).

@peti (Member, Author) commented Apr 9, 2013

I would like to merge this change soonish if possible. Does anyone see a compelling reason not to switch stdenv to gcc 4.8.x? If you do, please let me know!

peti added a commit that referenced this pull request Apr 10, 2013
peti merged commit 7655801 into NixOS:stdenv-updates Apr 10, 2013
peti deleted the switch-stdenv-to-gcc-4.8.0 branch April 10, 2013 06:31
@vcunat (Member) commented Apr 10, 2013

And what about the profiling? Shall I commit the above change (PGO iff on x86*)?

@peti (Member, Author) commented Apr 10, 2013

Yes, please do! I too would like to have PGO enabled.

vcunat added a commit that referenced this pull request Apr 10, 2013
@vcunat (Member) commented Apr 10, 2013

We didn't catch it, so hydra will build both, but fortunately the jobset is quite small now.

@peti (Member, Author) commented Apr 10, 2013

Does PGO work with "enableParallelBuilding = true"?

@vcunat (Member) commented Apr 10, 2013

Probably not. I re-checked the gcc-4.8 documentation and the comment is still there (though it really seems like a silly reason; IMHO only the makefiles would need fixing).
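
If the parallel profiled build really is unsafe, the conservative option would be to couple the two settings, as in this sketch (reusing the profiledCompiler flag from the snippet above):

# Build gcc single-threaded only when profiling is enabled, since its
# makefiles warn that a parallel profiledbootstrap may clobber the
# profile data files.
enableParallelBuilding = !profiledCompiler;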

@vcunat (Member) commented Apr 10, 2013

I feel that's the biggest disadvantage of PGO at the moment, as it further slows down gcc's build. OTOH we should only need to rebuild it very rarely.

@alexanderkjeldaas (Contributor)

I want to turn off PGO. We cannot get a deterministic result for gcc itself with PGO because the timing information is slightly random.

@shlevy (Member) commented Apr 14, 2014

I don't think we should turn off PGO by default without good profiling information verifying that there's no significant loss in performance. Users who absolutely need deterministic builds can of course use a non-profiled version.

@peti (Member, Author) commented Apr 14, 2014

PGO achieves significant performance gains. What advantages would deterministic builds offer to our users that outweigh this loss?

@vcunat (Member) commented Apr 14, 2014

Hmm, I thought that generation of profile information was deterministic (assuming we run a deterministic program). In the docs I see nothing that confirms either possibility: http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/Optimize-Options.html#Optimize-Options

I found some data that confirm the speedup is likely significant:
http://gcc.gnu.org/ml/gcc/2013-03/msg00210.html
Just imagine what a difference this makes to Hydra.

Note: while the gcc instances might differ, the results produced by each should certainly be binary equal, so the situation doesn't seem too bad to me.

@alexanderkjeldaas (Contributor)

On Mon, Apr 14, 2014 at 4:15 PM, Vladimír Čunát wrote:

> Hmm, I thought that generation of profile information was deterministic (assuming that we run a deterministic program). [...]

I don't think it is worth sweating over a 10% performance issue. Deterministic builds have architectural advantages that can give orders-of-magnitude faster builds, both by using ccache and by using untrusted/semi-trusted machines.

ccache itself can give an order of magnitude faster builds. Already, with the gcc-wrapper changes to fix __DATE__ and __TIME__, hydra can use ccache extensively together with affinity scheduling of tasks (no nfs).

With deterministic builds, semi-trusted machines can also be added to Hydra. These machines can build the leaves of the dependency DAG, and hydra can randomly check that the results are deterministic.

These two changes could in theory boost hydra performance by an order of magnitude, not just 10%.

Now that does not preclude having PGO in gcc, but in light of those possibilities, I think it is better to take the security win of having a deterministic gcc and only get a 90x instead of a 100x build speedup.

The security win is that when gcc is deterministic, no backdoor can be introduced into gcc unless it is already there. A backdoor in gcc can compromise everything in the system, by compromising key components such as the sha256sum calculation and putting various backdoors into the system.

Without a deterministic gcc, no security advantage is gained from deterministic builds, because the determinism is only checked by tools that can themselves be compromised. Thus, with a non-deterministic gcc, there is no guarantee that you and I have the same binary even though sha256 says so. Basically there are no guarantees at all, which means that hydra must be 100% trusted.

@edolstra (Member)

Unfortunately, Nix/Hydra won't be using ccache any time soon, because it's impure (the cache is a global shared variable, i.e. stateful)...

@alexanderkjeldaas (Contributor)

Being able to use ccache with hydra seems like something that should become possible once builds are repeatable. All object results from ccache can in theory be converted into fixed-output derivations in nix, so it is pure.
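
A sketch of the fixed-output idea mentioned here: a derivation whose builder may do impure things (consult a compile cache, talk to the network), but whose output Nix pins by content hash and therefore treats as pure. The name, hash, and build step below are placeholders:

stdenv.mkDerivation {
  name = "example-cached-object";  # hypothetical cached compilation result
  # The outputHash* attributes make this a fixed-output derivation:
  # the build may be impure, but Nix accepts the output only if it
  # matches the declared hash.
  outputHashAlgo = "sha256";
  outputHashMode = "flat";
  outputHash = "0000000000000000000000000000000000000000000000000000";  # placeholder
  buildCommand = ''
    # An impure step (e.g. fetching from a cache) would go here.
    echo "object code" > $out
  '';
}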

In my usage of NixOS, I must have deterministic builds because I am using trusted computing. I cannot introduce any new software into my environment without knowing that the compiler is already trusted, and without a known checksum for the compiler that just isn't possible.

It is possible to get around this by creating a NixOS distribution that retains stdenvLinuxBoot4 and has all the other packages that are deterministic.

Then, when gcc is actually needed because a derivation is compiled locally, the real stdenv is bootstrapped from stdenvLinuxBoot4, resulting in a gcc with PGO. It is pretty hairy, but it might be doable.

Removing PGO from stdenv makes this much simpler and cleaner.

On Mon, Apr 14, 2014 at 4:48 PM, Eelco Dolstra wrote:

> Unfortunately, Nix/Hydra won't be using ccache any time soon, because it's impure (the cache is a global shared variable, i.e. stateful)...

@shlevy (Member) commented Apr 14, 2014

Theoretically, with a Make implementation that runs each command as a derivation, plus recursive nix, we might be able to get a more general and better-grounded version of ccache (and distcc). In the meantime, though, it's not going to happen.

I'd rather not trade a demonstrated 10% increase for a theoretically possible 10x one. If you can show an actually implemented solution that has real performance gains and depends on bit-perfect determinism, fine. Until then, if we can't make PGO deterministic, then users who need determinism will have to have a separate stdenv IMO.

@viric (Member) commented Apr 14, 2014

On Mon, Apr 14, 2014 at 07:48:15AM -0700, Eelco Dolstra wrote:

> Unfortunately, Nix/Hydra won't be using ccache any time soon, because it's impure (the cache is a global shared variable, i.e. stateful)...

and so is the nix store, hashes linking inputs and outputs, no? :)

It should be like some kind of controlled impurity.

@vcunat (Member) commented Apr 14, 2014

I don't think caching was meant to be an important point in this discussion. Anyway, my view is below.

Ccache-like caching IMO doesn't really need binary repeatability; the current "semantic" repeatability of gcc is enough for it (i.e. the generated code does the same thing and has the same ABI). PGO by definition must not affect either of these properties.

BTW, I think we can do similar "memoization" much better than ccache (certainly more efficiently), because we have much better control than usual over what the inputs of each compilation command are (most of the inputs will be on immutable paths, etc.).

@alexanderkjeldaas (Contributor)

On Mon, Apr 14, 2014 at 5:37 PM, Shea Levy wrote:

> I'd rather not trade a demonstrated 10% increase for a theoretically possible 10x one. [...] if we can't make PGO deterministic then users who need determinism will have to have a separate stdenv IMO.

This is too simplified a view of the trade-off. It is deterministic builds that reduce the attack surface enough that a "completely open" build system becomes possible. Deterministic builds are an enabler for lots of things that increase security and build performance.

This "sacrifice" is similar to the performance penalty for using encryption. It only matters under attack.

To me it is hard to understand how a 10% increase in build performance is worth sacrificing the added security for the whole distribution. This performance difference is smaller than the one between -O2 and -O3, while determinism enables a completely different level of security for the distribution.

Asking people to use their own stdenv is not a solution in this case, because it is the trust in the distribution itself that is being damaged. The distributed ISO contains a non-deterministic gcc, thus the ISO cannot be verified by a third party.

As I mentioned, it is possible to overcome this by requiring every end user to compile their own 3-stage profiledbootstrap version of gcc prior to compiling anything locally.

This means that what is distributed, the ISO that is downloaded, is deterministic, but it includes only a bootstrap stdenv that is used to create the final stdenv, which has a gcc with PGO.

That would add another stage to the stdenv bootstrap process, but building a non-PGO gcc is fairly fast compared to the PGO one.

If that's an acceptable road to follow, I'd like to know.
