
Switch stdenv to GCC 4.8.0. #445

Merged

peti merged 2 commits into NixOS:stdenv-updates from switch-stdenv-to-gcc-4.8.0 on Apr 10, 2013

Conversation

@peti (Member) commented Apr 4, 2013

I would suggest that our next stdenv update goes to gcc 4.8.x directly, skipping 4.7.x.

@vcunat (Member) commented Apr 4, 2013

Well, after reading some references I believe it's a very good idea. Builds broken by 4.8 should be only a tiny fraction of what 4.7 broke (except for packages that use -Werror, but those are rare; a sketch of the typical workaround follows the links below).

http://gcc.gnu.org/gcc-4.7/porting_to.html
http://gcc.gnu.org/gcc-4.8/porting_to.html
https://lists.fedoraproject.org/pipermail/devel/2013-January/175876.html
http://openbenchmarking.org/result/1303255-FO-GCC48INTE29
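
The usual fix for -Werror breakage in a nixpkgs expression is to demote the new warnings back to non-errors. A minimal hypothetical sketch (the package name, URL, and hash are placeholders, and stdenv/fetchurl are assumed to be in scope; some build systems want --disable-werror or a small patch instead):

stdenv.mkDerivation {
  name = "example-1.0";  # placeholder for a package that trips over -Werror with gcc 4.8
  src = fetchurl {
    url = "http://example.org/example-1.0.tar.gz";  # hypothetical URL
    sha256 = "0000000000000000000000000000000000000000000000000000";  # placeholder hash
  };
  # The cc-wrapper appends this to every compiler invocation, turning
  # the new gcc 4.8 diagnostics back into plain warnings.
  NIX_CFLAGS_COMPILE = "-Wno-error";
}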

@peti (Member, Author) commented Apr 6, 2013

Check out the build errors in this branch at http://hydra.cryp.to:8080/jobset/nixpkgs/switch-stdenv-to-gcc-48/errors. GCC 4.8.0 does pretty well, IMHO. Packages using -Werror cause trouble, as expected, but those are easy to fix. Other build issues are nastier: aterm28, for example, segfaults in the test suite when built with GCC 4.8, and the build itself produces some warnings that look serious.

Anyway, you can get binaries for x86_64-linux from that branch by subscribing to its Hydra channel:

nix-channel --add http://hydra.cryp.to:8080/jobset/nixpkgs/switch-stdenv-to-gcc-48/channel/latest

peti closed this Apr 6, 2013
peti reopened this Apr 6, 2013
@peti (Member, Author) commented Apr 6, 2013

Hmpf, I didn't intend to close the request. Sorry, wrong button. Re-opening ...

@vcunat (Member) commented Apr 6, 2013

The gcj job doesn't work on stdenv-updates either. I believe all pre-gcc-4.7 versions are broken, due to the ppl update, I think. I tried just copying the expression to use gcj-4.7 by default, which built fine, but then I hit a problem in openjdk, where -lgcj couldn't find the library (I have no clue why).

I think we should:

  • switch other languages to 4.7 or 4.8 by default, and
  • resurrect older library versions, as I suppose some people will still be interested in pre-4.7 gcc versions (see the sketch below).
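
A rough sketch of what resurrecting older compilers could look like in all-packages.nix (attribute names and paths are illustrative, not the actual expressions):

gcc46 = callPackage ../development/compilers/gcc/4.6 { };
gcc47 = callPackage ../development/compilers/gcc/4.7 { };
gcc48 = callPackage ../development/compilers/gcc/4.8 { };

# Whatever "gcc" points at is what stdenv and most packages use by
# default; anything needing an older compiler can depend on gcc46 or
# gcc47 explicitly.
gcc = gcc48;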

@vcunat (Member) commented Apr 6, 2013

I also think we should use the profiled version of gcc. The only drawback is that, as the makefiles' comments note, this forces a single-threaded build of gcc -- for some really silly reason, the profiling output files could otherwise overwrite each other or the like... Nevertheless I think it's worth it; gcc will only get rebuilt rarely.

@peti (Member, Author) commented Apr 6, 2013

@vcunat, I agree that we should use a profiled version of gcc. I don't want to make that change in this particular pull request, though, because then I'll have to rebuild the entire stdenv one more time. Personally, I'd prefer to do that once we've committed to gcc 4.8.x as our new stdenv compiler of choice.

@vcunat (Member) commented Apr 6, 2013

@peti: Yes, I thought that might be the reason. I'm building a profiled stdenv with gcc48 right now, to at least try it out a bit.

@vcunat (Member) commented Apr 7, 2013

I tried to build and run some packages: firefox, evince, vlc, and lyx. No changes were needed and I encountered no errors (using just plain stdenv-updates with gcc48+PGO). Looks very good.

@viric (Member) commented Apr 7, 2013

On Sat, Apr 06, 2013 at 05:15:12AM -0700, Vladimír Čunát wrote:

> I also think we should use the profiled version of gcc. The only drawback is that with their makefiles this forces us to single-thread build of gcc [...]

I'd prefer to keep the profiled gcc to x86 only. Do you agree?

@vcunat (Member) commented Apr 7, 2013

Well, I've got no knowledge of PGO on other hardware, so I can't really decide that (and therefore I'm certainly not against it).

@viric (Member) commented Apr 7, 2013

On Sun, Apr 07, 2013 at 12:40:58PM -0700, Vladimír Čunát wrote:

> Well, I've got no knowledge about PGO on other HW, so I can't really decide that [...]

It just takes a long time to build on the slow hardware I'm talking about. :) It requires more memory, too.

peti added 2 commits April 8, 2013 01:24
The test t-lucnum_ui fails (on Linux/x86_64) when built with GCC 4.8.
Newer versions of GMP don't have that issue anymore.
@vcunat (Member) commented Apr 8, 2013

I see it's in stdenv-updates now. So, according to @viric, should we use something like this?

profiledCompiler = with stdenv; (isi686 || isx86_64);
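
For context, a simplified sketch of how such a flag is consumed inside the gcc expression (the real code in pkgs/development/compilers/gcc is more involved, so treat this as an assumption about its shape): GCC's own profiledbootstrap make target performs the profiled three-stage build.

# Select GCC's profiled three-stage build only when requested.
buildFlags = if profiledCompiler then "profiledbootstrap" else "bootstrap";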

BTW, once we stabilize stdenv, we'll need to widen the set of packages we compile. I'm sure there are still many build errors (currently not shown) left over from the gcc-4.7 step. IMHO we'll want to fix most of those before merging to master (usually easy via an update or a patch from some other distro).

@peti (Member, Author) commented Apr 9, 2013

I would like to merge this change soonish if possible. Does anyone see a compelling reason not to switch stdenv to gcc 4.8.x? If you do, please let me know!

peti added a commit that referenced this pull request Apr 10, 2013
peti merged commit 7655801 into NixOS:stdenv-updates Apr 10, 2013
peti deleted the switch-stdenv-to-gcc-4.8.0 branch April 10, 2013 06:31
@vcunat (Member) commented Apr 10, 2013

And what about the profiling? Shall I commit the above change (PGO iff on x86*)?

@peti (Member, Author) commented Apr 10, 2013

Yes, please do! I too would like to have PGO enabled.

vcunat added a commit that referenced this pull request Apr 10, 2013
@vcunat (Member) commented Apr 10, 2013

We didn't catch it, so hydra will build both, but fortunately the jobset is quite small now.

@peti (Member, Author) commented Apr 10, 2013

Does PGO work with "enableParallelBuilding = true"?

@vcunat (Member) commented Apr 10, 2013

Probably not. I re-checked the gcc-4.8 documentation and the comment is still there (though it really seems like a silly reason; IMHO only the makefiles would need fixing).
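
If the parallel profiled build really is unsafe, the conservative option would be to couple the two settings, as in this sketch (reusing the profiledCompiler flag from the snippet above):

# Build gcc single-threaded only when profiling is enabled, since its
# makefiles warn that a parallel profiledbootstrap may clobber the
# profile data files.
enableParallelBuilding = !profiledCompiler;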

@vcunat (Member) commented Apr 10, 2013

I feel that's the biggest disadvantage of PGO at the moment, as it further slows down gcc's build. OTOH we should only need to rebuild it very rarely.

@alexanderkjeldaas (Contributor)

I want to turn off PGO. We cannot get a deterministic result for gcc itself with PGO because the timing information is slightly random.

@shlevy (Member) commented Apr 14, 2014

I don't think we should turn off PGO by default without good profiling information verifying that there's no significant loss in performance. Users who absolutely need deterministic builds can of course use a non-profiled version.

@peti (Member, Author) commented Apr 14, 2014

PGO achieves significant performance gains. What advantages would deterministic builds offer to our users that outweigh this loss?

@vcunat (Member) commented Apr 14, 2014

Hmm, I thought that generation of profile information was deterministic (assuming we run a deterministic program). In the docs I see nothing that confirms either possibility: http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/Optimize-Options.html#Optimize-Options

I found some data that confirm the speedup is likely significant:
http://gcc.gnu.org/ml/gcc/2013-03/msg00210.html
Just imagine what a difference this makes to Hydra.

Note: while the gcc instances might differ, the results produced by each should certainly be binary equal, so the situation doesn't seem too bad to me.

@alexanderkjeldaas (Contributor)

On Mon, Apr 14, 2014 at 4:15 PM, Vladimír Čunát wrote:

> Hmm, I thought that generation of profile information was deterministic (assuming that we run a deterministic program). [...]

I don't think it is worth sweating over a 10% performance issue. Deterministic builds have architectural advantages that can give orders-of-magnitude faster builds, both by using ccache and by using untrusted/semi-trusted machines.

ccache itself can give an order of magnitude faster builds. Already, with the gcc-wrapper changes to fix __DATE__ and __TIME__, hydra can use ccache extensively together with affinity scheduling of tasks (no nfs).

With deterministic builds, semi-trusted machines can also be added to Hydra. These machines can build the leaves of the dependency DAG, and hydra can randomly check that the results are deterministic.

These two changes could in theory boost hydra performance by an order of magnitude, not just 10%.

Now that does not preclude having PGO in gcc, but in light of those possibilities, I think it is better to take the security win of having a deterministic gcc and only get a 90x instead of a 100x build speedup.

The security win is that when gcc is deterministic, no backdoor can be introduced into gcc unless it is already there. A backdoor in gcc can compromise everything in the system, by compromising key components such as the sha256sum calculation and putting various backdoors into the system.

Without a deterministic gcc, no security advantage is gained from deterministic builds, because the determinism is only checked by tools that can themselves be compromised. Thus, with a non-deterministic gcc, there is no guarantee that you and I have the same binary even though sha256 says so. Basically there are no guarantees at all, which means that hydra must be 100% trusted.

@edolstra (Member)

Unfortunately, Nix/Hydra won't be using ccache any time soon, because it's impure (the cache is a global shared variable, i.e. stateful)...

@alexanderkjeldaas (Contributor)

Being able to use ccache with hydra seems like something that should become possible once builds are repeatable. All object results from ccache can in theory be converted into fixed-output derivations in nix, so it is pure.
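
A sketch of the fixed-output idea mentioned here: a derivation whose builder may do impure things (consult a compile cache, talk to the network), but whose output Nix pins by content hash and therefore treats as pure. The name, hash, and build step below are placeholders:

stdenv.mkDerivation {
  name = "example-cached-object";  # hypothetical cached compilation result
  # The outputHash* attributes make this a fixed-output derivation:
  # the build may be impure, but Nix accepts the output only if it
  # matches the declared hash.
  outputHashAlgo = "sha256";
  outputHashMode = "flat";
  outputHash = "0000000000000000000000000000000000000000000000000000";  # placeholder
  buildCommand = ''
    # An impure step (e.g. fetching from a cache) would go here.
    echo "object code" > $out
  '';
}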

In my usage of NixOS, I must have deterministic builds because I am using trusted computing. I cannot introduce any new software into my environment without knowing that the compiler is already trusted, and without a known checksum for the compiler that just isn't possible.

It is possible to get around this by creating a NixOS distribution that retains stdenvLinuxBoot4 and has all the other packages that are deterministic.

Then, when gcc is actually needed because a derivation is compiled locally, the real stdenv is bootstrapped from stdenvLinuxBoot4, resulting in a gcc with PGO. It is pretty hairy, but it might be doable.

Removing PGO from stdenv makes this much simpler and cleaner.

On Mon, Apr 14, 2014 at 4:48 PM, Eelco Dolstra wrote:

> Unfortunately, Nix/Hydra won't be using ccache any time soon, because it's impure (the cache is a global shared variable, i.e. stateful)...

@shlevy (Member) commented Apr 14, 2014

Theoretically, with a Make implementation that runs each command as a derivation, plus recursive nix, we might be able to get a more general and better-grounded version of ccache (and distcc). In the meantime, though, it's not going to happen.

I'd rather not trade a demonstrated 10% increase for a theoretically possible 10x one. If you can show an actually implemented solution that has real performance gains and depends on bit-perfect determinism, fine. Until then, if we can't make PGO deterministic, then users who need determinism will have to have a separate stdenv IMO.

@viric (Member) commented Apr 14, 2014

On Mon, Apr 14, 2014 at 07:48:15AM -0700, Eelco Dolstra wrote:

> Unfortunately, Nix/Hydra won't be using ccache any time soon, because it's impure (the cache is a global shared variable, i.e. stateful)...

and so is the nix store, hashes linking inputs and outputs, no? :)

It should be like some kind of controlled impurity.

@vcunat (Member) commented Apr 14, 2014

I don't think caching was meant to be an important point in this discussion. Anyway, my view is below.

Ccache-like caching IMO doesn't really need binary repeatability; the current "semantic" repeatability of gcc is enough for it (i.e. the generated code does the same thing and has the same ABI). PGO by definition must not affect either of these properties.

BTW, I think we can do similar "memoization" much better than ccache (certainly more efficiently), because we have much better control than usual over what the inputs of each compilation command are (most of the inputs will be on immutable paths, etc.).

@alexanderkjeldaas (Contributor)

On Mon, Apr 14, 2014 at 5:37 PM, Shea Levy wrote:

> I'd rather not trade a demonstrated 10% increase for a theoretically possible 10x one. [...] if we can't make PGO deterministic then users who need determinism will have to have a separate stdenv IMO.

This is too simplified a view of the trade-off. It is deterministic builds that reduce the attack surface enough that a "completely open" build system becomes possible. Deterministic builds are an enabler for lots of things that increase security and build performance.

This "sacrifice" is similar to the performance penalty for using encryption. It only matters under attack.

To me it is hard to understand how a 10% increase in build performance is worth sacrificing the added security for the whole distribution. This performance difference is smaller than the one between -O2 and -O3, while determinism enables a completely different level of security for the distribution.

Asking people to use their own stdenv is not a solution in this case, because it is the trust in the distribution itself that is being damaged. The distributed ISO contains a non-deterministic gcc, thus the ISO cannot be verified by a third party.

As I mentioned, it is possible to overcome this by requiring every end user to compile their own 3-stage profiledbootstrap version of gcc prior to compiling anything locally.

This means that what is distributed, the ISO that is downloaded, is deterministic, but it includes only a bootstrap stdenv that is used to create the final stdenv, which has a gcc with PGO.

That would add another stage to the stdenv bootstrap process, but building a non-PGO gcc is fairly fast compared to the PGO one.

If that's an acceptable road to follow, I'd like to know.
