haskell.compiler.ghc865Binary: add powerpc64le bootstrap #168213

Merged
sternenseemann merged 2 commits into NixOS:staging on Jul 1, 2022
Conversation

ghost
@ghost ghost commented Apr 11, 2022

Marked as draft until staging reopens on 2022-May-15.

Description of changes

This PR adds the ghc 8.6.5 bootstrap binaries for powerpc64le, as recommended by @sternenseemann here:

#168113 (comment)

Things done
  • Built on platform(s)
    • powerpc64le-linux
  • Tested compilation of all packages that depend on this change using `nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD"`. Note: all changes have to be committed; see also the nixpkgs-review usage notes.
  • Fits CONTRIBUTING.md.

@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild 10.rebuild-linux: 1-10 labels Apr 11, 2022
@ghost ghost marked this pull request as ready for review April 11, 2022 04:57
Member

@sternenseemann sternenseemann left a comment

I'll convert this to a draft until #168199 is merged after which this will become useful. Are there plans for CI / testing / Hydra jobset already?

Unfortunately it seems that GHC HQ's powerpc builders broke or were decommissioned some time after 8.6.5, so we can't bootstrap anything after 8.10.7 (GHC only supports the previous two major releases for bootstrapping). We can establish a chain for now, i.e. build `ghc902`, `ghc922` and `ghcHEAD` using `ghc8107` (not binary!) on `powerpc64le-linux`; that'll work, but be annoying as it requires building two full GHCs.
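
A minimal sketch of that chain as a plain Nix expression, for illustration only. The attribute names `ghc865Binary`, `ghc8107`, `ghc902` and `ghc922` come from this thread; `ghc8107Binary`, the helper names, and the selection logic are assumptions and not the actual nixpkgs code.

```nix
# Hypothetical sketch: which compiler bootstraps each GHC on a given system.
# The result is a plain attribute set of names, not real derivations.
let
  # Assumption: platforms whose only upstream bindist is 8.6.5 must use a
  # source-built 8.10.7 to bootstrap the 9.x releases.
  needsSourceBuiltBoot = system: system == "powerpc64le-linux";

  bootCompilerFor = system: {
    # 8.10.7 can be built directly by the 8.6.5 binary distribution.
    ghc8107 = "ghc865Binary";
    # Newer releases need a boot compiler more recent than 8.6.5.
    ghc902 = if needsSourceBuiltBoot system then "ghc8107" else "ghc8107Binary";
    ghc922 = if needsSourceBuiltBoot system then "ghc8107" else "ghc8107Binary";
  };
in
{
  powerpc64le = bootCompilerFor "powerpc64le-linux";
  x86_64 = bootCompilerFor "x86_64-linux";
}
```

Evaluating the file with `nix-instantiate --eval --strict` shows the two-step chain on `powerpc64le-linux`: the 8.6.5 binary builds `ghc8107`, which in turn builds the 9.x compilers.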

pkgs/development/compilers/ghc/8.6.5-binary.nix (review thread: outdated, resolved)
@sternenseemann sternenseemann marked this pull request as draft April 11, 2022 11:20
@ghost
Author

ghost commented Apr 11, 2022

I'll convert this to a draft until #168199 is merged after which this will become useful.

Sigh. I would appreciate it if you would give an approval-review to that PR then, or at least poke the people who are in a position to merge it.

One of the most dissatisfying parts of my experience with the mips64 support was everybody kind of dodging merges/reviews by making PR X a draft until PR Y merged. This is especially frustrating since github refuses to implement any form of bug dependencies, something bugzilla has been able to do for decades...

In the mips64 case, the uber-blocker PR is this one: #166879. If you can prod people into taking that seriously you'll be my personal hero. A ton of packages vendored an impurity from libtool, and waiting for all of those packages to re-run autoconf is going to take several years. Due to dumb luck the "popular platforms" don't hit the impurity, so it's hard to get people to take the problem seriously since "works for me on x86". The fix is small, simple, and confined to the affected platforms, but it's also not pretty (substituteInPlace) because the vendoring it compensates for is not pretty.

Are there plans for CI / testing / Hydra jobset already?

Western Semiconductor offered to donate about $1,000 worth of mips64 hardware to nixorg with no strings attached. @grahamc mediated discussions with the nixorg board (or whatever it is called), who initially agreed to accept the donation. I asked for a shipping address and then things sort of went to radio silence.

Anyways I am open to possibly doing something similar for powerpc64le, but since the hardware is about 12 times as expensive (and about 100 times higher-performance) it would be a longer process to get that approved and there would be additional conditions -- like agreeing to return the hardware if it isn't being used.

@ghost
Author

ghost commented Apr 11, 2022

Unfortunately it seems that GHC HQ's powerpc builders broke or were decommissioned some time after 8.6.5, so we can't bootstrap anything after 8.10.7 (GHC only supports the previous two major releases for bootstrapping).

With this commit ghc 8.10.7 built without any problems (compiled by 8.6.5).

We can establish a chain for now, i.e. build `ghc902`, `ghc922` and `ghcHEAD` using `ghc8107` (not binary!) on `powerpc64le-linux`; that'll work, but be annoying as it requires building two full GHCs.

I'm trying these now.

@ghost
Author

ghost commented Apr 11, 2022

I'm trying these now.

Pushed a commit to allow ghc9xx to evaluate. Build in progress.

@ghost
Author

ghost commented Apr 11, 2022

Pushed a commit to allow ghc9xx to evaluate. Build in progress.

ghc902 built successfully.

Now building ghc921.

@sternenseemann
Member

Sigh. I would appreciate it if you would give an approval-review to that PR then, or at least poke the people who are in a position to merge it.

I'll try to look into it soon (if I'm lucky I have access to hardware to test it).

I can sympathize with your frustration. I don't think it's a technical problem (what isn't solved already should be possible to), really, which is why I think everyone is so hesitant. We need to decide whether to add support for this (these) new platform(s) and then discuss how and to what extent to maintain that support.

I must say that we have a really bad track record with decision making: the RFC process is slow and has a tendency to delve into trivial things, and apart from that it kind of hinges on individuals (this is a general problem, e.g. adding machines to Hydra can be done by a low single-digit number of people to my knowledge). For this we don't need an RFC, as there's enough precedent for platforms being added without one, but an RFC also serves as diffusion of responsibility.

Also in my mind I am increasingly hyper-aware that adding support for something is always a risk, i.e. you are forced to keep it that way afterwards and it is often difficult to gauge how feasible that'll be in the future (especially with key people leaving, lack of CI / resources for testing / …). This is not to say that I'm not in favor, just to explain my (and maybe others'?) hesitation. With the improved bootstrapping by having the necessary tools cross compiled by Hydra first and the fact that it is Linux platforms we are talking about, it should be pretty safe overall.

If I recall correctly, the riscv64 support was also stuck in limbo for quite some time. I suppose mips and powerpc may also be a tougher sell / spark less interest for people (as opposed to the recent additions relating to aarch64/riscv64).

One of the most dissatisfying parts of my experience with the mips64 support was everybody kind of dodging merges/reviews by making PR X a draft until PR Y merged.

At least I can assure you that I'm very prepared to merge this PR, as it is actually my area of responsibility.

Western Semiconductor offered to donate about $1,000 worth of mips64 hardware to nixorg with no strings attached. @grahamc mediated discussions with the nixorg board (or whatever it is called), who initially agreed to accept the donation. I asked for a shipping address and then things sort of went to radio silence.

Anyways I am open to possibly doing something similar for powerpc64le, but since the hardware is about 12 times as expensive (and about 100 times higher-performance) it would be a longer process to get that approved and there would be additional conditions -- like agreeing to return the hardware if it isn't being used.

That sounds pretty exciting.

@sternenseemann
Member

Oh, and the commit message(s) should read haskell.compiler.ghc865Binary: … for the bindist commits when you squash at the end.

@ghost
Author

ghost commented Apr 11, 2022

Now building ghc921.

Completed successfully.

Have to leave the office for a bit, will reply to the three comments above when I get back.

@sternenseemann
Member

Now building ghc921.

Completed successfully.

To clarify, you mean ghc922, right? ghc921 was removed a few weeks ago on master.

@ghost
Author

ghost commented Apr 12, 2022

We need to decide whether to add support for this (these) new platform(s) and then discuss how and to what extent to maintain that support.

Using the word "support" is really misleading here.

It's an unfortunate quirk of the English language that the same word is used to describe both (a) the burden of assisting users on a given platform and (b) merging changes which allow compilation to succeed on a platform. I wish we had different words for these two things; all I want is the latter, so these changes don't have to be kept out of tree where they will get merge-conflicted by every treewide or typo on an adjacent line.

Also in my mind I am increasingly hyper-aware that adding support for something is always a risk

Maybe for Darwin or something equally weird (windows?). But I really don't see things this way for cases like mips64-linux and powerpc64le-linux where 95%+ of the changes involve adding another line to (the nixlang equivalent of) an existing "switch(architecture) { ... }" block. The rest are one-liners or else (like #166879) fix legitimate bugs that x86_64/aarch64 were unaffected by due to a lucky quirk.
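
For readers unfamiliar with the idiom, here is a rough sketch of what such a "switch(architecture)" block tends to look like in Nix. Everything in it is hypothetical: `bindistFor` and `select` are illustrative names and the hash fields are placeholders, not values from nixpkgs.

```nix
# Hypothetical per-architecture table keyed by Nix system string.
let
  bindistFor = {
    x86_64-linux = { arch = "x86_64"; sha256 = "<placeholder>"; };
    aarch64-linux = { arch = "aarch64"; sha256 = "<placeholder>"; };
    # Adding a platform is typically one more entry like this:
    powerpc64le-linux = { arch = "powerpc64le"; sha256 = "<placeholder>"; };
  };

  # Look up the entry, failing loudly on systems that are not listed.
  select = system:
    bindistFor.${system} or (throw "no binary distribution for ${system}");
in
select "powerpc64le-linux"
```

Because unlisted systems hit the `throw`, a new entry cannot change evaluation on platforms that were already present in the table.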

If I recall correctly, the riscv64 support was also stuck in limbo for quite some time. I suppose mips and powerpc may also be a tougher sell / spark less interest for people

I bristle a bit at this.

I can order mips64 routers off of Amazon, today -- commercially profitable products that have multiple gigabit ethernet ports and optical fiber SFP cages, and I can put 8GB of ram into them. Riscv64 is still a mishmash of sold-at-a-loss dev boards and all but a very few have soldered-down ram inadequate to build nixpkgs. The best riscv64 board, the one I wanted to buy, was discontinued after a run of something like 1000 units. That was almost four years ago, and commercially-available riscv64 machines are still scarce. SiFive really wants to get out of the dev-board manufacturing business; they only do it so Debian won't delist riscv64.

I'm sure someday riscv will supplant mips. It would've been called "MIPS VI" if not for trademarks -- Dave Patterson and his collaborators created both of them! But people are ridiculously optimistic about how soon that's going to happen. It's a bit silly to see mips64 "getting osborned" by its own offspring!

@ghost
Author

ghost commented Apr 12, 2022

Oh, and the commit message(s) should read haskell.compiler.ghc865Binary: … for the bindist commits when you squash at the end.

Done

@ghost ghost changed the title ghc: add powerpc64le bootstrap haskell.compiler.ghc865Binary: add powerpc64le bootstrap Apr 12, 2022
@ghost
Author

ghost commented Apr 12, 2022

To clarify, you mean ghc922, right? ghc921 was removed a few weeks ago on master.

No, it was actually ghc921. My personal fork of nixpkgs (which now supplies the userspace for most of my machines) is based off of 462ebd02ea333a997aff8c787651d9c7a2bcc707 which is now a month old. I will rebase sometime this week.

The main motivation for the torrential flood of PRs I've submitted the last two months was to cut down the number of patches I carry in order to be able to rebase weekly, but I'm still not quite there...

@sternenseemann
Member

Using the word "support" is really misleading here.

It's an unfortunate quirk of the English language that the same word is used to describe both (a) the burden of assisting users on a given platform and (b) merging changes which allow compilation to succeed on a platform. I wish we had different words for these two things; all I want is the latter, so these changes don't have to be kept out of tree where they will get merge-conflicted by every treewide or typo on an adjacent line.

I do mean support in the latter sense: This still means that you have to keep it working, check for regressions, maintain or add platform specific workarounds, …

A recipe against this can be having actual downstream users that complain when things break – and probably also keeping expectations low until we know we can pull it off.

Maybe for Darwin or something equally weird (windows?).

That's true. Darwin is an incredible pain compared to e.g. aarch64-linux where the problems are usually limited to wrong wordsize assumptions or similar things.

But I really don't see things this way for cases like mips64-linux and powerpc64le-linux where 95%+ of the changes involve adding another line to (the nixlang equivalent of) an existing "switch(architecture) { ... }" block. The rest are one-liners or else (like #166879) fix legitimate bugs that x86_64/aarch64 were unaffected by due to a lucky quirk.

Adding a conditional fix for a platform is easy, but it's hard to maintain: it's often difficult to tell whether it is still necessary after an update, and people updating the expression often lack the hardware to investigate. In short, the risks I was alluding to are bitrot and subsequent regression of the platform support.

I bristle a bit at this.

I'm not saying that this sentiment is rational, but due to this hype more people have at least played around with the riscv64 cross-toolchain or spun something up in an emulator than is the case for more serious platforms (at least that's my impression in the context of nixpkgs).

if I'm lucky I have access to hardware to test it

Sadly on classic powerpc big endian it seems, don't remember whether you wanted to tackle that.

@ghost
Author

ghost commented Apr 12, 2022

This still means that you have to keep it working, check for regressions, maintain or add platform specific workarounds, …

In nixpkgs, merging architectural conditionals definitely does not imply that level of support.

About half of the work it took me to get nixpkgs working on powerpc64le was already committed before I started. I'm really glad that stuff was allowed in, even though it didn't amount to working support. It made the size of my task much more manageable.

and probably also keeping expectations low until we know we can pull it off

I agree with this, and would say that this is already effectively the unwritten policy for nixpkgs. Anything I find that works on anything other than {x86_64,aarch64}-linux I consider to be an unexpected bonus!

It's often difficult to tell whether it is still necessary after an update

This is an excellent point, and since about two weeks ago I've begun being very careful to clearly state "this can be removed when..." anywhere I add architecture-specific code that isn't part of a Rosetta Stone. Here is one example. In #166879 I went so far as to have the workaround automatically insist on itself being removed as soon as it is no longer necessary.

So yes, I agree, and I would definitely support some kind of official policy that any architecture-specific code other than a Rosetta Stone entry or a bootstrap tarball hash must clearly explain under what conditions it can be removed, and checking those conditions shouldn't require access to hardware. That would be a good policy to have.
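
One way such a self-expiring workaround could be written is sketched below, as a rough illustration of the idea rather than the code from #166879. The names `fixedIn` and `workaroundFor`, the version numbers, and the command string are all hypothetical.

```nix
let
  # Hypothetical: first upstream version assumed to no longer need the fix.
  fixedIn = "2.0";

  # Return the workaround while it is still needed. Once the package version
  # reaches `fixedIn`, evaluation fails with a reminder to delete it.
  workaroundFor = version: workaround:
    if builtins.compareVersions version fixedIn >= 0
    then throw "workaround obsolete since ${fixedIn}; please remove it"
    else workaround;
in
# Placeholder command; OLD and NEW stand in for the real strings.
workaroundFor "1.9" "substituteInPlace ./configure --replace OLD NEW"
```

The removal condition here depends only on a version comparison, so whoever bumps the package can check it without access to the affected hardware.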

but due to this hype

:)

Sadly on classic powerpc big endian it seems, don't remember whether you wanted to tackle that.

Well, next on my list after powerpc64le is big-endian mips64. I've heard that big-endian is a real headache because so much open source software is developed on little-endian hardware.

If that turns out to be not so bad I can look at big-endian powerpc. What kinds of powerpc systems are commonly operated in big-endian mode? I don't remember when powerpc became dual-endian, but I think it was quite a while ago, wasn't it? Do people run them big-endian due to needing a big-endian kernel for some reason?

@ofborg ofborg bot added 10.rebuild-darwin: 1 10.rebuild-linux: 501+ 10.rebuild-linux: 5001+ and removed 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild 10.rebuild-linux: 0 This PR does not cause any packages to rebuild labels Jun 27, 2022
@sternenseemann
Member

Still waiting on #169378, right?

@ghost ghost marked this pull request as draft June 28, 2022 21:57
@ghost ghost marked this pull request as ready for review June 28, 2022 21:57
@github-actions github-actions bot added 6.topic: haskell and removed 6.topic: stdenv Standard environment labels Jun 28, 2022
@ghost
Author

ghost commented Jun 28, 2022

How embarrassing: my last force-push was from the wrong branch on my end. I have fixed this.

Is there any way to resolve merge conflicts on github without doing a force-push, and without having to use their text-editor-in-the-web-browser? Force-pushes are so error-prone...

With git format-patch/git am, the thing being uploaded is a range of commit-hashes rather than just the tip. I guess github must not have this, because if it did we wouldn't need to set the "base branch" manually via the web UI.

I've started draftifying before each force-push to prevent mistakes from causing mass-pings.

Still waiting on #169378, right?

~~It looks like that is no longer a prerequisite for this PR, when cherry-picked onto master -- probably due to version bumps in binutils or glibc since it was opened.

I am re-verifying at staging.~~

Sadly, #169378 is still required in order for stdenv to build on powerpc64le-linux.

@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild 10.rebuild-linux: 0 This PR does not cause any packages to rebuild and removed 10.rebuild-linux-stdenv This PR causes stdenv to rebuild 10.rebuild-darwin: 1 10.rebuild-darwin: 1-10 10.rebuild-linux: 501+ 10.rebuild-linux: 5001+ labels Jun 28, 2022
@ghost
Author

ghost commented Jun 30, 2022

Sadly, #169378 is still required in order for stdenv to build on powerpc64le-linux.

#169378 has merged, so this PR is now usable.

@ghost ghost marked this pull request as draft June 30, 2022 10:58
@ghost ghost marked this pull request as ready for review June 30, 2022 10:59
@ghost
Author

ghost commented Jun 30, 2022

Rebased.

@sternenseemann sternenseemann merged commit 86c7c09 into NixOS:staging Jul 1, 2022
@ghost
Author

ghost commented Aug 21, 2022

After using this for the past few months I continue to encounter intermittent segfaults from the downloads.haskell.org-supplied GHC binaries for 8.6.5. The segfaults are extremely nondeterministic. I've always been able to proceed by simply retrying (never more than 10 attempts).

Once that compiler builds a ghc-8.10.7 from source, everything is completely reliable.

Unfortunately that 8.6.5 binary is the only powerpc64le binary I can find on downloads.haskell.org.

@sternenseemann, would it be acceptable for me to update this expression to patchelf a binary from Debian? If so, are there any constraints I should observe while attempting to make this work, like "must be older/newer than version X"? My instinct would be to start with the oldest binary I can find which nixpkgs is able to use, in order to maximize the number of GHC versions which can be built from it.

I haven't tried this yet, but wanted to check if there would be any policy obstacles before I sink a bunch of time into this.

@sternenseemann
Member

Using binaries from other distributions for bootstrapping purposes is acceptable and has precedent in nixpkgs (in the case of GNAT).

Usually we try to avoid bootstrapping chains to minimize the time we need to spend on bootstrapping a compiler on a stdenv rebuild, since their builds are bottlenecks by nature. However, in this case, I'd say it's up to you – we don't build powerpc64le on Hydra and don't have many support requirements for the platform. Packaging 8.6.* would allow us to build any GHC in nixpkgs after a long chain, whereas packaging 8.10.* would mean that it's much quicker to build the default GHC.

As a side note, you may want to consider making a separate GHC derivation for the Debian ones, if there is not much code to be shared with the regular bindists from GHC HQ. I guess in the case of a Debian binary, you wouldn't have a configure script etc.

@ghost
Author

ghost commented Sep 24, 2022

would it be acceptable for me to update this expression to patchelf a binary from Debian?

So I've been struggling with this and it is not really looking like a net win at the moment.

Debian doesn't package anything other than 8.0.1, 8.4.4, 8.8.4, and 9.0.2, so there won't be any meaningful reduction in the bootstrap-chain length. They don't have 8.10.x binaries, nor 9.4.x.

Trying to get Debian's binaries to work inside nixpkgs is getting pretty ugly. I had no idea how many places ghc writes absolute paths into its install... and glibc is not really cooperating (missing GLIBC_PRIVATE symbols, etc).

I think leaving the flaky 8.6.5 bootstrap is the best we can do unless GHC HQ is willing to post a blessed build of something more recent.

Labels
6.topic: haskell 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild 10.rebuild-linux: 0 This PR does not cause any packages to rebuild