Fully static Haskell executables - overview issue #43795

Open
48 of 60 tasks
nh2 opened this issue Jul 19, 2018 · 204 comments

@nh2
Contributor

nh2 commented Jul 19, 2018

If you just want to build static Haskell executables right now, follow these instructions.

This issue collects/links all issues and ongoing work to make fully-static building of Haskell executables an excellently-working feature in nixpkgs.

I will regularly update this issue with the progress we are making, and with the sub-goals we encounter on the way that need to be fulfilled to tick off the parent goal.

Obviously contributions to this list are welcome.

Why is static linking desirable?

Static linking means your executable depends on as few things running on the target system as possible.

For Linux, it means that an executable you build will work on any Linux distribution, with essentially infinite forwards-compatibility (because Linux in general does not change its system call interface).

Statically linked executables do not depend on the target system's libc.

Statically linked executables are incredibly easy to deploy (many Go executables are statically linked, which gave them the reputation of being easy to deploy in the devops world).

Statically linked executables can start faster than dynamically linked ones, where dynamic linking has to be performed at startup (of course the startup time depends on the number of libraries to link and the number of symbols they contain).

Statically linked executables are smaller (up to 4x in our measurements), mostly because GHC's -split-sections is far more effective with them (for reasons not yet understood).

Who is working on or contributing to this

Feel free to add yourself!

@nh2
Contributor Author

nh2 commented Jul 19, 2018

Related:

@nh2
Contributor Author

nh2 commented Jul 19, 2018

CC @fosskers from here

@nh2
Contributor Author

nh2 commented Jul 19, 2018

CC @vaibhavsagar whose blog post got me into working on this

@puffnfresh
Member

I know NixOS is careful about licenses and distribution. I'm thinking about libgmp, which is LGPL:

https://www.gnu.org/licenses/gpl-faq.en.html#LGPLStaticVsDynamic

(1) If you statically link against an LGPL'd library, you must also provide your application in an object (not necessarily source) format, so that a user has the opportunity to modify the library and relink the application.

(2) If you dynamically link against an LGPL'd library already present on the user's computer, you need not convey the library's source. On the other hand, if you yourself convey the executable LGPL'd library along with your application, whether linked with statically or dynamically, you must also convey the library's sources, in one of the ways for which the LGPL provides.

Can we ensure either one of these if we enable caching of static Haskell executables?

@Gabriella439
Contributor

@puffnfresh: Or you could use a version of GHC with integer-simple to avoid the GMP dependency

@puffnfresh
Member

@Gabriel439 yeah, is that what we'll do for redistributing static Haskell executables?

@vaibhavsagar
Member

This is great, thanks so much for working on this and creating this issue @nh2!

@dezgeg
Contributor

dezgeg commented Jul 19, 2018

Before doing more of this #43524-style fixing of individual package builds, could someone give https://github.com/endrazine/wcc (which has been packaged in #43014) a try for converting shared objects to static objects?

@nh2
Contributor Author

nh2 commented Jul 19, 2018

@puffnfresh Every Haskell package already has a license field (set upstream, and also available as a field in nixpkgs).

The best solution seems to be to traverse the licenses of all things linked and ensure that Hydra does not build statically linked exes where one of the dependencies is LGPL or stronger (unless the final project itself is also LGPL or stronger anyway).

For executables blocked that way, there should be a fallback that checks whether the issue goes away when integer-simple is used instead; if so, Hydra can build it that way.
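
The license check described above could be sketched in Nix roughly as follows. This is only an illustration under stated assumptions: the helper names (`isCopyleftLicense`, `hasCopyleftLicense`, `mayCacheStatically`) are hypothetical, "LGPL or stronger" is approximated by matching SPDX identifier prefixes, and the list of linked dependencies is passed in by hand rather than discovered by traversing the dependency graph.

```nix
# Sketch only: decide whether a statically linked executable may be cached,
# based on the meta.license fields of the executable and of what it links.
{ pkgs ? import <nixpkgs> {} }:

let
  inherit (pkgs) lib;

  # Assumption: "LGPL or stronger" is approximated by these SPDX prefixes.
  isCopyleftLicense = license:
    lib.any (prefix: lib.hasPrefix prefix (license.spdxId or ""))
      [ "LGPL-" "GPL-" "AGPL-" ];

  # meta.license may be a single license or a list of licenses.
  licensesOf = pkg: lib.toList (pkg.meta.license or []);

  hasCopyleftLicense = pkg: lib.any isCopyleftLicense (licensesOf pkg);

  # Allowed if the executable itself is copyleft anyway,
  # or if none of the linked dependencies is.
  mayCacheStatically = exe: linkedDeps:
    hasCopyleftLicense exe || !(lib.any hasCopyleftLicense linkedDeps);
in
  # Hypothetical query: dhall linked against gmp.
  mayCacheStatically pkgs.haskellPackages.dhall [ pkgs.gmp ]
```

A real implementation would also need to collect the transitive set of linked libraries and retry with an integer-simple GHC when the check fails, as described above.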

@nh2
Contributor Author

nh2 commented Jul 19, 2018

Before doing more of this #43524-style fixing of individual package builds, could someone give https://github.com/endrazine/wcc (which has been packaged in #43014) a try for converting shared objects to static objects?

@dezgeg It sounds interesting and may be worth a try, but there is the independent question of whether I want "binary black magic" (as wcc calls it) in my production binaries.

Nix is already pretty custom in its build process; I wouldn't want an upstream library author to reject my bug report about a segfault with "well you use totally custom tools, if you just used our normally built static libs we would be more eager to look into that". It's a good feature if nix's builds still essentially are a series of "normal" build steps (compiler and linker invocations).

I haven't looked into wcc in detail so I cannot judge its safety, but my intuition tells me I'd rather sponsor Hydra some build servers and HDs than turn dynamic libs into static ones and have the risk of unexpected behaviour.

@jgm

jgm commented Jul 19, 2018

We already provide fully static builds of pandoc with each release, using docker and alpine.
The build process can be found in the linux/ directory in pandoc's repository.

@fosskers

fosskers commented Jul 19, 2018

Thank you for CCing me into this. Could Aura be added to the list of projects above? I haven't yet been able to decipher the instructions for testing it myself, apologies.

@domenkozar
Member

I just want to thank you @nh2 this is excellent work :)

@nh2
Contributor Author

nh2 commented Jul 19, 2018

@fosskers Does aura have a nix derivation in hackage-packages that I could use as a base?

Currently most executables I'm trying are already in nixpkgs, and I just override them with the statify function to build them statically.

I haven't yet been able to decipher the instructions for testing it myself, apologies.

Conceptually very simple, for example for dhall I just added 1 line here:

https://github.com/nh2/static-haskell-nix/blob/ef283274ce193f713082591dd462f4bd3fb4dd1f/survey/default.nix#L98

and then built with

NIX_PATH=nixpkgs=https://github.com/nh2/nixpkgs/archive/925aac04f4ca58aceb83beef18cb7dae0715421b.tar.gz nix-build --no-link survey/default.nix -A working

For some packages it works immediately, for others I have to make some small overrides of their dependencies (see a bit further up in the file), usually to give static libraries for system dependencies.

@infinisil
Member

Having discussed it on IRC, here's a way of building all Haskell executables:
https://gist.github.com/Infinisil/3bdb01689b5f84b71f8538f467159692

Just nix-build that file. This includes broken packages by default, because if you want to see which new packages your change breaks, you need to know which ones were broken already (so you should build this once before your change and once after, then compare).

@nh2
Contributor Author

nh2 commented Jul 20, 2018

Having discussed it on IRC, here's a way of building all Haskell executables

I have incorporated (a large part of) that into https://github.com/nh2/static-haskell-nix/blob/09d0eaa605111ea516dfaa0e7341a71ff1a63042/survey/default.nix#L47-L124 now.

So now we can build survey/default.nix -A allStackageExecutables and see (with --keep-going) how many of those build.

If some binary doesn't build because the libraries it depends on cause problems, those libraries are supposed to be patched here.

Contributors are welcome to help make as many of those build as possible.

@fosskers

fosskers commented Jul 20, 2018

@nh2 Aura isn't yet connected to Nix infrastructure in any way. I suppose I'll pull up my old notes about getting a project set up with Nix and try to build Aura that way at first.

@dezgeg
Contributor

dezgeg commented Jul 20, 2018

It sounds interesting and may be worth a try, but there is the independent question of whether I want "binary black magic" (as wcc calls it) in my production binaries.

Nix is already pretty custom in its build process; I wouldn't want an upstream library author to reject my bug report about a segfault with "well you use totally custom tools, if you just used our normally built static libs we would be more eager to look into that". It's a good feature if nix's builds still essentially are a series of "normal" build steps (compiler and linker invocations).

I understand the concern, yes.

my intuition tells me I'd rather sponsor Hydra some build servers and HDs than turn dynamic libs into static ones and have the risk of unexpected behaviour.

The problem isn't the disk space or resource usage but rather the effort of fixing packages one-by-one to support static linking and the cost of maintaining that. But maybe Haskell packages don't use too many native dependencies.

@nh2
Contributor Author

nh2 commented Jul 20, 2018

I have done a first run of building all executables on Stackage statically.

https://github.com/nh2/static-haskell-nix/blob/09d0eaa605111ea516dfaa0e7341a71ff1a63042/survey/default.nix#L257-L259

See this post for full build outputs.

It took around 3 hours to get there (I built with 5 machines).

The final status line, [2/961/963 built (49 failed), 4245 copied (19215.1 MiB), 210.7 MiB DL] already gives us some information on the success.

The program didn't terminate; right now it's

  • stuck on building darcs-2.14.1
  • dist/build/test-courier/test-courier from the courier package has been stuck in a 100% CPU loop for the last 11 hours

Insights:

  • 6 of the 49 failed executables failed because of generic-haskell-builder: Overridden Cabal package to be used by Setup.hs is ignored #43849
    • this can be fixed by either putting Cabal = Cabal_patched; into the haskellPackagesWithLibsReadyForStaticLinking overrides (I would like to avoid this, so that unpatched Cabal can still be used for most packages)
    • or putting useFixedCabal on all the dependencies used in Setup.hs of those 6 packages (as described in the linked ticket, it can be very cumbersome to figure out which are those dependencies)
  • 15 failed because of cannot find -l* errors, so static libs are missing for those libraries.
    • The libraries in question are:
      • bz2
      • crypto
      • curl
      • expat
      • girepository-1.0
      • glib-2.0
      • gobject-2.0
      • mpfr
      • nettle
      • pcre
      • pq
      • ssl
      • xml2
    • That's not a lot; fixes for this should be done in this section, similar to the other libs we already have overridden there.
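
An override that gives such libraries static archives could look roughly like the overlay below. This is a minimal sketch, not the actual contents of static-haskell-nix: the helper name `enableStatic` is hypothetical, and it assumes the package uses an autoconf-style build where the stdenv attribute `dontDisableStatic` is enough to keep the `.a` files. Some of the libraries listed above will need package-specific flags instead.

```nix
# Sketch only: an overlay that asks a few native libraries to also
# build static archives (.a) next to their shared objects.
final: previous:

let
  # Hypothetical helper: stop stdenv from passing --disable-static.
  enableStatic = pkg: pkg.overrideAttrs (old: {
    dontDisableStatic = true;
  });
in
{
  expat = enableStatic previous.expat;
  mpfr = enableStatic previous.mpfr;
  nettle = enableStatic previous.nettle;
}
```

Saved to a file, such an overlay can be passed via the `overlays` argument when importing nixpkgs.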

@nh2
Contributor Author

nh2 commented Jul 20, 2018

But maybe Haskell packages don't use too many native dependencies.

@dezgeg Yes, I think that is accurate.

From what I posted just above, it looks like most of Stackage's executables will be buildable as long as a set of 15 native libs (13 above and sqlite and lzma which I already have done) are overridden with static support.

@matthewbauer
Member

Great work!

My rule of thumb for static vs shared is:

  • If you're targeting something outside of a Nix store - build static
  • If you're targeting inside of a Nix store - build shared

We pretty much need to support both in Nixpkgs. I can see some of the benefits of always building statically, but I think the advantages of shared linking are much greater.

@nh2
Contributor Author

nh2 commented Jul 20, 2018

I've found and PRd a fix for another cabal issue that needs to be merged to make static linking reasonable:

haskell/cabal#5451

This passes --ld-option through to GHC so that we can specify extra options in configureFlags that are needed only for static linking.

I've also added it to the overview in the issue description.

@nh2
Contributor Author

nh2 commented Jul 20, 2018

My rule of thumb for static vs shared is:

If you're targeting something outside of a Nix store - build static
If you're targeting inside of a Nix store - build shared

This makes sense.

We pretty much need to support both in Nixpkgs

Yes. One of my goals is that nixpkgs becomes the building environment which makes it really easy to build any program so that it works on any Linux distribution, forever.

Other Linux distributions make it really hard to build things statically.

I can see some of the benefits of always building statically, but I think the advantages of shared linking are much greater.

I guess I agree in general but there are some exceptions / other points, like

  • Haskell libraries are already statically linked today in nixpkgs, because
    • this makes for much better dead-code elimination (this may also apply to other languages)
    • I've seen dynamically-linked Haskell programs with 100s of .so dependencies take up to 2 seconds to show their --help text
  • Nix(OS) users don't really benefit from the "dependencies can be upgraded cheaply" idea (e.g. patching a libc vulnerability takes only a tiny libc update on other distros) because with nix all downstream dependencies are rebuilt and re-shipped anyway even for the smallest change.
  • So they benefit only from the "dependencies can be shared" idea of dynamic linking.

I'd find it very cool if somebody could build a typical NixOS with only static exes and compare what the size difference (and perhaps resident memory difference) is.

Gabriella439 added a commit to dhall-lang/dhall-haskell that referenced this issue Jul 21, 2018
This adds a new `dhall-static` target that builds a fully static `dhall`
executable that can be run on any Linux machine (i.e. it is a relocatable
executable that is completely dependency free).  That in turn implies
that even though it is built with Nix it doesn't require that the user
installs Nix to run it (i.e. no dependency on the `/nix/store` or a
Nix installer).  Just copy the standalone executable to any Linux  machine
and it's good to go.

This is based on the following work of @nh2:

* NixOS/nixpkgs#43795
* dhall-lang/dhall-lang#192 (comment)

This also bumps the version of `nixpkgs` used for the normal (non-static)
Dhall build to be the closest revision on `nixpkgs` `master` as the one
used by @nh2 in his work.  Once that work is merged into `nixpkgs` `master`
then both builds can use the same revision from `nixpkgs` `master`.
@nh2
Contributor Author

nh2 commented Apr 7, 2022

I have created @NixOS/static team for people interested in static builds in nixpkgs.

I have started to invite some people that I think are interested: https://github.com/orgs/NixOS/teams/static/members

Somebody please tell me if it is OK to add non-committers to such teams, and whether the Request to join button is available to everybody or just to some subset of GitHub users (org members, committers?).

Edit: I now found the nixpkgs-committers team, so other teams do NOT necessarily get commit access, and it's safe to add anyone to the static team. If you find I'm slow adding you, ping me again! :)

@sternenseemann
Member

Maybe a good opportunity to note some recent movement in upstream nixpkgs:

This is quite a step in nixpkgs, as it is now possible to compile non-trivial applications like niv fully statically using pkgsStatic (after disabling separate bin outputs which cause trouble with Paths_ outputs). Template Haskell appears to work, although it will fail as soon as loading system libraries via TH becomes necessary — this includes GMP which is why the integer-gmp backend is not very useful.

The implementation and investigation was done by @rnhmjoj, thank you!
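
As an illustration of the pkgsStatic route mentioned above, a build of niv with the separate bin output disabled might be written like this. It is a sketch under assumptions: it presumes a nixpkgs revision in which `pkgsStatic.haskellPackages.niv` evaluates, and newer revisions may already apply this setting themselves.

```nix
# Sketch only: build niv fully statically through pkgsStatic.
{ pkgs ? import <nixpkgs> {} }:

pkgs.haskell.lib.overrideCabal pkgs.pkgsStatic.haskellPackages.niv (old: {
  # Keep executables in the main output rather than a separate `bin`
  # output, which is what causes trouble with the Paths_ modules.
  enableSeparateBinOutput = false;
})
```

Running `file` or `ldd` on the resulting executable is a quick way to check whether it really came out statically linked.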

@nh2
Contributor Author

nh2 commented Apr 7, 2022

@sternenseemann @rnhmjoj I received a question today whether it would make sense to move all .a files in nixpkgs to .static outputs.

It sounds good to me, but would there be any problems with having to list .static outputs explicitly in dependencies, or with automatic finding of e.g. pkg-config dependencies in package sets such as pkgsStatic?

@sternenseemann
Member

Doing that would effectively break static linking as it works today, since we assume that the libraries are to be found in foo.lib or foo.out in general.

I assume the idea is to install static archives in static in addition to dynamic libraries in lib for which there's precedent with glibc. I guess that would be possible, but I think it'd need extra motivation or we'd just inflate the size of the binary cache for no reason. It would only be possible to use it very manually (by specifying the outputs explicitly) which would be kind of clunky.
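
For reference, using the glibc precedent "very manually" looks roughly like the following sketch. It assumes a glibc-based Linux system; the derivation name and the tiny C program are made up for illustration.

```nix
# Sketch only: link a C program statically against glibc by listing the
# `static` output explicitly in the build inputs.
{ pkgs ? import <nixpkgs> {} }:

pkgs.runCommandCC "hello-static" {
  # The static archives (libc.a etc.) live in a separate output,
  # so it has to be requested explicitly.
  buildInputs = [ pkgs.glibc.static ];
} ''
  cat > hello.c <<'EOF'
  #include <stdio.h>
  int main(void) { puts("hello"); return 0; }
  EOF
  mkdir -p $out/bin
  $CC -static hello.c -o $out/bin/hello
''
```

Every consumer would have to spell out the `.static` output like this, which is the clunkiness referred to above.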

In general our notion of platforms is pretty strict: it's either dynamic or static (of course some language toolchains, like Go and Haskell, like to link their dependencies statically). Eventually I feel like we may be forced to devise a way to have both, for cases like Haskell which may want to load dynamic libraries (in TH) despite linking statically.

I guess to respond properly I would need to understand better what this idea envisions exactly?

@rnhmjoj
Contributor

rnhmjoj commented Apr 7, 2022

@sternenseemann @rnhmjoj I received a question today whether it would make sense to move all .a files in nixpkgs to .static outputs.

I always thought the standard package set is supposed to contain only dynamically linked binaries and shared objects. So static libraries could either go into a .dev output or be accessible via the pkgsStatic overlay. Anyway, it doesn't look like there's a consensus on this.

See also: #164141

@dark-ether
Contributor

Just out of curiosity: even if each individual statically linked executable is smaller, each one contains all of its dependencies, so won't they eventually take up more space in total than dynamically linked ones, where each dependency is installed only once? If I am right, what would the threshold be?

@sternenseemann
Member

Statically linked Haskell executables are smaller than the sum of their parts, or rather of their dependencies. My guess is that it would take a lot of installed packages for the deduplication effect to outweigh the savings, but that would require some testing.

Other factors are that the store paths of libraries tend to be quite big:

  • We ship the same library basically three times: shared object, static archive, profiling archive. We don't optimize for space efficiency here because we know end user software usually won't reference this. The profiling builds do save Haskell developers looking to use nixpkgs a lot of build time though.
  • GHC can't be separated from the core libraries properly, so dynamically linking any Haskell packages would incur a baseline cost of referencing GHC which is ~1GB.
  • Libraries can't be properly separated from their documentation due to some absolute references. So linking against a library also requires downloading its documentation unconditionally.

These factors mean that dynamic linking is atrocious and only really feasible in cases where you need all the dependencies and GHC anyways (e.g. HLS). There is still some optimization potential in splitting the derivations up more, into smaller derivations (for GHC if the build system allows it eventually) and into smaller outputs for ordinary packages that don't reference each other which is the trouble so far.
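
The threshold asked about above can be made concrete with a back-of-the-envelope model. The symbols are assumptions introduced here for illustration, not measurements from this thread: $n$ is the number of installed Haskell executables, $s$ the average size of a statically linked one, $d$ the average size of a dynamically linked one (excluding its libraries), and $L$ the combined size of the shared closure they all reference (GHC plus the library store paths).

```latex
\begin{align*}
  \text{static total}  &= n\,s \\
  \text{dynamic total} &= L + n\,d \\
  n\,s < L + n\,d
    &\iff n\,(s - d) < L \\
    &\iff n < \frac{L}{s - d} \qquad \text{(when } s > d\text{)}
\end{align*}
```

If $s \le d$, static linking takes less space for every $n$. With $L$ at roughly 1 GB from the GHC reference alone, and a purely hypothetical $s - d$ of 10 MB, the break-even point would be around 100 executables.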

@nh2
Contributor Author

nh2 commented Jul 13, 2023

Update:


* We don't aim for 100% because some need OpenGL and that's currently impossible.

@nomeata
Contributor

nomeata commented Jul 14, 2023

Improved README that explains relation to pkgsStatic

Thanks for that! I keep forgetting these details, it’s good to have them written down.

Is it worth adding “If the package you care about builds statically using nixpkgs’s pkgsStatic, just use that”, or would there be benefits from using static-haskell-nix even then?

@nh2
Contributor Author

nh2 commented Jul 16, 2023

Is it worth adding “If the package you care about builds statically using nixpkgs’s pkgsStatic, just use that”

I think that's probably the case.

There might be minor differences, e.g. static-haskell-nix might have some more overrides for system libraries to make some additional features work, but most Haskell applications probably don't care about those. Other differences might be integer-simple vs integer-gmp: I haven't checked which one pkgsStatic currently uses; for static-haskell-nix it's a top-level option.

@sternenseemann
Member

pkgsStatic has to use native-bignum because the flaw of GHC's RTS linker is that it can only load static archives of Haskell libraries—if we'd use gmp, TemplateHaskell wouldn't work at all.

@avanov

avanov commented Jul 17, 2023

Is there an established interface for mixing pkgsStatic with pkgsCross? This would be helpful in cross-compiling production Linux AMD64 binaries on ARM Macs without having to fall back to Docker ecosystem.

@ShamrockLee
Contributor

Is there an established interface for mixing pkgsStatic with pkgsCross?

Isn't pkgsCross already static?

@avanov

avanov commented Jul 17, 2023

@ShamrockLee you are right, and I wasn't precise, I mean static-haskell-nix as a temporary replacement for pkgsStatic for packages that don't compile with pkgsStatic, i.e. cross-compiling with static-haskell-nix layers propagated to pkgsCross instead of pkgsStatic in the following snippet:

{ nixpkgs ? fetchTarball "https://github.com/NixOS/nixpkgs/archive/bba3474a5798b5a3a87e10102d1a55f19ec3fca5.tar.gz"
, pkgs ? (import nixpkgs {}).pkgsCross.musl64
}:

# ??? Does the following `pkgsStatic` have to be replaced ???
pkgs.pkgsStatic.callPackage ({ mkShell, zlib, pkg-config, file }: mkShell {
    # ...
}) {}

@Artturin
Member

Is there an established interface for mixing pkgsStatic with pkgsCross? This would be helpful in cross-compiling production Linux AMD64 binaries on ARM Macs without having to fall back to Docker ecosystem.

You can get any combination you want by using crossSystem

@Atemu
Member

Atemu commented Jul 17, 2023

You can also just do this: pkgsCross.<...>.pkgsStatic.hello.
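
The crossSystem suggestion from the previous comment could be spelled out roughly as below. This is a sketch under assumptions: it presumes that a musl target with `isStatic = true` is what is wanted, and it has not been checked to evaluate to exactly the same package set as `pkgsCross.musl64.pkgsStatic`. Whether a given package actually cross-compiles from an ARM Mac is a separate question.

```nix
# Sketch only: request a static x86_64 musl target directly via crossSystem.
{ nixpkgs ? <nixpkgs> }:

let
  pkgs = import nixpkgs {
    crossSystem = {
      config = "x86_64-unknown-linux-musl";
      # Assumption: build everything for static linking on this platform.
      isStatic = true;
    };
  };
in
  # Packages taken from this set are built for the cross target.
  pkgs.hello
```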

@bgamari
Contributor

bgamari commented Dec 19, 2023

Note: Currently static GHC 9.8.1 builds are broken due to #275304.

@nh2
Contributor Author

nh2 commented Jun 15, 2024

Update:

  • New release of static-haskell-nix on top of nixos-24.05. See release details.
  • Main PR: Nixos 24.05 nh2/static-haskell-nix#127
    • Lots of fixes upstreamed to nixpkgs and upstream software (list)
    • Less manual overrides: Automatic setting of dontDisableStatic
  • New binary cache
  • Improved README, with open questions that need help
  • Successfully builds 367 Stackage executables statically, fails on 23.
    • That's 94% of Stackage executables! *

* We don't aim for 100% because some need OpenGL and that's currently impossible.
