Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document why hls binaries are statically linked (and track ghc changes to allow loading dynamic libs within a static executable) #2000

Open
nh2 opened this issue Jul 5, 2021 · 40 comments
Labels

Comments

@nh2
Copy link
Member

nh2 commented Jul 5, 2021

I learned from @mpickering that HLS distributes GHC binaries with DYNAMIC_GHC_PROGRAMS=YES, and that this creates various problems for TemplateHaskell regarding dynamic linking.

I am interested in this topic because of NixOS/nixpkgs#129245.

I was trying to find documentation on the fact that, or why, HLS does that, but couldn't find any.

Could that be written down somewhere? It would make users' and packagers' life easier.


On the question itself: Would DYNAMIC_GHC_PROGRAMS=YES be more problematic to distribute somwhow? It appears to me that if e.g. the VSCode plugin downloader can download GHC, it could also download some .so files with it.

@pepeiborra
Copy link
Collaborator

pepeiborra commented Jul 5, 2021

There seems to be some confusion here. DYNAMIC_GHC_PROGRAMS is a ghc flag and has nothing to do with HLS.

I think you are referring to the fact that distribution HLS binaries are linked statically, that is, both Haskell packages and external libraries (modulo glibc) are statically linked in a single binary. This is done for ease of distribution.

The problem comes because the ghc package is linked statically, which means the runtime the RTS linker will be used to dynamically load packages and their external dependencies, as documented here. Unfortunately the RTS linker is far from perfect and has issues in all platforms.

GHC binaries switched to DYNAMIC_GHC_PROGRAMS=YES in 7.8/7.10, except on Windows where I think they are still statically linked. This was done for good reason: TH and GHCi are able to use dlopen to load packages and libraries.

Ideally, the choice of linker/loader would be decoupled from whether the GHC package is linked statically or dynamically. I remember asking @bgamari at some point to write down what needs to be done in a ticket for future reference.

@nh2
Copy link
Member Author

nh2 commented Jul 5, 2021

There seems to be some confusion here. DYNAMIC_GHC_PROGRAMS is a ghc flag and has nothing to do with HLS.

I think you are referring to the fact that distribution HLS binaries are linked statically

Yes, that's what I meant with "HLS distributes GHC binaries with DYNAMIC_GHC_PROGRAMS=YES".

as documented here

Thanks for linking me that page. It occurred to me now that that the "What happens if we build a static executable and link in the GHC package?" mentioned in there also applies to the ghc binary itself.

This is done for ease of distribution.

Could you elaborate in which way it makes distribution easier? Is it just have less files around, or are there othere benefits?

Thanks!

@pepeiborra
Copy link
Collaborator

There seems to be some confusion here. DYNAMIC_GHC_PROGRAMS is a ghc flag and has nothing to do with HLS.
I think you are referring to the fact that distribution HLS binaries are linked statically

Yes, that's what I meant with "HLS distributes GHC binaries with DYNAMIC_GHC_PROGRAMS=YES".

as documented here

Thanks for linking me that page. It occurred to me now that that the "What happens if we build a static executable and link in the GHC package?" mentioned in there also applies to the ghc binary itself.

This is done for ease of distribution.

Could you elaborate in which way it makes distribution easier? Is it just have less files around, or are there othere benefits?

Thanks!

Yes

@nh2
Copy link
Member Author

nh2 commented Jul 5, 2021

Ah sorry, I just noticed what introduced confusion. I wrote:

I learned from @mpickering that HLS distributes GHC binaries with DYNAMIC_GHC_PROGRAMS=YES,

but I meant to write:

I learned from @mpickering that HLS distributes GHC binaries with DYNAMIC_GHC_PROGRAMS=NO

(that the GHC binaries distributed by HLS use NO).

If the main benefit to HLS of using DYNAMIC_GHC_PROGRAMS=NO is "less files", wouldn't the fact that TH does not work well with it be a big motivator to switch to YES instead, like the official GHC bindists?

More files seem like an OK tradeoff to make vs wrestling dynamic loader problems.

@pepeiborra
Copy link
Collaborator

I simply don't have the expertise to set up dynamically linked binaries for distribution (other than a simple Docker image). It would be awesome if someone else can make that happen.

@hasufell
Copy link
Member

hasufell commented Aug 3, 2021

I consider this a GHC bug and I'd advise to pursue an upstream fix before going down the road of dynamic linking on linux. Static linking is a huge win for compatibility issues. Ultimately, we want GHC statically linked, so stack, ghcup etc don't have to guess which bindist is compatible with distro X version Y.

@adamse
Copy link
Contributor

adamse commented Aug 25, 2021

@hasufell which part do you consider a ghc bug?

@hasufell
Copy link
Member

@adamse

As @pepeiborra said

Ideally, the choice of linker/loader would be decoupled from whether the GHC package is linked statically or dynamically. I remember asking @bgamari at some point to write down what needs to be done in a ticket for future reference.

@adamse
Copy link
Contributor

adamse commented Aug 25, 2021

Ah, that's a good point. I've thought about a little bit, and I think it should be possible :)

@pepeiborra
Copy link
Collaborator

Are you planning to work on this Adam? That would be awesome

@adamse
Copy link
Contributor

adamse commented Nov 5, 2021

I've made a proof of concept at https://gitlab.haskell.org/ghc/ghc/-/commit/7e77131440115c10cf3b7d3b14b1cce57a5dc21f

build with

./hadrian/build --flavour=default+no_dynamic_ghc stage2:exe:ghc-bin stage1.ghc-bin.ghc.link+="-optl=-export-dynamic -debug" 
./hadrian/build --flavour=default+no_dynamic_ghc _build/ghc-stage2 stage1.ghc-bin.ghc.link+="-optl=-export-dynamic -debug" 

you should then be able to observe it loading Haskell .so's in the debug output, example command:

./_build/ghc-stage2 -e 'import Data.Text' -e 'import Data.Text.IO as T' -e 'T.putStrLn (pack "hej")' +RTS -Dl

I think using -export-dynamic is a bit too big and proper solution would use --dynamic-list with the RTS symbols like I try in https://github.com/adamse/dynload-poc/blob/main/Makefile.

And ofc the GHC changes need some thought for what the user interface should look like.

@pepeiborra
Copy link
Collaborator

Right, so with this GHC patch, if we build HLS with --export-dynamic/--dynamic-list, then the static HLS binary will use the system linker to load THs?

@adamse
Copy link
Contributor

adamse commented Nov 5, 2021

Yes, I think that should work.

@hasufell
Copy link
Member

hasufell commented Nov 5, 2021

Can someone backport that patch to 8.10.7 or 9.0.1 so we can test?

@adamse
Copy link
Contributor

adamse commented Nov 5, 2021

@adamse
Copy link
Contributor

adamse commented Nov 6, 2021

For discussion how we proceed in GHC if it turns out to work for HLS: https://gitlab.haskell.org/ghc/ghc/-/issues/20628

@adamse
Copy link
Contributor

adamse commented Nov 19, 2021

do we have any reproducer for th issues that i can try this with?

@adamse
Copy link
Contributor

adamse commented Nov 20, 2021

I tried the reproducer from #1982 (https://github.com/aschmois/th-pq-segfault) with my latest GHC patch (https://gitlab.haskell.org/ghc/ghc/-/merge_requests/7037).

I did not manage to build hls/ghcide with ghc head but I used ghci as a reasonable proxy for how hls uses the ghc api. I was able to successfully load the reproducer.

@pepeiborra
Copy link
Collaborator

Could you build static HLS 1.5.0 binaries for Mac OS with your patched GHC 8.10.7 and upload the binaries somewhere? Then @jneira can share them with users for testing.

@jneira
Copy link
Member

jneira commented Nov 20, 2021

nice!

I used ghci as a reasonable proxy for how hls uses the ghc api

I am not sure 100% but iirc some of the issues were not reproduced in ghci. Did you check that ghci failed in that reproducer with a standard ghc? thanks!

@adamse
Copy link
Contributor

adamse commented Nov 20, 2021

yes, I managed to repro the issue (or an issue...) without my new flag: https://github.com/adamse/th-pq-segfault#readme (note I had linked the debug RTS so we actually get an error message instead of a segfault)

@adamse
Copy link
Contributor

adamse commented Nov 20, 2021

@pepeiborra I will build 8.10 and HLS with that, but I can only do Linux.

@jneira
Copy link
Member

jneira commented Nov 21, 2021

oh yeah, maybe would be too much CI abuse open a mr against the 9.0 branch? it would be good to have that change backported anyways

@adamse
Copy link
Contributor

adamse commented Nov 21, 2021

I made a backport to the 8.10 branch: https://gitlab.haskell.org/ghc/ghc/-/merge_requests/7039, hopefully CI will produce some bindists.

HLS changes required here: https://github.com/adamse/hls

@pepeiborra
Copy link
Collaborator

HLS changes required here: https://github.com/adamse/hls

It might be worth submitting a PR using CPP to guard the change, like:

#if defined(GHC_PATCHED_UNBOXED_BYTECODE)
= BCOLinkable
#else

@jneira
Copy link
Member

jneira commented Jan 31, 2022

@adamse hi! not sure if you read the comment above about the CPP guard to use your patch (#2000 (comment))
any progress on th ghc side? i've just check https://gitlab.haskell.org/ghc/ghc/-/merge_requests/7037 is not merged yet
please let us know if we could help in any way to get it merged
thanks!

@mpickering
Copy link
Contributor

@jneira The patch isn't merged because it doesn't appear to be functionally correct (ie the testsuite the doesn't pass).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

7 participants