Work around too-large LLVM files #7

Closed
dscho opened this issue Apr 2, 2023 · 9 comments

@dscho
Member

dscho commented Apr 2, 2023

Last night's sync failed with one error and two warnings:

[...]
+ git push origin refs/heads/main
remote: warning: File clangarm64/bin/llvm-exegesis.exe is 78.99 MB; this is larger than GitHub's recommended maximum file size of 50.00 MB        
remote: warning: File clangarm64/bin/libclang-cpp.dll is 52.36 MB; this is larger than GitHub's recommended maximum file size of 50.00 MB        
remote: error: Trace: 309641135192764786c9dc4a0d2bbb1accf9f8214e1fc5d81d84470a67062f34        
remote: error: See https://gh.io/lfs for more information.        
remote: error: File clangarm64/bin/libLLVM-16.dll is 103.60 MB; this exceeds GitHub's file size limit of 100.00 MB        
remote: error: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com./
[...]

Unfortunately, there is no way around the 100MB restriction (other than going the Git LFS route).
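
A minimal sketch of a check that could flag such files before a push, using the two thresholds from the log above (50 MB recommended maximum, 100 MB hard limit). The function names `classify_size` and `scan_tree` are hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption about how GitHub counts, not something the log confirms.

```python
import os

# Thresholds from the push log; 1 MB = 1024 * 1024 bytes is an assumption.
WARN_BYTES = 50 * 1024 * 1024
LIMIT_BYTES = 100 * 1024 * 1024


def classify_size(size_bytes):
    """Return 'error', 'warning' or 'ok' for a file of the given size."""
    if size_bytes > LIMIT_BYTES:
        return "error"
    if size_bytes > WARN_BYTES:
        return "warning"
    return "ok"


def scan_tree(root):
    """Return sorted (relative path, size, verdict) tuples for files that are not 'ok'."""
    findings = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into the repository metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            verdict = classify_size(size)
            if verdict != "ok":
                findings.append((os.path.relpath(path, root), size, verdict))
    return sorted(findings)
```

Calling `scan_tree` on the worktree root would, under these assumptions, report `libLLVM-16.dll` as an error and the two other files from the log as warnings.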

One possibility to address this that comes to my mind is to store a compressed libLLVM-16.dll.gz (it compresses down to 34MB via gzip-1) and uncompress it via /etc/profile.d/<something> (and add the uncompressed file to .gitignore).
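
The compress-and-restore idea can be sketched as follows. This only illustrates the logic in Python; the proposal itself would live in a shell script under /etc/profile.d/. The helper names `compress_file` and `decompress_if_missing` are hypothetical, and the default level of 1 mirrors the gzip-1 setting mentioned above.

```python
import gzip
import os
import shutil


def compress_file(src, dst, level=1):
    """Write a gzip-compressed copy of `src` to `dst`."""
    with open(src, "rb") as fin, gzip.open(dst, "wb", compresslevel=level) as fout:
        shutil.copyfileobj(fin, fout)


def decompress_if_missing(gz_path, out_path):
    """Restore `out_path` from `gz_path` unless it already exists.

    Returns True if the file was restored, False if nothing was done.
    """
    if os.path.exists(out_path):
        return False
    tmp_path = out_path + ".tmp"
    with gzip.open(gz_path, "rb") as fin, open(tmp_path, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    # Rename only after a complete write so a partial file is never seen.
    os.replace(tmp_path, out_path)
    return True
```

The existence check keeps the restore step cheap on every later shell start. It would not notice a newer `.gz` arriving via an update, which is one of the moving parts such a scheme would have to handle.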

/cc @dennisameling

@dennisameling
Collaborator

Hmm - that's unfortunate. Is Git LFS totally out of the question here? It seems to be a good solution for this use case. Not sure what that'd mean in terms of performance and pricing though. The free plan seems to be rather limited.

> One possibility to address this that comes to my mind is to store a compressed libLLVM-16.dll.gz (it compresses down to 34MB via gzip-1) and uncompress it via /etc/profile.d/<something> (and add the uncompressed file to .gitignore).

That sounds a bit painful, as logic will need to be updated in various places for compressing and decompressing it. And since there are basically two ways to update this repo locally (through a git pull or update-via-pacman.ps1), it just feels to me that there's too many moving parts involved that might also break in the future. WDYT?

@dscho
Member Author

dscho commented Apr 3, 2023

> Is Git LFS totally out of the question here?

I'd rather avoid it, TBH.

> it just feels to me that there's too many moving parts involved that might also break in the future. WDYT?

Makes sense. So I looked for a different solution and thought I found something in UPX (which is available in MSYS2). It provides a way to compress .exe and .dll files such that they are automatically decompressed when they're being loaded.

However, it seems that it does not like the libLLVM-16.dll file I threw at it:

$ upx.exe --verbose libLLVM-16.dll
                       Ultimate Packer for eXecutables
                          Copyright (C) 1996 - 2023
UPX 4.0.2       Markus Oberhumer, Laszlo Molnar & John Reiser   Jan 30th 2023

        File size         Ratio      Format      Name
   --------------------   ------   -----------   -----------
upx: libLLVM-16.dll: CantPackException: can't pack new-exe

Packed 0 files.

The explanation seems to be here: upx/upx#650 (support for win64/arm64* is labeled as help-wanted).

@dennisameling
Collaborator

Yes, they confirmed it's not going to work without that support being added (even when using the amd64 host executable).

Just had a quick look at their codebase, but that stuff goes far beyond my knowledge, so I'm afraid I can't be of much help here 😞

@dscho
Member Author

dscho commented Apr 4, 2023

@dennisameling maybe strip libLLVM-16.dll brings it down to size?

dscho added a commit that referenced this issue Apr 25, 2023
GitHub does not allow pushes that include files larger than 100MB. But
LLVM's `libLLVM-16.dll` blasts right through that limit.

Ideally, we would use UPX (https://upx.github.io/) to compress this DLL
file down to size, but UPX is still waiting for Windows/ARM64 support
(see upx/upx#650).

So let's do the next best thing and use `strip.exe` to cut it down to
size.

Sadly, we cannot use the already-available `strip.exe` to do the job
because binutils _also_ is still waiting for Windows/ARM64 support.

To add insult to injury, we cannot even use the already-installed
`llvm-strip.exe` because it is a Windows/ARM64 executable and we're
running the `sync` workflow on hosted runners (which does not include
Windows/ARM64 ones, yet).

So let's bite the apple and install the x86_64 flavor of
`llvm-strip.exe` and use _that_.

This addresses #7.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho
Member Author

dscho commented Apr 25, 2023

> @dennisameling maybe strip libLLVM-16.dll brings it down to size?

It does.

Closing via 4e626dd and https://github.com/git-for-windows/git-sdk-arm64/actions/runs/4794555349.

dscho closed this as completed Apr 25, 2023
dscho self-assigned this Apr 25, 2023
@dscho
Member Author

dscho commented Apr 25, 2023

@dennisameling note: this is but a band aid. The libLLVM-16.dll file currently weighs 104MB before stripping, 95MB after. It looks as if it's only a matter of time until even the stripped version will blast through the limit. So I think I'd still like to pursue the UPX route, even if I definitely lack the time to do it on my own. Would you be game for a pairing session where we scope out the "T shirt size" of how big of a project it would be to teach UPX what we need it to do?

@rimrul
Member

rimrul commented Apr 25, 2023

> note: this is but a band aid. The libLLVM-16.dll file currently weighs 104MB before stripping, 95MB after. It looks as if it's only a matter of time until even the stripped version will blast through the limit.

I might have started work on a potential longer-term solution at git-for-windows/MINGW-packages#75

MSYS2 builds a pretty versatile libLLVM that can be used for a lot of cross compilation targets. We could probably build a smaller libLLVM that's more focussed towards our needs. But I don't have any reliable numbers yet.

@rimrul
Member

rimrul commented Jun 26, 2023

> I don't have any reliable numbers yet.

I have managed to build an x86_64 LLVM based on upstream's current 16.0.5. It doesn't package properly yet, but it did produce numbers.

Upstream as is (downloaded and extracted mingw-w64-x86_64-llvm-libs-16.0.5-1-any.tar.zstd): 107125k
Upstream after strip libLLVM-16.dll: 107103k
My small libLLVM-16.dll (after the build step of makepkg): 94289k
My small libLLVM-16.dll after strip libLLVM-16.dll: 51829k
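
To put these figures in proportion, the relative savings can be computed directly from the four sizes listed above. This is only arithmetic on the reported numbers; the helper name `percent_saved` and the dictionary labels are hypothetical.

```python
# Sizes in k, copied from the measurements above: (before strip, after strip).
SIZES_K = {
    "upstream": (107125, 107103),
    "small build": (94289, 51829),
}


def percent_saved(before, after):
    """Return the size reduction from `before` to `after` as a percentage."""
    return 100.0 * (before - after) / before


for label, (before, after) in SIZES_K.items():
    print(f"{label}: strip saves {percent_saved(before, after):.2f}%")

# Stripped small build compared with the unstripped upstream DLL.
overall = percent_saved(SIZES_K["upstream"][0], SIZES_K["small build"][1])
print(f"small stripped build vs upstream: {overall:.2f}% smaller")
```

Stripping barely changes the upstream binary, which suggests it ships already stripped, while the reduced build loses close to half its size once stripped.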

@rimrul
Member

rimrul commented Jun 30, 2023

And I've got it to package. The binary extracted from the package is 51853k

rimrul added a commit to rimrul/MINGW-packages that referenced this issue Jun 30, 2023
Msys2 builds pretty versatile Clang and LLVM packages with LLVM backends
to cross compile for various target architectures. That comes at the
cost of big binaries, that challenge how we manage the Git for Windows
SDK[1]. But for Git for Windows we don't need all of this. We only need
a Clang that can compile ARM64 binaries on ARM64.

[1] git-for-windows/git-sdk-arm64#7

Signed-off-by: Matthias Aßhauer <mha1993@live.de>
rimrul added a commit to rimrul/MINGW-packages that referenced this issue Jul 1, 2023
rimrul added a commit to rimrul/MINGW-packages that referenced this issue Jul 1, 2023
rimrul added a commit to rimrul/MINGW-packages that referenced this issue Jul 12, 2023
rimrul added a commit to rimrul/MINGW-packages that referenced this issue Oct 11, 2023
rimrul added a commit to rimrul/MINGW-packages that referenced this issue Oct 14, 2023
rimrul added a commit to rimrul/MINGW-packages that referenced this issue Oct 18, 2023
ammyk9 pushed a commit to ammyk9/MINGW-packages that referenced this issue Aug 8, 2024