feat(arm64): enable single-binary builds #2490
Conversation
@mudler looks like this needs the CUDA packages to be installed. Another option I was thinking of is to merge the release into the container builds (using qemu for the arm64 builds), and have an optional container target that just copies the binaries into
yep, that was actually what I had in mind as well.. but I thought it was worth giving it a shot in any case, since cross-arch builds should supposedly be faster 🤞 however, as we already build an arm64 container, maybe that's what we'll end up having to do to squeeze CI cycles
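For reference, a minimal sketch of the container-based approach described above, under the assumption of qemu/binfmt emulation on an amd64 runner; the image tag, build context, and binary path are illustrative, not the project's actual build targets:

```sh
# enable arm64 emulation on the amd64 runner (qemu via binfmt_misc)
docker run --privileged --rm tonistiigi/binfmt --install arm64

# build the arm64 image under emulation and load it into the local daemon
docker buildx build --platform linux/arm64 -t localai-arm64:ci --load .

# copy the compiled binary out of the image without running it
docker create --name extract localai-arm64:ci
docker cp extract:/usr/bin/local-ai ./release/local-ai-arm64
docker rm extract
```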
.github/workflows/release.yaml
Outdated
sudo apt-get update
sudo apt-get install -y cuda-cross-aarch64
env:
  CUDA_VERSION: 12-3
nvidia isn't publishing cuda libraries for arm64 for 12.3 for some reason; this needs to be 12-4 or 12-5. I am guessing the runtime libraries need to match that too
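A hedged sketch of that pin, assuming NVIDIA's usual versioned package naming (cuda-cross-aarch64-<major>-<minor>); the exact package name should be verified against the CUDA apt repository:

```sh
# pin the cross toolchain to a release that ships arm64 libraries
# (package name assumed from NVIDIA's versioned naming convention)
CUDA_VERSION=12-4
sudo apt-get update
sudo apt-get install -y "cuda-cross-aarch64-${CUDA_VERSION}"
```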
we can also start thinking about bundling that as part of the binary and then loading the libraries from there at runtime; that way we skip this annoying bit for users who are going to use the static binaries
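As a rough illustration of that idea (the paths here are assumptions for the sketch, not the actual asset layout): the binary would extract the bundled shared libraries next to the backends and point the loader at them before spawning a backend:

```sh
# minimal sketch: prefer the bundled libs over the host's when starting a backend
ASSETS="${HOME}/.local/share/local-ai/backend-assets"   # assumed extraction dir
export LD_LIBRARY_PATH="${ASSETS}/lib:${LD_LIBRARY_PATH}"
exec "${ASSETS}/grpc/llama-cpp" "$@"
```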
some of that logic is implemented in 4412036 - it should allow us to carry over libs. I've just tried the arm64 builds and it "almost" works in my env: llama.cpp doesn't start as there is a libstdc++ incompatibility with the host
something else is off - probably the cross-arch compiler isn't even being picked up properly
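One quick way to check whether the cross toolchain is actually being picked up (a sketch; the make invocation and binary name are assumed, not the exact build commands of this repo):

```sh
# force the cross compilers and verify the architecture of the produced binary
export CC=aarch64-linux-gnu-gcc CXX=aarch64-linux-gnu-g++
make build
file local-ai   # should report: ELF 64-bit LSB executable, ARM aarch64
```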
it's been a long time since I last did this - bear with me a bit. @sozercan any input is helpful, as CI cycles take a while
force-pushed from 1ea6701 to 9124bb3
ok, let's go step by step and first get an arm64 binary cross-compiled; we'll see what to do next with cuda - it seems it's not being picked up correctly
the aarch64 binary now compiles successfully, however in my test it failed:
so we are almost there. I'm trying to get a baseline now and merge what we have; we can iterate from there to add more backends to it
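For context, the libstdc++ mismatch mentioned earlier usually surfaces as an unresolved symbol version, which can be checked along these lines (an illustrative diagnosis, not output from this CI run; paths assumed):

```sh
# look for unresolved libraries, then list the GLIBCXX versions the host provides
ldd backend-assets/grpc/llama-cpp | grep 'not found'
strings /usr/lib/aarch64-linux-gnu/libstdc++.so.6 | grep GLIBCXX | tail -n 3
```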
Next I want to try embedding the libs (the nvidia ones too) into the final binary. It will take a while to test, but if that works then we don't have to worry much about that specific case either. If it doesn't work, I could take it a bit further by pivoting root (from the user side). Anyhow, leaving the libs handling for a follow-up; polishing now.
Referenced by: …7.0 by renovate (#23480)
Description
This PR tries to add a job to build on aarch64 by cross-compiling.
In the current state, it only builds the llama.cpp backend (non-cuda).
I've tried compiling the other backends and cuda but didn't have success yet. I'm leaving the CI configured to install CUDA on the host; however, that needs another iteration to figure out what's still wrong with it.
As a note, I've also added startup logic that allows shipping libraries as part of the backend assets and loading them during start, under the lib directory - for instance, as a sibling of the current directory we use to extract the grpc backends. In my tests the binary needed other libraries as well (which aren't part of this PR) in order to start llama.cpp correctly. This is not addressed in this PR, but I will likely follow up to leverage the lib loading to ship libraries along with the binary as well.
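As a sketch of the layout this describes (directory names assumed for illustration), the lib directory sits next to the extracted grpc backends and is prepended to the library search path at startup, roughly equivalent to:

```sh
ASSETS=/path/to/backend-assets   # wherever the binary extracts its assets
# layout: ${ASSETS}/grpc (backend binaries), ${ASSETS}/lib (shipped libraries)
LD_LIBRARY_PATH="${ASSETS}/lib:${LD_LIBRARY_PATH}" ./local-ai
```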