Steps to produce working musl binaries? #32613

Closed
jwh opened this issue Nov 26, 2018 · 14 comments
Labels
A-build-system C-investigation Further steps needed to qualify. C-label will change. O-community Originated from the community

Comments

@jwh

jwh commented Nov 26, 2018

Hi,

I'm trying to produce a working binary on Alpine Linux (edge). However, regardless of the build flags, the following occurs:

Program received signal SIGSEGV, Segmentation fault.
__pthread_mutex_lock (m=0x10b79) at src/thread/pthread_mutex_lock.c:7
7 src/thread/pthread_mutex_lock.c: No such file or directory.
(gdb) bt
#0 __pthread_mutex_lock (m=0x10b79) at src/thread/pthread_mutex_lock.c:7
#1 0x00005555579b9f98 in je_malloc_mutex_lock ()
#2 0x00005555579e615d in arena_malloc_small ()
#3 0x00005555579e6fec in je_arena_malloc_hard ()
#4 0x00005555578785db in malloc ()
#5 0x00007ffff7fbd827 in pthread_atfork (prepare=0x5555579b8253 <je_jemalloc_prefork>, parent=0x5555579b88f8 <je_jemalloc_postfork_parent>, child=0x5555579b8dff <je_jemalloc_postfork_child>) at src/thread/pthread_atfork.c:35
#6 0x000055555785c6af in malloc_init_hard_recursible ()
#7 0x000055555785c86e in malloc_init_hard ()
#8 0x000055555789dd68 in calloc ()
#9 0x00007ffff7fc6ab4 in __dls3 (sp=0x7fffffffeb80) at ldso/dynlink.c:1681
#10 0x00007ffff7fc6287 in __dls2 (base=<optimized out>, sp=0x7fffffffeb80) at ldso/dynlink.c:1459
#11 0x00007ffff7fc4051 in _dlstart () from /lib/ld-musl-x86_64.so.1
#12 0x0000000000000001 in ?? ()
#13 0x00007fffffffeda3 in ?? ()
#14 0x0000000000000000 in ?? ()

Are the steps to produce the binary at https://binaries.cockroachdb.com/cockroach-v2.1.1.linux-musl-amd64.tgz available anywhere?

Thanks

@benesch
Contributor

benesch commented Nov 26, 2018

Those binaries are produced with build/builder.sh mkrelease linux-musl. You'll need Docker installed to run that command, plus a few spare GB of disk space. It's a large container.

IIRC jemalloc profiling, which we enable by default, is broken with musl. When you use mkrelease we can detect that you're building against musl and disable profiling. See here:

cd $(JEMALLOC_DIR) && $(JEMALLOC_SRC_DIR)/configure $(xconfigure-flags) $(if $(findstring musl,$(TARGET_TRIPLE)),,--enable-prof)

If you don't want to build through mkrelease, I suspect you could remove the --enable-prof flag from the Makefile manually and all would be well.

@benesch benesch self-assigned this Nov 26, 2018
@jwh
Author

jwh commented Nov 26, 2018

Hmm, I saw the issue mentioned in the comments, but that doesn't seem to be the problem: a fix was committed, and that line won't add --enable-prof if 'musl' is in the triple. Does it need explicitly disabling?

@benesch
Contributor

benesch commented Nov 26, 2018

I don't think musl is typically in the target triple. That's sort of an oddity of our builder container. What does cc -dumpmachine say in your environment?
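
For readers following along, here is a minimal shell sketch of the check being discussed. It mirrors the Makefile's `$(findstring musl,$(TARGET_TRIPLE))` substring test. The function names `is_musl_triple` and `jemalloc_prof_flag` are hypothetical, and the assumption that the triple comes from `cc -dumpmachine` is inferred from this thread rather than confirmed against the Makefile.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of the CockroachDB build.

# is_musl_triple TRIPLE: succeeds (exit 0) if TRIPLE contains "musl",
# like Make's $(findstring musl,TRIPLE) returning a non-empty string.
is_musl_triple() {
    case "$1" in
        *musl*) return 0 ;;
        *)      return 1 ;;
    esac
}

# jemalloc_prof_flag TRIPLE: prints the extra configure flag the quoted
# Makefile line would add for TRIPLE (empty for musl targets).
jemalloc_prof_flag() {
    if is_musl_triple "$1"; then
        echo ""
    else
        echo "--enable-prof"
    fi
}

# Inspect the local compiler, if one is installed.
if command -v cc >/dev/null 2>&1; then
    triple=$(cc -dumpmachine)
    echo "triple: $triple, extra jemalloc flag: '$(jemalloc_prof_flag "$triple")'"
fi
```

On an Alpine toolchain reporting `x86_64-alpine-linux-musl`, the helper prints an empty flag, which corresponds to `--enable-prof` being omitted.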

@jwh
Author

jwh commented Nov 26, 2018

Well, in this particular case findstring is false; cc -dumpmachine returns 'x86_64-alpine-linux-musl', and the build seems to get it right when just firing 'make build'. CFLAGS also need to include -D_BSD_SOURCE; maybe worth adding that to the Makefile?
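
As an illustration of the CFLAGS workaround mentioned here, the sketch below appends `-D_BSD_SOURCE` to `CFLAGS` only if it is not already present. The helper name `add_flag_once` is hypothetical, and it is an assumption that the build picks up an exported `CFLAGS`; the thread only states that CFLAGS need to include the define.

```shell
#!/bin/sh
# Sketch only: add_flag_once is a hypothetical helper.

# add_flag_once FLAGS FLAG: prints FLAGS with FLAG appended,
# unless FLAG already appears as a separate word in FLAGS.
add_flag_once() {
    case " $1 " in
        *" $2 "*) printf '%s\n' "$1" ;;
        *)        printf '%s\n' "${1:+$1 }$2" ;;
    esac
}

CFLAGS=$(add_flag_once "${CFLAGS:-}" "-D_BSD_SOURCE")
export CFLAGS
echo "CFLAGS=$CFLAGS"
# make build   # uncomment to run the build with the amended CFLAGS
```

Checking for the flag first keeps repeated invocations from piling up duplicate defines.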

@benesch
Contributor

benesch commented Nov 26, 2018

> Well, in this particular case findstring is false, cc -dumpmachine returns 'x86_64-alpine-linux-musl' (and the build seems to get it right when just firing 'make build'

Wait, is findstring true or false? I would expect it to be true given that target triple and thus --enable-prof would be correctly omitted. In which case, you're right, it must be something else.

> CFLAGS need to include -D_BSD_SOURCE also, maybe worth adding that to makefile?

Can you post the error you see without that define? If the error is in the jemalloc build, for example, then we'd want to teach jemalloc's build system to set the define.

Does the binary in https://binaries.cockroachdb.com/cockroach-v2.1.1.linux-musl-amd64.tgz work correctly, or does that one segfault too?

@jwh
Author

jwh commented Nov 26, 2018

> > Well, in this particular case findstring is false, cc -dumpmachine returns 'x86_64-alpine-linux-musl' (and the build seems to get it right when just firing 'make build'
>
> Wait, is findstring true or false? I would expect it to be true given that target triple and thus --enable-prof would be correctly omitted. In which case, you're right, it must be something else.

It is correctly omitted (there was a commit that fixed this).

> > CFLAGS need to include -D_BSD_SOURCE also, maybe worth adding that to makefile?
>
> Can you post the error you see without that define? If the error is in the jemalloc build, for example, then we'd want to teach jemalloc's build system to set the define.

There is just one instance, involving strlcat, which makes sense given the library it's in:

go build -o cockroachoss -v  -tags ' make x86_64_alpine_linux_musl' -ldflags '-X github.com/cockroachdb/cockroach/pkg/build.typ=development -extldflags "" -X "github.com/cockroachdb/cockroach/pkg/build.tag=v2.1.1" -X "github.com/cockroachdb/cockroach/pkg/build.rev=4523f9819d2dde9b054fc8faf1568e107f130986" -X "github.com/cockroachdb/cockroach/pkg/build.cgoTargetTriple=x86_64-alpine-linux-musl" -X "github.com/cockroachdb/cockroach/pkg/build.channel=source-archive"  -X "github.com/cockroachdb/cockroach/pkg/build.utcTime=2018/11/26 22:44:16"' ./pkg/cmd/cockroach-oss
github.com/cockroachdb/cockroach/vendor/github.com/knz/go-libedit/unix
# github.com/cockroachdb/cockroach/vendor/github.com/knz/go-libedit/unix
In file included from /usr/include/fortify/stdio.h:25,
                 from vendor/github.com/knz/go-libedit/unix/src/c-libedit/sys.h:82,
                 from vendor/github.com/knz/go-libedit/unix/src/c-libedit/linux-build/config.h:281,
                 from vendor/github.com/knz/go-libedit/unix/src/c-libedit/chared.c:35,
                 from vendor/github.com/knz/go-libedit/unix/src/libedit-chared.c:1,
                 from wrap-chared.c:1:
/usr/include/fortify/string.h:158:1: error: 'strlcat' undeclared here (not in a function); did you mean 'strncat'?
 _FORTIFY_FN(strlcat) size_t strlcat(char *__d, const char *__s, size_t __n)
 ^~~~~~~~~~~
/usr/include/fortify/string.h:159: confused by earlier errors, bailing out

> Does the binary in https://binaries.cockroachdb.com/cockroach-v2.1.1.linux-musl-amd64.tgz work correctly, or does that one segfault too?

Yup, the binary you guys build seems to at least start up correctly. I haven't done any real testing, as I've been trying to convince this to build properly so it can be packaged.

@benesch
Contributor

benesch commented Nov 26, 2018

Oh, I see, it's yet another problem caused by Go's underpowered build system. HAVE_STRLCPY is unconditionally defined (https://github.com/knz/go-libedit/blob/a0dcee88fe03ca2297674a87786dbe0f2991e1a2/unix/src/c-libedit/sys.h#L89) when it really needs to be detected at compile time. Let me think a little bit about how to solve that.
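
One common way to detect a function instead of hard-coding a `HAVE_*` macro is a configure-style probe: try to compile and link a tiny C program that references the function. The sketch below shows that general technique in shell. The helper name `have_c_function` is hypothetical, and this is not how go-libedit or CockroachDB actually handles the problem.

```shell
#!/bin/sh
# Sketch only: have_c_function is a hypothetical helper illustrating an
# autoconf-style probe.

# have_c_function NAME: prints "yes" if a C program that takes the address
# of NAME (declared via <string.h>) compiles and links with the local cc,
# and "no" otherwise, including when cc is not installed.
have_c_function() {
    probe_dir=$(mktemp -d) || { echo no; return; }
    cat > "$probe_dir/probe.c" <<EOF
#include <string.h>
int main(void) {
    /* Referencing the symbol needs both a declaration and a definition. */
    void *p = (void *)$1;
    return p == 0;
}
EOF
    if cc -o "$probe_dir/probe" "$probe_dir/probe.c" >/dev/null 2>&1; then
        echo yes
    else
        echo no
    fi
    rm -rf "$probe_dir"
}

echo "strlcat available: $(have_c_function strlcat)"
```

The result depends on the libc headers and on which feature-test macros are in effect, which is why a fixed definition can be wrong on some systems.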

The only other thing that comes to mind is whether you're actually linking the copy of jemalloc that you're building. Is there any chance you're linking against a system-provided jemalloc instead of the vendored copy? (This can happen if a -ljemalloc slips into your LDFLAGS.)
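
A quick way to check for a stray `-ljemalloc` is to scan the linker-flag environment variables before building. The sketch below is only an illustration: the helper name `has_system_jemalloc` is hypothetical, and the choice to inspect `LDFLAGS` and `CGO_LDFLAGS` is an assumption about where such a flag might come from.

```shell
#!/bin/sh
# Sketch only: has_system_jemalloc is a hypothetical helper.

# has_system_jemalloc FLAGS: succeeds if any whitespace-separated word
# in FLAGS is exactly -ljemalloc.
has_system_jemalloc() {
    for flag in $1; do
        case "$flag" in
            -ljemalloc) return 0 ;;
        esac
    done
    return 1
}

# Report any variable in the current environment that carries the flag.
for var in LDFLAGS CGO_LDFLAGS; do
    eval "value=\${$var:-}"
    if has_system_jemalloc "$value"; then
        echo "$var contains -ljemalloc: $value"
    fi
done
```

No output means neither variable carries the flag, although a build script could still add it elsewhere.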

@jwh
Author

jwh commented Nov 26, 2018

I double-checked: the result is the same with or without a system jemalloc installed. That was my hunch as well.

@jwh
Author

jwh commented Nov 26, 2018

Interestingly, I just got the following in the same environment using the builder.sh method:

  /usr/bin/ld: dynamic STT_GNU_IFUNC symbol `strcmp' with pointer equality in
  `/usr/bin/../lib/gcc/x86_64-linux-gnu/5.4.0/../../../x86_64-linux-gnu/libc.a(strcmp.o)'
  can not be used when making an executable; recompile with -fPIE and relink
  with -pie

This was using 2.1.1, the same version as the native build.

@benesch
Contributor

benesch commented Nov 27, 2018 via email

@jwh
Author

jwh commented Nov 27, 2018

build/builder.sh mkrelease linux-musl. I just copy/pasted the one from your previous comment and didn't actually look up the proper invocation; I was curious more than anything.

@knz knz added C-investigation Further steps needed to qualify. C-label will change. O-community Originated from the community A-build-system labels Nov 27, 2018
@benesch
Contributor

benesch commented Dec 3, 2018

Hmm, that should have worked just fine, but the fact that that error refers to x86_64-linux-gnu indicates that something has certainly gone wrong. If you post the full transcript of your terminal session I might be able to diagnose.

FYI #32623 will incidentally fix the BSD_SOURCE problem.

@knz knz unassigned benesch Jan 13, 2019
@jwh
Author

jwh commented Mar 6, 2019

I've recently had time to revisit this, and I stumbled upon an issue about building on OpenBSD. TAGS=stdmalloc was mentioned there, and that allows it to build and at least run.

I'm not sure if I'll have time to raise a PR with fixes to build with stdmalloc on musl, but it seems like that would be the sensible thing to do.

I haven't checked the existing 2.1.5 release, but I did build and runtime-test with v19.1.0-beta.20190304; the issues below are present in that version.

There are a couple of other issues still outstanding, however:

  • _BSD_SOURCE isn't defined yet
  • Building on 32-bit encounters failures, as the code in rocksdb assumes 64-bit integers

Both seem like trivial fixes so hopefully someone can get those in...

Otherwise thanks for the awesome software :)

@jlinder
Collaborator

jlinder commented Jun 30, 2020

Hi. We've recently removed musl support from the build in commit 4a307bc. Closing this as it's a related issue.

@jlinder jlinder closed this as completed Jun 30, 2020