MUSL Linux builds #37

Closed
tazz4843 opened this issue Oct 10, 2022 · 2 comments · Fixed by #576
Labels
build (Build related issues), good first issue (Good for newcomers)

Comments

@tazz4843
Contributor

Hi there! I'm attempting to build whisper.cpp for MUSL Linux for some lightweight systems, and I figured I would note the issues I ran into during the build.

  1. Alpine does not appear to include stdint.h or alloca.h in its standard library when you install only gcc. This results in a slew of errors:
localhost:~/whisper.cpp# make libwhisper.a
cc  -O3 -std=c11   -Wall -Wextra -Wno-unused-parameter -Wno-unused-function -pthread   -c ggml.c
In file included from ggml.h:7,
                 from ggml.c:1:
/usr/lib/gcc/aarch64-alpine-linux-musl/12.2.1/include/stdint.h:9:26: error: no include path in which to search for stdint.h
    9 | # include_next <stdint.h>
      |                          ^
ggml.h:107:5: error: unknown type name 'int64_t'
  107 |     int64_t perf_cycles;
      |     ^~~~~~~
~~snip~~

ggml.c:6:10: fatal error: alloca.h: No such file or directory
    6 | #include <alloca.h>
      |          ^~~~~~~~~~
compilation terminated.
make: *** [Makefile:58: ggml.o] Error 1
localhost:~/whisper.cpp#

The fix is relatively simple: install g++.

apk add g++
  2. clock_gettime and CLOCK_MONOTONIC are seemingly undefined regardless of the compiler used.
localhost:~/whisper.cpp# make libwhisper.a
cc  -O3 -std=c11   -Wall -Wextra -Wno-unused-parameter -Wno-unused-function -pthread   -c ggml.c
ggml.c: In function 'ggml_time_ms':
ggml.c:155:5: warning: implicit declaration of function 'clock_gettime' [-Wimplicit-function-declaration]
  155 |     clock_gettime(CLOCK_MONOTONIC, &ts);
      |     ^~~~~~~~~~~~~
ggml.c:155:19: error: 'CLOCK_MONOTONIC' undeclared (first use in this function)
  155 |     clock_gettime(CLOCK_MONOTONIC, &ts);
      |                   ^~~~~~~~~~~~~~~
ggml.c:155:19: note: each undeclared identifier is reported only once for each function it appears in
ggml.c: In function 'ggml_time_us':
ggml.c:161:19: error: 'CLOCK_MONOTONIC' undeclared (first use in this function)
  161 |     clock_gettime(CLOCK_MONOTONIC, &ts);
      |                   ^~~~~~~~~~~~~~~
make: *** [Makefile:58: ggml.o] Error 1
localhost:~/whisper.cpp# 

Searching online turns up a fix: insert #define _POSIX_C_SOURCE 199309L before the time.h header is included. Placing it on line 10 of ggml.c works for me. It would be nice if this could be fixed upstream in some way. I would open a PR myself, but I don't have sufficient knowledge to implement the required changes.
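For anyone hitting the same error, here is a minimal sketch of that workaround in isolation. It is not the actual ggml.c code: monotonic_ms is a hypothetical helper modeled on the ggml_time_ms call shown in the error log, and it assumes a POSIX libc (musl or glibc) compiled with -std=c11. The key point is that the feature-test macro has to appear before any system header is included.

```c
/* Feature-test macro: must be defined before ANY system header so that
 * <time.h> declares clock_gettime and CLOCK_MONOTONIC under -std=c11. */
#define _POSIX_C_SOURCE 199309L

#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper (not from ggml.c): milliseconds on the monotonic clock. */
static int64_t monotonic_ms(void) {
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0); /* clock_gettime returns 0 on success */
    (void)rc;        /* silence unused warning when NDEBUG is defined */
    return (int64_t)ts.tv_sec * 1000 + (int64_t)ts.tv_nsec / 1000000;
}
```

Passing -D_POSIX_C_SOURCE=199309L on the compiler command line should have the same effect without touching the source, since the macro is then defined before the first include.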

@ggerganov added the "good first issue" and "build" labels on Oct 11, 2022
@earboxer

Adding these flags to the Makefile works:

diff --git a/Makefile b/Makefile
index 8f8cbbe..42bba34 100644
--- a/Makefile
+++ b/Makefile
@@ -19,8 +19,8 @@ endif
 # Compile flags
 #

-CFLAGS   = -O3 -std=c11
-CXXFLAGS = -O3 -std=c++11
+CFLAGS   = -O3 -std=c11   -D_POSIX_SOURCE -D_GNU_SOURCE
+CXXFLAGS = -O3 -std=c++11 -D_POSIX_SOURCE -D_GNU_SOURCE
 LDFLAGS  =

 CFLAGS   += -Wall -Wextra -Wno-unused-parameter -Wno-unused-function

(for alpine, I scp'd the file over since I don't have bash installed).

On the PinePhone, the example runs in around 40 seconds with the tiny model.

whisper.cpp$ ./main -m models/ggml-tiny.en.bin  samples/jfk.wav
whisper_model_load: loading model from 'models/ggml-tiny.en.bin'
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 384
whisper_model_load: n_audio_head  = 6
whisper_model_load: n_audio_layer = 4
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 384
whisper_model_load: n_text_head   = 6
whisper_model_load: n_text_layer  = 4
whisper_model_load: n_mels        = 80
whisper_model_load: f16           = 1
whisper_model_load: type          = 1
whisper_model_load: mem_required  = 390.00 MB
whisper_model_load: adding 1607 extra tokens
whisper_model_load: ggml ctx size =  84.99 MB
whisper_model_load: memory size =    11.41 MB
whisper_model_load: model size  =    73.54 MB

main: processing 'samples/jfk.wav' (176000 samples, 11.0 sec), 4 threads, lang = en, task = transcribe, timestamps = 1 ...


[00:00:00.000 --> 00:00:07.740]   And so my fellow Americans ask not what your country can do for you
[00:00:07.740 --> 00:00:10.740]   ask what you can do for your country


whisper_print_timings:     load time =   722.46 ms
whisper_print_timings:      mel time =   876.13 ms
whisper_print_timings:   sample time =     0.00 ms
whisper_print_timings:   encode time = 39032.61 ms / 9758.15 ms per layer
whisper_print_timings:   decode time =  2668.16 ms / 667.04 ms per layer
whisper_print_timings:    total time = 43411.31 ms

@mikeslattery

I don't see this change in master (anymore?). In the meantime, here is another workaround that doesn't require code changes, tested on Alpine:

apk add g++ make sdl2-dev wget bash

make \
  -E 'CFLAGS += -D_POSIX_SOURCE -D_GNU_SOURCE' \
  -E 'CXXFLAGS += -D_POSIX_SOURCE -D_GNU_SOURCE'
