Missing compiler_rt functions #1290
I'm also often getting these warnings, don't know what they mean: |
So far, I haven't tried to make sure we have all the compiler_rt functions; I've just been adding the ones that were missing when I personally tried something and got these errors. The first step to getting a permanent/robust solution to compiler_rt related problems is to go ahead and port all the rest of compiler_rt. I thought we had an issue open for that but I'm unable to find it, so this will be that issue. |
Thanks to @winksaville we now have these functions:
The next step toward solving this issue is to compile a checklist of all the functions from llvm's compiler-rt builtins/ directory and then start porting them one by one. |
This list is derived from the following README file: https://raw.githubusercontent.com/llvm-mirror/compiler-rt/master/lib/builtins/README.txt There are some platform-specific functions that need to be added here, for
Integral bit manipulation
Integral arithmetic
// Integral arithmetic with trapping overflow
Integral arithmetic which returns if overflow
Integral comparison:
Integral / floating point conversion
Floating point raised to integer power
Complex arithmetic
Following are not required since we do not have language-level complex number support.
|
I had the same problem. It turns out you need to compile for gnueabi target in order to link using GCC ld. EABI names: This works for me:
|
@radek-senfeld we can add It's pretty easy to add compiler-rt functions. I can try to get those in today. Were there any other missing ones for you besides these 3? |
Hi Andrew, first I want to thank you for your amazing project! I'm hooked on Zig! Well, my motivation was that I just had to know what the problem is and why it doesn't work. After spending quite a few hours digging around I now understand much better how things work under the hood. Well, there's apparently no issue with EABIs when Zig links the file. Running Which is quite strange because of omitted
After linking with compiler_rt.o everything I've tried has been sorted out. I just wanted to know if there's a chance to have fp formatting in the firmware. Unfortunately the size of this feature is prohibitive (~100kB) at this moment. I've tried just a simple test:
There's one more issue but I'm not quite sure about the cause yet. It just hangs the MCU. I need to inspect it using a debugger. This causes MCU to hang:
Environment:
|
Thank you for the compliment and I'm happy that you like it.
Only if the symbols are called or used. Perhaps this is zig's lazy analysis of top level declarations? If you do not call a function then it does not get analyzed or included in the result. |
Ah that's interesting. Here we have a good use case for perhaps selecting a different implementation of floating point formatting when the Feel free to open a new issue which is dedicated to exploring your exact use case. We can comment back and forth there and perhaps learn some new Zig issues that need to be filed. |
You're probably right. I guess it's caused by me not specifying a linker script. Which means entry point isn't defined thus fn main() is not linked in and no symbols are missing because they are not used. |
Oh, my bad. Now I feel a bit stupid. When compiled using Floating-point formatting enabled:
Here is the summary:
|
this adds the following functions to compiler-rt: * `__mulsf3` * `__muldf3` * `__multf3` See #1290
The above commit adds:
|
Hi, everyone. Can we also get These two prevent the float formatting via
Note that because the imports have |
Thanks to @LemonBoy, |
Thanks, @LemonBoy and Andrew; |
Agreed! More optimized, well-tested compiler-rt implementations are welcome, and definitely within scope of the Zig project. |
- adds __cmpsi2, __cmpdi2, __cmpti2 - adds __ucmpsi2, __ucmpdi2, __ucmpti2 - use 2 if statements with 2 temporaries and a constant - tests: MIN, MIN+1, MIN/2, -1, 0, 1, MAX/2, MAX-1, MAX if applicable See ziglang#1290
- use negXi2.zig to prevent confusion with negXf2.zig - used for size optimized builds and machines without carry instruction - tests: special cases 0, -INT_MIN * use divTrunc range and shift with constant offsets See #1290
- use comptime instead of 2 identical implementations - tests: port missing tests and link to archived llvm-mirror release 80 See #1290
- adds __cmpsi2, __cmpdi2, __cmpti2 - adds __ucmpsi2, __ucmpdi2, __ucmpti2 - use 2 if statements with 2 temporaries and a constant - tests: MIN, MIN+1, MIN/2, -1, 0, 1, MAX/2, MAX-1, MAX if applicable See #1290
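The comparison scheme described in the commit message above (two `if` statements, two temporaries and one constant) can be sketched in C as follows. This is only an illustration: the names `cmp_i64`, `ucmp_u64` and `cmp_selfcheck` are hypothetical, and the actual Zig code in the commit may differ. The return convention (0 for less, 1 for equal, 2 for greater) follows the compiler-rt builtins README.

```c
#include <assert.h>
#include <stdint.h>

/* Three-way signed comparison in the style of __cmpdi2:
 * returns 0 if a < b, 1 if a == b, 2 if a > b. */
int cmp_i64(int64_t a, int64_t b) {
    int gt = 0; /* temporary 1 */
    int lt = 0; /* temporary 2 */
    if (a > b) gt = 1;
    if (a < b) lt = 1;
    return gt - lt + 1; /* the constant maps {-1, 0, 1} to {0, 1, 2} */
}

/* Same scheme for unsigned operands, in the style of __ucmpdi2. */
int ucmp_u64(uint64_t a, uint64_t b) {
    int gt = 0;
    int lt = 0;
    if (a > b) gt = 1;
    if (a < b) lt = 1;
    return gt - lt + 1;
}

/* Edge cases similar to those listed in the commit message. */
void cmp_selfcheck(void) {
    assert(cmp_i64(INT64_MIN, INT64_MAX) == 0);
    assert(cmp_i64(-1, -1) == 1);
    assert(cmp_i64(0, -1) == 2);
    assert(ucmp_u64(0, UINT64_MAX) == 0);
    assert(ucmp_u64(UINT64_MAX, 1) == 2);
}
```

Both `if` statements compile to flag-setting instructions on most targets, so the result usually needs no jumps.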
After manual inspection of the assembly generated from popcount, the CPU simulator shows me a ~5% performance penalty vs optimized assembly on x86_64 architectures, but for legal reasons I don't want to include and link the comparison here (128 bit popcount): link __popcountti2:
mov rax, rsi
shr rax
movabs r8, 6148914691236517205
and rax, r8
sub rsi, rax
movabs rax, 3689348814741910323
mov rcx, rsi
and rcx, rax
shr rsi, 2
and rsi, rax
add rsi, rcx
mov rcx, rsi
shr rcx, 4
add rcx, rsi
movabs r9, 1085102592571150095
and rcx, r9
movabs rdx, 72340172838076673
imul rcx, rdx
shr rcx, 56
mov rsi, rdi
shr rsi
and rsi, r8
sub rdi, rsi
mov rsi, rdi
and rsi, rax
shr rdi, 2
and rdi, rax
add rdi, rsi
mov rax, rdi
shr rax, 4
add rax, rdi
and rax, r9
imul rax, rdx
shr rax, 56
add eax, ecx
ret
Generated on llvm-mca: Instructions: 3700 vs 3800 with |
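For readers who prefer source over assembly, the listing above corresponds to the classic bit-parallel (SWAR) population count applied to each 64-bit half. The four `movabs` constants are 0x5555555555555555, 0x3333333333333333, 0x0F0F0F0F0F0F0F0F and 0x0101010101010101 in hexadecimal. The C sketch below is an assumption about what the source roughly looks like, not the actual Zig implementation; `popcount64`, `popcount128` and `popcount_selfcheck` are hypothetical names, and the 128-bit value is passed as two halves to stay within standard C.

```c
#include <assert.h>
#include <stdint.h>

/* Bit-parallel popcount of one 64-bit word. */
static unsigned popcount64(uint64_t x) {
    /* 2-bit fields now hold the count of set bits in each pair. */
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    /* Sum adjacent pairs into 4-bit fields. */
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    /* Sum adjacent nibbles into bytes. */
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    /* The multiply accumulates all byte counts into the top byte. */
    return (unsigned)((x * 0x0101010101010101ULL) >> 56);
}

/* 128-bit popcount as the sum of both halves (rdi and rsi in the listing). */
unsigned popcount128(uint64_t lo, uint64_t hi) {
    return popcount64(lo) + popcount64(hi);
}

void popcount_selfcheck(void) {
    assert(popcount128(0, 0) == 0);
    assert(popcount128(UINT64_MAX, UINT64_MAX) == 128);
    assert(popcount128(1, 0x8000000000000000ULL) == 2);
}
```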
- abs can only overflow, if a == MIN - comparing the sign change from wrapping addition is branchless - tests: MIN, MIN+1,..MIN+3, -42, -7, 0, 7.. See ziglang#1290
- abs can only overflow, if a == MIN - comparing the sign change from wrapping addition is branchless - tests: MIN, MIN+1,..MIN+4, -42, -7, -1, 0, 1, 7.. See ziglang#1290
- abs can only overflow, if a == MIN - comparing the sign change from wrapping addition is branchless - tests: MIN, MIN+1,..MIN+4, -42, -7, -1, 0, 1, 7.. See #1290
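The idea in these commit messages can be sketched in C for the 32-bit case. This is a hedged illustration only: `abs_checked_i32` and `abs_selfcheck` are hypothetical names, the sketch reports overflow through a flag where the real routines would panic or trap, and the arithmetic is done on unsigned values because signed wraparound is undefined behaviour in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns |a|. Sets *overflow only when a == INT32_MIN, the single input
 * whose absolute value is not representable; a is returned unchanged then. */
int32_t abs_checked_i32(int32_t a, bool *overflow) {
    uint32_t sign = 0u - (uint32_t)(a < 0);      /* all ones if a is negative */
    uint32_t mag = ((uint32_t)a + sign) ^ sign;  /* wrapping add, then flip   */
    /* If the sign bit is still set after negation, the result wrapped. */
    *overflow = (mag >> 31) != 0;
    return *overflow ? a : (int32_t)mag;
}

void abs_selfcheck(void) {
    bool of = false;
    assert(abs_checked_i32(-42, &of) == 42 && !of);
    assert(abs_checked_i32(7, &of) == 7 && !of);
    assert(abs_checked_i32(INT32_MIN + 1, &of) == INT32_MAX && !of);
    (void)abs_checked_i32(INT32_MIN, &of);
    assert(of);
}
```

The magnitude is computed without any comparison against `MIN`; the overflow test is just a check of the result's sign bit.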
- neg can only overflow, if a == MIN - case `-0` is properly handled by hardware, so overflow check by comparing `a == 0` is sufficient - tests: MIN, MIN+1, MIN+4, -42, -7, -1, 0, 1, 7.. See ziglang#1290
- neg can only overflow, if a == MIN - case `-0` is properly handled by hardware, so overflow check by comparing `a == MIN` is sufficient - tests: MIN, MIN+1, MIN+4, -42, -7, -1, 0, 1, 7.. See ziglang#1290
As I understand compiler_rt, it is for embedded devices without hardware capabilities, and thus space optimized. Unless there is a hardware routine. |
For Zig it actually has multiple purposes which can be determined by inspecting Our compiler-rt has the ability to choose different implementations depending on the desired mode. |
- neg can only overflow, if a == MIN - case `-0` is properly handled by hardware, so overflow check by comparing `a == MIN` is sufficient - tests: MIN, MIN+1, MIN+4, -42, -7, -1, 0, 1, 7.. See #1290
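A minimal C sketch of the negation check described above, assuming a flag is returned where the real routine would panic or trap. `neg_checked_i32` and `neg_selfcheck` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns -a. The only input that cannot be negated is INT32_MIN, so a
 * single equality test is enough; a is returned unchanged in that case. */
int32_t neg_checked_i32(int32_t a, bool *overflow) {
    *overflow = (a == INT32_MIN);
    return *overflow ? a : -a;
}

void neg_selfcheck(void) {
    bool of = false;
    assert(neg_checked_i32(0, &of) == 0 && !of);   /* -0 needs no special case */
    assert(neg_checked_i32(-42, &of) == 42 && !of);
    assert(neg_checked_i32(INT32_MIN + 1, &of) == INT32_MAX && !of);
    (void)neg_checked_i32(INT32_MIN, &of);
    assert(of);
}
```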
By the way, in https://andrewkelley.me/post/zig-cc-powerful-drop-in-replacement-gcc-clang.html Andrew says:
Shouldn't the 1.0.0 milestone be referencing this issue then? |
@vladfaust The base stuff is implemented except the overflowing check primitives, which are semi-blocked by testing panics,
__addvsi3
__addvdi3
__addvti3
__subvsi3
__subvdi3
__subvti3
__mulvsi3
__mulvdi3
__mulvti3
see #1356
Perf for mulo is ~17x faster than the LLVM implementation (measured on my Skylake laptop), which might make Zig's compiler_rt well worth using in external code. mulo bench. I did not measure the Zig internal one yet. The perf gain from wrapping addition and subtraction over the simple approach is in the range of 10% for external code, and the Zig internal one without pointers will have ~15% improvements for wrapping addition and subtraction. addo benches. So all in all, it's almost finished: #1290 (comment) compiler_rt version 2.0 would then be to use the hw accelerations for the routines for all tier 1 targets and figure out how to track this in a sane way. |
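To make the wrapping approach for addition and subtraction concrete, here is a small C sketch. It is an assumption about the general technique, not a copy of the benchmarked Zig code: the operation is done with wrapping unsigned arithmetic and overflow is detected from the operand and result signs. The names `add_overflow_i32`, `sub_overflow_i32`, `to_i32` and `addsub_selfcheck` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits as signed without relying on an
 * implementation-defined integer conversion. */
static int32_t to_i32(uint32_t bits) {
    int32_t out;
    memcpy(&out, &bits, sizeof out);
    return out;
}

/* Wrapping add. Overflow iff both operands have the same sign and the
 * sum's sign differs from it. */
int32_t add_overflow_i32(int32_t a, int32_t b, bool *overflow) {
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t sum = ua + ub;
    *overflow = (((ua ^ sum) & (ub ^ sum)) >> 31) != 0;
    return to_i32(sum);
}

/* Wrapping subtract. Overflow iff the operands differ in sign and the
 * difference's sign differs from the sign of a. */
int32_t sub_overflow_i32(int32_t a, int32_t b, bool *overflow) {
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t diff = ua - ub;
    *overflow = (((ua ^ ub) & (ua ^ diff)) >> 31) != 0;
    return to_i32(diff);
}

void addsub_selfcheck(void) {
    bool of = false;
    assert(add_overflow_i32(1, 2, &of) == 3 && !of);
    (void)add_overflow_i32(INT32_MAX, 1, &of);
    assert(of);
    assert(sub_overflow_i32(5, 7, &of) == -2 && !of);
    (void)sub_overflow_i32(INT32_MIN, 1, &of);
    assert(of);
}
```

A trapping variant in the style of `__addvsi3` would call a panic handler wherever this sketch sets the flag.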
Personally I would favor keeping compiler_rt small and readable and accept 5-8% inefficiency (very rough estimate extrapolating from the popcount speed difference) vs hand-rolled assembly.
On the side of panic testing to get this finished, I plan
EDIT1: progress, but needs benchmarking against realistic workloads to prevent regressions: #11701 |
Next step to solving this issue: we need an updated checklist of what is still missing. |
I just tried to compile 0.10 for
Edit: resolved by using gcc 12.0 which I had compiled from source instead of using gcc 10.2 provided by Void Linux. |
Suggestion to close this in favor of #15675. The tracking issues can be searched via the following patterns: |
I was playing around with the latest version of Zig (https://ci.appveyor.com/project/andrewrk/zig-d3l86/build/0.2.0+95f45cfc) on ARM Cortex-M0 and I had a lot of issues with missing compiler_rt functions.
It was missing '__aeabi_memcpy'
It was complaining about missing '__aeabi_uldivmod', which I fixed by commenting out `if (isArmArch()) {` in compiler_rt/index.zig .. later I found out that it also helped to use `--target-arch armv6` instead of `thumb`, or adding `builtin.Arch.thumb` to `isArmArch` (so that should probably be fixed)
After that it complained about `__aeabi_h2f`, `__aeabi_f2h` and `__multi3`
This is the command I used
Using `fmt.bufPrint` is what is triggering these errors for me right now.
Is it possible to get a more permanent/robust solution to compiler_rt related problems? Some kind of automated testing that everything is there somehow?
(Btw, good news is that linking with Zig (rather than GCC or LLD) seems to work just fine for me now)