-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
arm: add support for softfp in arm vfp assembly files #1221
Conversation
If ARM abi is not explicitly mentioned on the command line, then set the arm abi to softfp or hard according to the compiler environment. This assumes that compiler sets the defines __ARM_PCS and __ARM_PCS_VFP accordingly.
Added generic 4x4 and 4x2 gemm kernels Added generic 4x2 trmm kernel
In case of softfp abi, assembly implementations of only those APIs are used which doesnt have a floating point argument or return value. In case of hard abi, all assembly implementations are used.
Since softfp code has been added to all required vfp kernels, the code for auto detection of abi is no longer required. The option to force softfp ABI on make command line by giving ARM_SOFTFP_ABI=1 is retained. But there is no need to give this option anymore. Also the newly added C versions of 4x4/4x2 gemm/trmm kernels are removed. These are longer required. Moreover these kernels has bugs.
This PR adds support for softfp in all arm vfp assembly kernel files. I have done only basic testing of this PR since I don't have an ARM machine at my disposal. All my testing were done on QEMU emulator. Request owners of following issues to help test this PR and see whether the issue is resolved or not (since all these issues looks related to softfp abi issues). #1088 #1145 #1168 #1206 #1208 #1215 #1145 @misbullah @gangm @odieXin @ctgushiwei @mariasmo @akira-miasato @gangm |
@tomaklutfu Thanks for reviewing. Yes. Those were wrong. I agree. Had corrected those in my local workspace. But still the GEMM answer was not correct. I didnt explore it much as these files were no longer required after all the changes were made in the assembly files. Please see that I have deleted those files in the last commit. |
@ashwinyes Thank you for the PR. Do I need to merge #1216 at first? |
No #1216 is not required. It can closed. I would prefer these changes to get tested by someone working on android and get a "Tested" stamp on it. I have only tested it on QEMU emulator. |
We will test it this week |
ashwinyes & xianyi, i am trying to understand this request. so i have been compiling openblas using the following command: make TARGET=ARMV7 HOSTCC=gcc CC=arm-linux-androideabi-gcc NOFORTRAN=1 NUM_THREADS=32 libs I believe it defaults to "hardfloat" right? my understanding is that the PR removes the "hardfloat" by default? and can i use the same above mentioned command or do i need to use the following: and to test, do can i get the latest from develop branch? or should i wait until the PR is merged? thanks |
Let me explain. softfp and hard float ABI differs in the way the arguments are passed on to functions and how values are returned. If your project has full C files only, then compiler will take care of everything. But issue is when you have hand written assembly code. The code in OpenBLAS was always assuming hard float ABI was used. So the code expects values to be certain registers according hard fp ABI which will not be the case when you use softfp ABI. Now coming to your question. arm-linux-androideabi-gcc (also arm-linux-gnueabi-gcc) by default is softfp ABI. When you use this compiler to compile OpenBLAS code, the issue I mentioned above arises. The PR doesn't remove anything. It just checks whether you are using a softfp or hard fp compiler and adjusts the arguments/return values accordingly. So, basically you don't need to give any additional make parameters to OpenBLAS with the PR. To test, you can use the branch https://github.com/ashwinyes/OpenBLAS/tree/develop_arm_softfp until this PR gets merged. Hope I cleared your doubt. If you want more information, you can go through ARM procedure calling standard document http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042f/IHI0042F_aapcs.pdf |
ashwinyes thanks for the explanation. so prior to your changes OpenBLAS was assuming hard fp even though the default for arm-linux-androideabi-gcc is softfp. Am i correct in my understanding? and now with your changes , it will compile for only soffp and any other libraries i compile down the stream i shuould assume softfp and not hardfp, correct? thanks |
Yes. You are correct. |
thanks |
Tested with the following compilation command:
and issue #1168 seems solved with Android NDK r15b, with speedups comparable to what I achieved with the dirty adaptation from #1215/#1216. However, some changes regarding GCC/Clang support will be needed (NDK dropped GCC support). If this is a fast enough fix, the Clang issue is due to the following error when trying to cross-compile with NDK15b's clang50 with clang 4.0 as the host compiler:
Otherwise, I should open a new issue for those things, since this PR focuses on the softfp/hard ABI issues. |
"fnmacs s3 , s1, s4" is a valid ARM instruction. If clang doesn't recognize, then its more of a clang issue than OpenBLAS issue. But I will try to see if I can workaround it. |
Looks like clang is not recognizing pre-UAL mnemonics. We will have to replace with UAL mnemonics. http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0473c/CIHFAHFG.html |
@akira-miasato |
thanks a lot, sure enough, India's software industry has good superiority. |
No description provided.