Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

arm: add support for softfp in arm vfp assembly files #1221

Merged
merged 17 commits into from
Jul 10, 2017

Conversation

ashwinyes
Copy link
Contributor

No description provided.

Ashwin Sekhar T K added 16 commits June 30, 2017 18:20
If ARM abi is not explicitly mentioned on the command line, then set the
arm abi to softfp or hard according to the compiler environment.
This assumes that compiler sets the defines __ARM_PCS and __ARM_PCS_VFP
accordingly.
Added generic 4x4 and 4x2 gemm kernels
Added generic 4x2 trmm kernel
In case of softfp abi, assembly implementations of only those APIs are
used which doesnt have a floating point argument or return value.

In case of hard abi, all assembly implementations are used.
Since softfp code has been added to all required vfp kernels,
the code for auto detection of abi is no longer required.

The option to force softfp ABI on make command line by giving
ARM_SOFTFP_ABI=1 is retained. But there is no need to give this option
anymore.

Also the newly added C versions of 4x4/4x2 gemm/trmm kernels are removed.
These are longer required. Moreover these kernels has bugs.
@ashwinyes
Copy link
Contributor Author

ashwinyes commented Jul 1, 2017

This PR adds support for softfp in all arm vfp assembly kernel files.

I have done only basic testing of this PR since I don't have an ARM machine at my disposal. All my testing were done on QEMU emulator.

Request owners of following issues to help test this PR and see whether the issue is resolved or not (since all these issues looks related to softfp abi issues).

#1088 #1145 #1168 #1206 #1208 #1215 #1145 @misbullah @gangm @odieXin @ctgushiwei @mariasmo @akira-miasato @gangm

@ashwinyes
Copy link
Contributor Author

@tomaklutfu Thanks for reviewing.

Yes. Those were wrong. I agree. Had corrected those in my local workspace. But still the GEMM answer was not correct. I didnt explore it much as these files were no longer required after all the changes were made in the assembly files.

Please see that I have deleted those files in the last commit.

@xianyi
Copy link
Collaborator

xianyi commented Jul 3, 2017

@ashwinyes Thank you for the PR.

Do I need to merge #1216 at first?

@ashwinyes
Copy link
Contributor Author

No #1216 is not required. It can closed.

I would prefer these changes to get tested by someone working on android and get a "Tested" stamp on it. I have only tested it on QEMU emulator.

@xianyi
Copy link
Collaborator

xianyi commented Jul 3, 2017

We will test it this week

@xianyi xianyi mentioned this pull request Jul 3, 2017
2 tasks
@mariasmo
Copy link

mariasmo commented Jul 3, 2017

ashwinyes & xianyi,

i am trying to understand this request.

so i have been compiling openblas using the following command:

make TARGET=ARMV7 HOSTCC=gcc CC=arm-linux-androideabi-gcc NOFORTRAN=1 NUM_THREADS=32 libs

I believe it defaults to "hardfloat" right?

my understanding is that the PR removes the "hardfloat" by default? and can i use the same above mentioned command or do i need to use the following:
ARM_SOFTFP_ABI=1

and to test, do can i get the latest from develop branch? or should i wait until the PR is merged?

thanks

@ashwinyes
Copy link
Contributor Author

ashwinyes commented Jul 3, 2017

Let me explain.

softfp and hard float ABI differs in the way the arguments are passed on to functions and how values are returned. If your project has full C files only, then compiler will take care of everything.

But issue is when you have hand written assembly code. The code in OpenBLAS was always assuming hard float ABI was used. So the code expects values to be certain registers according hard fp ABI which will not be the case when you use softfp ABI.

Now coming to your question. arm-linux-androideabi-gcc (also arm-linux-gnueabi-gcc) by default is softfp ABI. When you use this compiler to compile OpenBLAS code, the issue I mentioned above arises.

The PR doesn't remove anything. It just checks whether you are using a softfp or hard fp compiler and adjusts the arguments/return values accordingly.

So, basically you don't need to give any additional make parameters to OpenBLAS with the PR.

To test, you can use the branch https://github.com/ashwinyes/OpenBLAS/tree/develop_arm_softfp until this PR gets merged.

Hope I cleared your doubt.

If you want more information, you can go through ARM procedure calling standard document http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042f/IHI0042F_aapcs.pdf

@mariasmo
Copy link

mariasmo commented Jul 3, 2017

ashwinyes

thanks for the explanation.

so prior to your changes OpenBLAS was assuming hard fp even though the default for arm-linux-androideabi-gcc is softfp. Am i correct in my understanding?

and now with your changes , it will compile for only soffp and any other libraries i compile down the stream i shuould assume softfp and not hardfp, correct?

thanks

@ashwinyes
Copy link
Contributor Author

Yes. You are correct.

@mariasmo
Copy link

mariasmo commented Jul 3, 2017

thanks

@amiasato-zz
Copy link

I'll be checking it today regarding #1168 and #1215.

@amiasato-zz
Copy link

amiasato-zz commented Jul 4, 2017

Tested with the following compilation command:

make TARGET=ARMV7 HOSTCC=gcc CC=arm-linux-androideabi-gcc NO_SHARED=1 NOFORTRAN=1 USE_THREAD=0 NUM_THREADS=32 ARM_SOFTFP_ABI=1 libs

and issue #1168 seems solved with Android NDK r15b, with speedups comparable to what I achieved with the dirty adaptation from #1215/#1216.

However, some changes regarding GCC/Clang support will be needed (NDK dropped GCC support). If this is a fast enough fix, the Clang issue is due to the following error when trying to cross-compile with NDK15b's clang50 with clang 4.0 as the host compiler:

<instantiation>:34:2: error: invalid instruction
 fnmacs s3 , s1, s4

Otherwise, I should open a new issue for those things, since this PR focuses on the softfp/hard ABI issues.

@ashwinyes
Copy link
Contributor Author

"fnmacs s3 , s1, s4" is a valid ARM instruction. If clang doesn't recognize, then its more of a clang issue than OpenBLAS issue. But I will try to see if I can workaround it.

@ashwinyes
Copy link
Contributor Author

Looks like clang is not recognizing pre-UAL mnemonics. We will have to replace with UAL mnemonics.

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0473c/CIHFAHFG.html

@ashwinyes
Copy link
Contributor Author

@akira-miasato
Addressed the clang issue in #1226

@jxllh123456
Copy link

thanks a lot, sure enough, India's software industry has good superiority.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants