Support for SGEMM_DIRECT Kernel based on SME1 #5084
base: develop
Thanks. Convenient that this special case does not imply debugging TRMM and SYMM, as with the general-case SGEMM kernel in #5011 that I still hope to get to soon. :/
HarmonyOS doesn't seem to support HWCAP either, and AppleClang balks at the "else if" introduced in common_s.h. I'll see if I can unravel and test locally.
Sure, thanks! I am working on restructuring the code to use the HAVE_SME flag as per your suggestion.
common_s.h
Outdated
@@ -213,9 +213,9 @@
#ifdef ARCH_X86_64
#define SGEMM_DIRECT_PERFORMANT gotoblas -> sgemm_direct_performant
#define SGEMM_DIRECT gotoblas -> sgemm_direct
#else
#else if ARCH_ARM64
#elif ARCH_ARM64
Updated
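For readers following the `#elif` suggestion: in the C preprocessor, `#else if COND` is not a conditional branch. It is a plain `#else` followed by extra tokens, so the condition is never evaluated and stricter compilers may warn about or reject it. A minimal sketch of the intended three-way selection, using hypothetical `DEMO_*` macro names rather than the real ones from common_s.h:

```c
/* Hypothetical stand-ins for the real ARCH_* macros; not OpenBLAS code. */
#define DEMO_ARCH_ARM64 1

enum { DEMO_BACKEND_NONE = 0, DEMO_BACKEND_X86_64 = 1, DEMO_BACKEND_ARM64 = 2 };

#if defined(DEMO_ARCH_X86_64)
#define DEMO_DIRECT_BACKEND DEMO_BACKEND_X86_64
#elif defined(DEMO_ARCH_ARM64)   /* evaluated only if the #if above failed */
#define DEMO_DIRECT_BACKEND DEMO_BACKEND_ARM64
#else                            /* neither architecture macro is defined */
#define DEMO_DIRECT_BACKEND DEMO_BACKEND_NONE
#endif

/* Returns which branch the preprocessor selected. */
int demo_direct_backend(void) { return DEMO_DIRECT_BACKEND; }
```

With `#else if`, the second branch would be taken for every non-x86_64 build regardless of whether the ARM64 macro is defined.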
Embarrassingly, it looks as if I nuked the HAVE_SME I had put in cpuid_arm64.c back in November (#4971) with some later change... bad things still sometimes happen when I jump between machines that are not always on the internet :(
I have a question:
From the interface/gemm.c file, the SGEMM_DIRECT kernel gets compiled only when DYNAMIC_ARCH=1. So when the library is compiled with DYNAMIC_ARCH=1 and TARGET is set to ARMV8 instead of ARMV9SME, is the function defined in kernel/arm64/sgemm_direct_arm64_sme1.c also part of the library? I am assuming it won't be, as we have guarded the file with HAVE_SME. But then how can we ensure the library is supported on all Arm targets (Armv8, Armv9, etc.)?
With DYNAMIC_ARCH, TARGET is only used for the common code (interface/gemm.c and all the other interfaces, driver/level3 and so on), and the codes under kernel/arm64 are compiled in a loop with TARGET_CORE set to each of the individual models ARMV8, ARMV8SVE, etc. supported in DYNAMIC_ARCH configuration. So what I tried to express is that your kernel/arm64/sgemm_direct_arm64_sme1.c should look roughly like
so that the compiler finds something to compile (even if it is an empty function) whether it is running for a target with |
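The general pattern described in this comment (a real body when the feature is available, an empty function otherwise) could be sketched as follows. This is only an illustration: the guard macro `DEMO_HAVE_SME`, the function name, and the signature are all hypothetical and are not taken from the PR or from the maintainer's snippet.

```c
#include <stddef.h>

/* Hypothetical guard and prototype; the real kernel's names may differ. */
#if defined(DEMO_HAVE_SME)

void demo_sgemm_direct(size_t m, size_t n, size_t k,
                       const float *a, const float *b, float *c) {
  /* The SME1 implementation would go here. */
  (void)m; (void)n; (void)k; (void)a; (void)b; (void)c;
}
int demo_sgemm_direct_is_stub(void) { return 0; }

#else

/* Empty stub: every per-core compilation pass still finds a symbol to
   build, but the function does nothing on targets without SME. */
void demo_sgemm_direct(size_t m, size_t n, size_t k,
                       const float *a, const float *b, float *c) {
  (void)m; (void)n; (void)k; (void)a; (void)b; (void)c;
}
int demo_sgemm_direct_is_stub(void) { return 1; }

#endif
```

The point of the stub is purely to keep the build uniform across targets; the caller is still responsible for never dispatching to it on hardware without SME.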
Got it. I will update the code and push the updated patch. Thanks!
Umm, TARGET=ARMV9SME tells me you're already building on AymenQ's unmerged #5011? In that case I might refrain from putting back the HAVE_SME and we could just live with the TARGET name(s) like in the Skylakex sgemm_direct kernel.
e875f09 to 3bce73c
Needs
in dynamic_arm64.c
Also, kernel/setparam-ref.c will need a small expansion of the ifdef block referring to sgemm_direct. Currently this has only the "ifdef x86_64" part, which is why DYNAMIC_ARCH fails to build on all arm64 platforms in CI.
3bce73c to d3ef3a4
Hmm, it seems that your "conflicting" changes in cmake/system.cmake have already been in the develop branch since early December (PR 5003), so that part can probably be removed from this PR?
d3ef3a4 to da50c80
Resolved merge conflicts in the interface/gemm.c file. Also rebased cmake/system.cmake onto the develop branch.
Can you please clarify if your PR is meant to depend on #5011?
This PR is independent of #5011. This PR has an SME1 implementation of the SGEMM_DIRECT kernel for targets that support the SME1 feature, whereas #5011 has an SME2 implementation of the SGEMM kernel.
OK, thanks, that's what I originally assumed, but then there must be some missed bits about the ARMV9SME target it introduces.
There are still a few CI pipeline failures related to macOS and Fortran. Can you please let me know how to resolve these failures?
That was why I was asking: it seems the ARMV9SME dependencies for building the sgemm_direct kernels do not yet work in DYNAMIC_ARCH builds (at least).
I got your point now.
I think you should be able to open the logs by clicking on the respective "Details" link, except for the "Jenkins" CI (which runs on IBM Power and Zarch) that requires a concurrent login to the CI host for some reason.
Could you please check if it still builds for you with the NDK when you use this in place of kernel/Makefile.L3?
Thanks for sharing the Makefile.L3 ! With this change, when But when we set
It looks like it is building the SME source files for ARMV8 target when we are setting Also, when we just set |
Sorry, made a silly mistake there - it needs to be |
21fdf9b to 78a9b8f
Thanks, with this build is passing with |
9864754 to bdc8cae
Getting closer.. failures appear to be from jobs that use gcc 11 (doesn't yet support sme in the |
Btw, I'm not sure I understand why your PR needs to disable SME on macOS. Is this not expected to work on the M4?
Added a compiler check in c_check to test whether SME is available. In cmake/arch.cmake, the ARMV9SME target is already added to DYNAMIC_CORE if the compiler is GCC >= 14 or LLVM >= 19.
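As a rough illustration of the version thresholds mentioned above (GCC >= 14 or LLVM >= 19), the same condition can be expressed with the compilers' predefined version macros. This is a different technique from the c_check and CMake logic in the PR, which is not reproduced here; the function name is hypothetical, and excluding AppleClang is an assumption of this sketch, made only because its version numbers do not map onto upstream LLVM releases.

```c
/* Returns 1 if the compiler building this file meets the minimum versions
   quoted in the thread, 0 otherwise. Hypothetical helper, not OpenBLAS code. */
int demo_compiler_may_support_sme(void) {
#if defined(__clang__) && !defined(__apple_build_version__)
  /* Upstream LLVM/Clang: major version 19 or newer. */
  return __clang_major__ >= 19;
#elif defined(__GNUC__) && !defined(__clang__)
  /* GCC proper: major version 14 or newer. */
  return __GNUC__ >= 14;
#else
  /* AppleClang and unknown compilers: assume no support in this sketch. */
  return 0;
#endif
}
```

A version check like this only says the toolchain is new enough; whether the target CPU actually implements SME still has to be determined separately.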
* Added ARMV9SME target
* Added SGEMM_DIRECT kernel based on SME1
f5dcbf7 to c1d80e6
This PR adds support for an sgemm_direct kernel based on the SME1 architecture.
The sgemm_direct kernel handles a special case of the cblas_sgemm() level 3 API where alpha = 1 and beta = 0.
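With alpha = 1 and beta = 0, the general update C = alpha*A*B + beta*C reduces to C = A*B, so the existing contents of C are never read. A plain C reference sketch of that special case is shown below; it assumes row-major, untransposed, densely packed matrices, and the function names are hypothetical. It is not the SME1 kernel from this PR.

```c
#include <stddef.h>

/* Returns 1 when the scalars match the special case handled by a
   "direct" kernel (alpha == 1 and beta == 0), 0 otherwise. */
int demo_can_use_direct(float alpha, float beta) {
  return alpha == 1.0f && beta == 0.0f;
}

/* Reference C = A * B for row-major A (m x k), B (k x n), C (m x n).
   C is overwritten without being read, which is what beta == 0 allows. */
void demo_sgemm_direct_ref(size_t m, size_t n, size_t k,
                           const float *a, const float *b, float *c) {
  for (size_t i = 0; i < m; ++i) {
    for (size_t j = 0; j < n; ++j) {
      float acc = 0.0f;
      for (size_t p = 0; p < k; ++p) {
        acc += a[i * k + p] * b[p * n + j];
      }
      c[i * n + j] = acc;
    }
  }
}
```

A caller would check `demo_can_use_direct(alpha, beta)` first and fall back to the general SGEMM path whenever it returns 0.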