[compiler-rt][AArch64] Rewrite SME routines to all use __aarch64_cpu_features. #119414
✅ With the latest revision this PR passed the C/C++ code formatter.
When llvm#92921 added the `__arm_get_current_vg` functionality, it used the FMV feature bits mechanism rather than the existing mechanism that was previously added for SME that called `getauxval` (on Linux platforms) or `__aarch64_sme_accessible` (required for baremetal libraries). It seems simpler to always use the FMV feature bits mechanism, but for baremetal targets we still need to rely on `__arm_sme_accessible`.
Maybe the LGTM otherwise.
I'm not sure we want this on Darwin and maybe Fuchsia platforms, since it would require calling cc @aemerson
There's no immediate concern for Darwin because we have our own parallel implementations of the SME runtime routines.
Is there a technical reason for that or is this just an implementation choice?
All the existing uses of it on Darwin platforms are lazy, and thus you don't pay for what you don't use. And in general, we try really hard to avoid global ctors/dtors as they tend to dirty pages, increase launch times, etc.
Well, we have a downstream one. I think we should avoid making the upstream one incorrect and/or slower than it needs to be.
…nitialised. (#119703) According to the conversation [here](#119414 (comment)), some platforms don't initialise `__arm_cpu_features` with a global constructor, but rather do so lazily when called from the FMV resolver. PR #119414 removed the CMake guard that checked whether the targeted platform is baremetal or supports sys/auxv. Without this check, the routines rely on `__arm_cpu_features` being initialised when, depending on the platform, it may not be. This PR simply avoids building the SME routines for those platforms for now.
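To make the hazard concrete, here is a minimal sketch in C of a routine that reads a feature word directly while the word is only populated lazily. Every name in it (`lazy_features`, `lazy_probe`, `lazy_init`, `lazy_routine_sees_feature`) and the bit position are hypothetical and chosen for illustration; this is not the compiler-rt code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature word; zero until something initialises it. */
static uint64_t lazy_features;
static bool lazy_initialised;

/* Hypothetical bit position, for illustration only. */
enum { LAZY_FEAT_BIT = 3 };

/* Hypothetical probe standing in for platform feature detection. */
static uint64_t lazy_probe(void) { return UINT64_C(1) << LAZY_FEAT_BIT; }

/* Lazy initialisation: only runs when something (for example a
 * resolver) calls it, never automatically at program start. */
static void lazy_init(void) {
  if (!lazy_initialised) {
    lazy_features = lazy_probe();
    lazy_initialised = true;
  }
}

/* A routine that reads the word directly. It does not trigger
 * initialisation, so it sees zero if lazy_init() has not run yet. */
static bool lazy_routine_sees_feature(void) {
  return (lazy_features >> LAZY_FEAT_BIT) & 1u;
}
```

On a platform that initialises the word from a global constructor, the routine would always see the populated value; with lazy initialisation, the result depends on whether anything called the initialiser first.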
When #92921 added the `__arm_get_current_vg` functionality, it used the FMV feature bits mechanism rather than the mechanism that was previously added for SME, which called `getauxval` on Linux platforms or `__aarch64_sme_accessible` (required for baremetal libraries). It is better to always use `__aarch64_cpu_features`. For baremetal we still need to rely on `__arm_sme_accessible` to initialise the struct.
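A rough sketch of the idea, assuming a single 64-bit feature word: the SME check becomes a bit test on one struct, and a baremetal hook only feeds its answer into that struct at initialisation time. The struct, the bit position `SKETCH_FEAT_SME`, and all `sketch_*` functions are hypothetical stand-ins, not the actual compiler-rt definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a feature-bits struct. */
struct sketch_cpu_features {
  uint64_t features;
};

/* Assumed bit position, for illustration only. */
enum { SKETCH_FEAT_SME = 42 };

static struct sketch_cpu_features sketch_features;

/* Single query path: test the feature bit recorded at initialisation. */
static bool sketch_has_sme(void) {
  return (sketch_features.features >> SKETCH_FEAT_SME) & 1u;
}

/* Baremetal-style initialisation: a platform-provided hook reports
 * whether SME is accessible, and the answer is folded into the struct. */
static void sketch_init_from_hook(bool (*sme_accessible)(void)) {
  if (sme_accessible())
    sketch_features.features |= UINT64_C(1) << SKETCH_FEAT_SME;
}

/* Hypothetical hooks used to exercise the sketch. */
static bool sketch_hook_yes(void) { return true; }
static bool sketch_hook_no(void) { return false; }
```

With this shape, callers only ever consult the struct; how the bit gets set (auxiliary vector, platform hook, or something else) stays an initialisation detail.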