Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[compiler-rt][AArch64] Rewrite SME routines to all use __aarch64_cpu_features. #119414

Merged

Conversation

sdesmalen-arm
Copy link
Collaborator

When #92921 added the __arm_get_current_vg functionality, it used the FMV feature bits mechanism rather than the mechanism that was previously added for SME which called getauxval on Linux platforms or __aarch64_sme_accessible required for baremetal libraries. It is better to always use __aarch64_cpu_features.

For baremetal we still need to rely on __arm_sme_accessible to initialise the struct.

Copy link

github-actions bot commented Dec 10, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@sdesmalen-arm sdesmalen-arm force-pushed the compiler-rt-unify-sme-feature-detection branch from 8db75c0 to ff0bbab Compare December 10, 2024 16:59
When llvm#92921 added the `__arm_get_current_vg` functionality, it used
the FMV feature bits mechanism rather than the existing mechanism that was
previously added for SME that called `getauxval` (on Linux platforms)
or `__aarch64_sme_accessible` (required for baremetal libraries).

It seems simpler to always use the FMV feature bits mechanism, but
for baremetal targets we still need to rely on `__arm_sme_accessible`.
@sdesmalen-arm sdesmalen-arm force-pushed the compiler-rt-unify-sme-feature-detection branch from ff0bbab to 450ed8a Compare December 10, 2024 17:07
@DanielKristofKiss
Copy link
Member

Maybe the compiler-rt/lib/builtins/cpu_model/aarch64/fmv/baremetal_sme.inc could drop the SME from the filename and the macro name as it could do more later.

LGTM otherwise.

@jroelofs jroelofs requested a review from aemerson December 10, 2024 18:22
@jroelofs
Copy link
Contributor

I'm not sure we want this on Darwin and maybe Fuchsia platforms, since it would require calling __init_cpu_features_resolver to initialize __aarch64_cpu_features. On some platforms that happens automatically via an __attribute__((constructor)) function, but Darwin and Fuchsia don't, and instead rely on an FMV resolver triggering it.

cc @aemerson

@jroelofs jroelofs requested a review from petrhosek December 10, 2024 18:29
@aemerson
Copy link
Contributor

I'm not sure we want this on Darwin and maybe Fuchsia platforms, since it would require calling __init_cpu_features_resolver to initialize __aarch64_cpu_features. On some platforms that happens automatically via an __attribute__((constructor)) function, but Darwin and Fuchsia don't, and instead rely on an FMV resolver triggering it.

cc @aemerson

There's no immediate concern for Darwin because we have our own parallel implementations of the SME runtime routines.

@sdesmalen-arm
Copy link
Collaborator Author

On some platforms that happens automatically via an __attribute__((constructor)) function, but Darwin and Fuchsia don't, and instead rely on an FMV resolver triggering it.

Is there a technical reason for that or is this just an implementation choice?

@sdesmalen-arm sdesmalen-arm merged commit cb4f4a8 into llvm:main Dec 11, 2024
7 checks passed
@jroelofs
Copy link
Contributor

On some platforms that happens automatically via an __attribute__((constructor)) function, but Darwin and Fuchsia don't, and instead rely on an FMV resolver triggering it.

Is there a technical reason for that or is this just an implementation choice?

All the existing uses of it on Darwin platforms are lazy, and thus you don't pay for what you don't use. And in general, we try really hard to avoid global ctors/dtors as they tend to dirty pages, increase launch times, etc.

@jroelofs
Copy link
Contributor

There's no immediate concern for Darwin because we have our own parallel implementations of the SME runtime routines.

Well, we have a downstream one. I think we should avoid making the upstream one incorrect and/or slower than it needs to be.

sdesmalen-arm added a commit to sdesmalen-arm/llvm-project that referenced this pull request Dec 12, 2024
…nitialised.

According to the conversation
[here](llvm#119414 (comment)),
some platforms don't enable `__arm_cpu_features` with a global constructor,
but rather do so lazily when called from the FMV resolver.

PR llvm#119414 removed the CMake guard to check to see if the targetted
platform is baremetal or supports sys/auxv. Without this check, the
routines rely on `__arm_cpu_features` being initialised when they
may not be, depending on the platform.

This PR simply avoids building the SME routines for those platforms for now.
sdesmalen-arm added a commit that referenced this pull request Dec 12, 2024
…nitialised. (#119703)

According to the conversation

[here](#119414 (comment)),
some platforms don't enable `__arm_cpu_features` with a global
constructor, but rather do so lazily when called from the FMV resolver.

PR #119414 removed the CMake guard to check to see if the targetted
platform is baremetal or supports sys/auxv. Without this check, the
routines rely on `__arm_cpu_features` being initialised when they may
not be, depending on the platform.

This PR simply avoids building the SME routines for those platforms for
now.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants