Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify core::hint::spin_loop #115547

Merged
merged 1 commit into from
Sep 18, 2023
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
41 changes: 17 additions & 24 deletions library/core/src/hint.rs
Original file line number Diff line number Diff line change
Expand Up @@ -175,34 +175,27 @@ pub fn spin_loop() {
unsafe { crate::arch::x86_64::_mm_pause() };
}

// RISC-V platform spin loop hint implementation
#[cfg(target_arch = "riscv32")]
{
// RISC-V RV32 and RV64 share the same PAUSE instruction, but they are located in different
// modules in `core::arch`.
// In this case, here we call `pause` function in each core arch module.
#[cfg(target_arch = "riscv32")]
{
crate::arch::riscv32::pause();
}
#[cfg(target_arch = "riscv64")]
{
crate::arch::riscv64::pause();
}
crate::arch::riscv32::pause();
}

#[cfg(any(target_arch = "aarch64", all(target_arch = "arm", target_feature = "v6")))]
#[cfg(target_arch = "riscv64")]
{
#[cfg(target_arch = "aarch64")]
{
// SAFETY: the `cfg` attr ensures that we only execute this on aarch64 targets.
unsafe { crate::arch::aarch64::__isb(crate::arch::aarch64::SY) };
}
#[cfg(target_arch = "arm")]
{
// SAFETY: the `cfg` attr ensures that we only execute this on arm targets
// with support for the v6 feature.
unsafe { crate::arch::arm::__yield() };
}
crate::arch::riscv64::pause();
}

#[cfg(target_arch = "aarch64")]
{
// SAFETY: the `cfg` attr ensures that we only execute this on aarch64 targets.
unsafe { crate::arch::aarch64::__isb(crate::arch::aarch64::SY) };
}

#[cfg(all(target_arch = "arm", target_feature = "v6"))]
{
// SAFETY: the `cfg` attr ensures that we only execute this on arm targets
// with support for the v6 feature.
Comment on lines +196 to +197
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure those (or any of the above) comments make sense. As far as I can tell, all the functions in core::arch::... are cfg-ed anyway, so we would not be able to call them on a wrong arch. Should I delete them?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functions in core::arch::... can be called just fine if the relevant cfg(target_feature = "...") is not true at compile time. In that case you need to do runtime detection to check if it is supported before calling however.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how this could be true.

__yield is defined in arm_shared. arm_shared is marked as #[cfg(any(target_arch = "arm", target_arch = "aarch64", doc))]. __yield itself is marked as #[cfg(any(target_feature = "v6", target_arch = "aarch64", doc))].

So this results in __yield being available if cfg(all(any(target_arch = "arm", target_arch = "aarch64", doc), any(target_feature = "v6", target_arch = "aarch64", doc))). doc and target_arch = "aarch64" are not true in this context, so we are left with cfg(all(any(target_arch = "arm"), any(target_feature = "v6"))), which can be simplified to cfg(all(target_arch = "arm", target_feature = "v6")).

Unless I'm missing something, this is exactly the same as the cfg on this block?...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you are right in this case. For x86 intrinsics this is definitively not the case though. Those are never cfged on target features.

unsafe { crate::arch::arm::__yield() };
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This answer on SO seems to suggest that __isb(SY) is better than __yield() (for some definition of "better" I guess). It seems like both of those are available for both aarch64 and arm. Should we use __isb(SY) for arm too?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SO answer is mistaken. At least, somewhat mistaken. ISB, the "Instruction Synchronization Barrier," flushes the instruction pipeline on the calling processor. It isn't explicitly for use in something as general-purpose as a spin-loop:

Instruction Synchronization Barrier flushes the pipeline in the PE and is a context synchronization event.

When used with the SY option it can cause a "full system barrier operation," which could only be more expensive than plain ISB.

I worry that __isb(SY) may only appear to improve performance, and only on select implementations.

The instruction that best fits the intent of a spin-loop hint is YIELD:

YIELD is a hint instruction. Software with a multithreading capability can use a YIELD instruction to indicate to the PE that it is performing a task, for example a spin-lock, that could be swapped out to improve overall system performance. The PE can use this hint to suspend and resume multiple software threads if it supports the capability.

Expanding the use of __isb(SY) from aarch64 to both targets seems misguided. If anything, the change from __yield() to __isb(SY) on aarch64 should be reverted.

}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a side-note, I've noticed that hexagon target has a pause instruction tailored for spin loops, but asm! macro doesn't seem to support hexagon yet, so I'm not sure if we can do anything here...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was told that we actually do support hexagon ASM. It's just unstable and not documented (besides the tracking issue: #93335). I could try adding a pause here for it then.


Expand Down