Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NativeAOT] implement Interlocked.ReadMemoryBarrier intrinsic #86842

Merged
merged 10 commits into from
May 31, 2023

Conversation

VSadov
Copy link
Member

@VSadov VSadov commented May 28, 2023

Fixes: #84445

@ghost
Copy link

ghost commented May 28, 2023

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Issue Details

Fixes: #84445

Author: VSadov
Assignees: -
Labels:

area-NativeAOT-coreclr

Milestone: -

@VSadov
Copy link
Member Author

VSadov commented May 28, 2023

/azp run runtime-extra-platforms

@dotnet dotnet deleted a comment from azure-pipelines bot May 28, 2023
@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@VSadov
Copy link
Member Author

VSadov commented May 28, 2023

All the failures seem to be known failures.

@VSadov VSadov requested a review from jkotas May 28, 2023 16:35
@VSadov
Copy link
Member Author

VSadov commented May 28, 2023

There was nothing needed to actually make JIT emit the instruction intrinsically. Since it already does for CoreCLR, it "just worked"

    0xaaaaaab87efc <+188>: cmp    w4, #0x8                  ; =0x8
    0xaaaaaab87f00 <+192>: b.lt   0xaaaaaab87eb4            ; <+116> at TypeCast.cs:1083
    0xaaaaaab87f04 <+196>: b      0xaaaaaab87f1c            ; <+220> at TypeCast.cs:1083
>>    0xaaaaaab87f08 <+200>: dmb    ishld
    0xaaaaaab87f0c <+204>: ldr    w1, [x5]
    0xaaaaaab87f10 <+208>: cmp    w6, w1

@@ -204,6 +204,14 @@ public static void MemoryBarrier()
}
#endregion

#region ReadMemoryBarrier
[Intrinsic]
internal static void ReadMemoryBarrier()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intrinsic methods can be implemented by recursive expansion like this:

[Intrinsic]
internal static void ReadMemoryBarrier();

It saves you from all plumbing for FCalls, implementing it in the unmanaged PAL. Also, it guarantees that the intrinsic expansion matches the non-intrinsic implementation.

It would be better to use this here. I think you can redo both MemoryBarrier and ReadMemoryBarrier, for both naot and coreclr like this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference, this is the logic in the JIT that takes care of it:

// The recursive non-virtual calls to Jit intrinsics are must-expand by convention.
mustExpand = gtIsRecursiveCall(method) && !(methodFlags & CORINFO_FLG_VIRTUAL);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried putting failing implementations and it looks like Debug can reach them. Are we doing intrinsic substitution differently in Debug?

Copy link
Member Author

@VSadov VSadov May 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see - if recursive -> must always do intrinsic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, you should be able to move these to src\libraries\System.Private.CoreLib\src\System\Threading\Interlocked.cs. It would be nice to add a test that creates a delegate pointing to MemoryBarrier, and calls the delegate - to make sure that the recursive intrinsic expansion is kicking in.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can redo both MemoryBarrier and ReadMemoryBarrier, for both naot and coreclr like this.

And nonobject CompareExchange, I suppose.
I remember I did not like something about the asm implementation. It is hard to make it right - make sure it is a full fence, uses 8.1 atomics where available. It’d be better to just leave it to the compiler to emit the best.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And nonobject CompareExchange, I suppose.

We do not have CompareExchange intrinsic expansion implemented for all platforms in RyuJIT. It would have to be under ifdefs that must be in sync with what's implemented in the JIT. I would leave it for separate PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've switched MemoryBarrier and ReadMemoryBarrier to be self-referring intrinsics.
PalMemoryBarrier in NativeAOT is still needed as we call it from native code, but otherwise a lot of code was removed.

@VSadov VSadov requested a review from marek-safar as a code owner May 30, 2023 17:46
Comment on lines 606 to 607
static Action mb;
static delegate* <void> pMb;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static state in tests is not pretty. What is this trying to achieve?

If you are trying to avoid optimizations from kicking in, you can create a noinlineable method that returns the delegate or function pointer.

Copy link
Member Author

@VSadov VSadov May 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can create a noinlineable method that returns the delegate or function pointer.

yes, that would be better.

static Action mb;
static delegate* <void> pMb;

[ConditionalFact(typeof(PlatformDetection), nameof(PlatformDetection.IsThreadingSupported))]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this really need to be under IsThreadingSupported? I would expect MemoryBarrier to be no-op in single threaded modes.

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@VSadov
Copy link
Member Author

VSadov commented May 31, 2023

@vargaz @lambdageek - there is a small mono change here. Looks ok?

@vargaz
Copy link
Contributor

vargaz commented May 31, 2023

Looks ok. In theory, all mono execution engines treat this as an intrinsics.

@VSadov
Copy link
Member Author

VSadov commented May 31, 2023

Thanks!!

@VSadov VSadov merged commit fbf1f72 into dotnet:main May 31, 2023
@VSadov VSadov deleted the rf branch May 31, 2023 14:24
@ghost ghost locked as resolved and limited conversation to collaborators Jun 30, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[NativeAOT] Implement Interlocked.ReadMemoryBarrier()
3 participants