Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support MULX returning ValueTuple #61336

Closed
wants to merge 2 commits into from
Closed

Support MULX returning ValueTuple #61336

wants to merge 2 commits into from

Conversation

pentp
Copy link
Contributor

@pentp pentp commented Nov 9, 2021

Based on #37928, fixes #11782.

@dotnet-issue-labeler
Copy link

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@dotnet-issue-labeler dotnet-issue-labeler bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI new-api-needs-documentation labels Nov 9, 2021
@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Nov 9, 2021
@ghost
Copy link

ghost commented Nov 9, 2021

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

Based on #37928, fixes #11782.

Author: pentp
Assignees: -
Labels:

area-CodeGen-coreclr, new-api-needs-documentation

Milestone: -

@pentp
Copy link
Contributor Author

pentp commented Nov 10, 2021

So I tried to get the stale PR #37928 working again, but there are two issues that I'm having trouble with:

  1. The TYP_STRUCT returned by the intrinsic is always marked with should not be enregistered because: field of a dependently promoted struct and thus is always stored to memory before it can be used.
  2. As an edge case, compiling MultiplyNoFlags2 itself (through reflection) fails because in this case it does NOT store the tuple in memory and tries to return the value in multiple registers (which isn't possible on win-x64).

JitDump for Math:BigMul(long,long,byref):long

@pentp
Copy link
Contributor Author

pentp commented Nov 10, 2021

Math.BigMul diff is a slight improvement (due to the memory stores/loads), but actually it should generate this:

mulx     rax, rdx, rcx
mov      qword ptr [r8], rdx
ret

@pentp
Copy link
Contributor Author

pentp commented Nov 10, 2021

@EgorBo maybe you have some good hints?

@SingleAccretion
Copy link
Contributor

SingleAccretion commented Nov 10, 2021

The TYP_STRUCT returned by the intrinsic is always marked with should not be enregistered because: field of a dependently promoted struct and thus is always stored to memory before it can be used.

Speaking as a front-end person...

There is an invariant in the Jit that promoted struct locals must be either:

  1. Eliminated from the IR completely (replaced with fields). This is known as "independent" promotion.
  2. Marked as do-not-enregister, so that their constituent field locals do not participate in the SSA-based optimizations, and in the backend, writes to fields affect the parent local.
  3. Treated specially (look for CanBeReplacedWithItsField special handling as one example of this).

As for the immediate cause of spilling, it is because this must be added as a new "special" case, along with the multireg calls, here:

#if FEATURE_MULTIREG_RET
// If this is a multi-reg return, we will not do any morphing of this node.
if (m_src->IsMultiRegCall())
{
assert(m_dst->OperGet() == GT_LCL_VAR);
JITDUMP("Not morphing a multireg call return\n");
m_transformationDecision = BlockTransformation::SkipCallSrc;
m_result = m_asg;
}
else if (m_dst->IsMultiRegLclVar() && !m_src->IsMultiRegNode())
{
m_dst->AsLclVar()->ClearMultiReg();
}
#endif // FEATURE_MULTIREG_RET

@@ -3707,6 +3707,7 @@ public abstract partial class Bmi2 : System.Runtime.Intrinsics.X86.X86Base
public static new bool IsSupported { get { throw null; } }
public static uint MultiplyNoFlags(uint left, uint right) { throw null; }
public unsafe static uint MultiplyNoFlags(uint left, uint right, uint* low) { throw null; }
internal unsafe static (uint Lower, uint Upper) MultiplyNoFlags2(uint left, uint right) { throw null; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: changes in /ref are not required for internal APIs.

/// MULX r32a, r32b, reg/m32
/// The above native signature does not directly correspond to the managed signature.
/// </summary>
internal static unsafe (uint Lower, uint Upper) MultiplyNoFlags2(uint left, uint right) => MultiplyNoFlags2(left, right);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is an existing pattern for intrinsic, but would be less confusing if it is something like:
=> throw new NotImplementedException("Intrinsic must be implemented by the runtime.");
instead of recursive call.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I note that the recursiveness is an implementation detail that allows (easy) detection of when the intrinsic must be expanded:

// The recursive non-virtual calls to Jit intrinsics are must-expand by convention.
mustExpand = mustExpand || (gtIsRecursiveCall(method) && !(methodFlags & CORINFO_FLG_VIRTUAL));

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. Thanks for the link. IMO, it would have made the managed APIs more readable if the detection was based off of some descriptive construct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation is actually there to support calls using reflection. It's mostly a theoretical concern, but precludes a need for special handling of reflection for HW intrinsics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, for internal methods throwing might be a valid option, not sure.

@ghost ghost closed this Dec 10, 2021
@ghost
Copy link

ghost commented Dec 10, 2021

Draft Pull Request was automatically closed for inactivity. Please let us know if you'd like to reopen it.

@ghost ghost locked as resolved and limited conversation to collaborators Jan 10, 2022
@pentp pentp deleted the mulx-v2 branch July 1, 2023 18:28
This pull request was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member new-api-needs-documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bmi2.MultiplyNoFlags issues
3 participants