Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Windows return simd #32278

Closed
wants to merge 13 commits into from

Conversation

davidwrighton
Copy link
Member

No description provided.

if (jitMgr->LoadJIT())
{
CORJIT_FLAGS cpuCompileFlags = jitMgr->GetCPUCompileFlags();
if (cpuCompileFlags.IsSet(CORJIT_FLAGS::CORJIT_FLAG_FEATURE_SIMD))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not feel right to vary calling convention based on these flags. Would it be better for CORJIT_FLAG_FEATURE_SIMD to only control whether hardware intrinsics are enable, but not alter calling convention?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, we don't have support for any of the TYP_SIMD stuff without this flag. So there might be a bit more involved in recognizing the type and getting things to work in that scenario.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this flag be unconditionally set on Windows x64 then?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that is currently the case for the mainline build:

if (CLR_CMAKE_TARGET_ARCH_AMD64 OR CLR_CMAKE_TARGET_ARCH_ARM64 OR (CLR_CMAKE_TARGET_ARCH_I386 AND NOT CLR_CMAKE_HOST_UNIX))
add_definitions(-DFEATURE_SIMD)
add_definitions(-DFEATURE_HW_INTRINSICS)
endif ()

@CarolEidt might know if there are any cases where it is or could be disabled (at least from a quick grep, it is disabled for armelnonjit, linuxnonjit, and protononjit).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this weird quirk for the no simd testing scenario, but as you say, its not really supported or generally tested. Would there be objection to removing the ability to turn off the ability to disable feature simd on Windows X64?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is unlikely we will ever support an X86 platform where we don't have basic SSE support, as such I think disabling SIMD support doesn't make much sense except for esoteric test scenarios that it turns out we don't use much anyways. A much more interesting question is for ARM64 on Linux and ARM32.

Arm32 doesn't support simd today, and thus an abi shift based on FEATURE_SIMD support will be necessary for when we add support for Vector<T>, as the abi of Vector<T> is fundamentally based on using a vector register. However, I think it is reasonable for the Vector abi to vary based on weird switches, as it really is supposed to be dynamic.

For Arm64, we support AdvSimd and floating point in the baseline, and according to the comment in pal\src\misc\jitsupport.cpp, we don't support platforms without that support, but we do actually have the checks that call into the kernel to ask if such support exists. It would be quite reasonable for a Vector<T> to use SVE in the future, and using SVE might imply a different calling convention. That adjusted calling convention would be problematic, as the architectural max size of a SVE register is 2048 bits, and would thus bloat our transition block structure enormously.

As far as "supported calling convention on a platform", well, the SysV Intel abi on Linux/Mac is a bit peculiar around this. It allows for use of the registers as long as all compiled is compiled as machine aware, but the struct types work regardless. For instance, if you compile a library as Avx aware in clang on Linux, and expose an abi that manipulates an __m256, and call that function from another binary that was not compiled as Avx aware but does use the __m256 structure, then the same C++ compiler will generate conflicting abis, and things will break. Exposing interop support, will cause us to need to be able to talk to such issues, and reasons like that are why this work will not directly enable vector marshalling support.

For Windows, matters are a bit simpler. On Windows the C++ abi does not vary by compiler switch. The compiler switches will however, drive if a __m256 type is useable at all.

Arm32/64 abi requires use of SIMD registers for the SIMD types. Vector128<T> should only be passed in SIMD registers, etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no simd testing scenario

We should have a way to disable hardware accelerated Vector and the hardware intrinsics IsSupported methods so that people can test their fallback paths. But the calling convention can stay the same, even when these are disabled.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @jkotas here - we need to have a way to test fallback paths (though I would note that this should really only impact the ability to test those fallback paths on a platform that would otherwise not require them), and it may be that changing the behavior of FEATURE_SIMD in this way will require additional changes, but if so those changes should be made.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have a way to disable hardware accelerated Vector and the hardware intrinsics IsSupported methods so that people can test their fallback paths.

In the case of HWIntrinsics, this is done via the COMPlus_Enable{ISA}=0 or COMPlus_EnableHWIntrinsic=0 flags (see

CONFIG_INTEGER(EnableHWIntrinsic, W("EnableHWIntrinsic"), 1) // Enable Base
CONFIG_INTEGER(EnableSSE, W("EnableSSE"), 1) // Enable SSE
CONFIG_INTEGER(EnableSSE2, W("EnableSSE2"), 1) // Enable SSE2
CONFIG_INTEGER(EnableSSE3, W("EnableSSE3"), 1) // Enable SSE3
CONFIG_INTEGER(EnableSSSE3, W("EnableSSSE3"), 1) // Enable SSSE3
CONFIG_INTEGER(EnableSSE41, W("EnableSSE41"), 1) // Enable SSE41
CONFIG_INTEGER(EnableSSE42, W("EnableSSE42"), 1) // Enable SSE42
CONFIG_INTEGER(EnableAVX, W("EnableAVX"), 1) // Enable AVX
CONFIG_INTEGER(EnableAVX2, W("EnableAVX2"), 1) // Enable AVX2
CONFIG_INTEGER(EnableFMA, W("EnableFMA"), 1) // Enable FMA
CONFIG_INTEGER(EnableAES, W("EnableAES"), 1) // Enable AES
CONFIG_INTEGER(EnableBMI1, W("EnableBMI1"), 1) // Enable BMI1
CONFIG_INTEGER(EnableBMI2, W("EnableBMI2"), 1) // Enable BMI2
CONFIG_INTEGER(EnableLZCNT, W("EnableLZCNT"), 1) // Enable AES
CONFIG_INTEGER(EnablePCLMULQDQ, W("EnablePCLMULQDQ"), 1) // Enable PCLMULQDQ
CONFIG_INTEGER(EnablePOPCNT, W("EnablePOPCNT"), 1) // Enable POPCNT
#else // !defined(TARGET_AMD64) && !defined(TARGET_X86)
// Enable AVX instruction set for wide operations as default
CONFIG_INTEGER(EnableAVX, W("EnableAVX"), 0)
#endif // !defined(TARGET_AMD64) && !defined(TARGET_X86)
// clang-format off
#if defined(TARGET_ARM64)
CONFIG_INTEGER(EnableHWIntrinsic, W("EnableHWIntrinsic"), 1)
CONFIG_INTEGER(EnableArm64Aes, W("EnableArm64Aes"), 1)
CONFIG_INTEGER(EnableArm64Atomics, W("EnableArm64Atomics"), 1)
CONFIG_INTEGER(EnableArm64Crc32, W("EnableArm64Crc32"), 1)
CONFIG_INTEGER(EnableArm64Dcpop, W("EnableArm64Dcpop"), 1)
CONFIG_INTEGER(EnableArm64Dp, W("EnableArm64Dp"), 1)
CONFIG_INTEGER(EnableArm64Fcma, W("EnableArm64Fcma"), 1)
CONFIG_INTEGER(EnableArm64Fp, W("EnableArm64Fp"), 1)
CONFIG_INTEGER(EnableArm64Fp16, W("EnableArm64Fp16"), 1)
CONFIG_INTEGER(EnableArm64Jscvt, W("EnableArm64Jscvt"), 1)
CONFIG_INTEGER(EnableArm64Lrcpc, W("EnableArm64Lrcpc"), 1)
CONFIG_INTEGER(EnableArm64Pmull, W("EnableArm64Pmull"), 1)
CONFIG_INTEGER(EnableArm64Sha1, W("EnableArm64Sha1"), 1)
CONFIG_INTEGER(EnableArm64Sha256, W("EnableArm64Sha256"), 1)
CONFIG_INTEGER(EnableArm64Sha512, W("EnableArm64Sha512"), 1)
CONFIG_INTEGER(EnableArm64Sha3, W("EnableArm64Sha3"), 1)
CONFIG_INTEGER(EnableArm64AdvSimd, W("EnableArm64AdvSimd"), 1)
CONFIG_INTEGER(EnableArm64AdvSimd_v81, W("EnableArm64AdvSimd_v81"), 1)
CONFIG_INTEGER(EnableArm64AdvSimd_Fp16, W("EnableArm64AdvSimd_Fp16"), 1)
CONFIG_INTEGER(EnableArm64Sm3, W("EnableArm64Sm3"), 1)
CONFIG_INTEGER(EnableArm64Sm4, W("EnableArm64Sm4"), 1)
CONFIG_INTEGER(EnableArm64Sve, W("EnableArm64Sve"), 1)
#endif // defined(TARGET_ARM64)
).
I don't believe we have a flag for Vector<T> today.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct; that's what FEATURE_SIMD was generally used for, and I was just indicating that we need a way to disable Vector<T> without impacting the calling convention, and that doing so will probably require additional changes.

@@ -1550,6 +1550,29 @@ void ArgIteratorTemplate<ARGITERATOR_BASE>::ComputeReturnFlags()
}
#endif

#if defined(_TARGET_AMD64_) && !defined(UNIX_AMD64_ABI)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#if defined(_TARGET_AMD64_) && !defined(UNIX_AMD64_ABI)
#if defined(TARGET_AMD64) && !defined(UNIX_AMD64_ABI)

Recent rename

}
else if ((norm_struct_type == TYP_SIMD32) && (vectorRegSizeForReturn == 32))
{
// TYP_SIMD16 should be returned in YMM0
Copy link
Member

@cshung cshung May 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: TYP_SIMD32 (This is one more occurrence of this below)

@ViktorHofer
Copy link
Member

// Auto-generated message

69e114c which was merged 12/7 removed the intermediate src/coreclr/src/ folder. This PR needs to be updated as it touches files in that directory which causes conflicts.

To update your commits you can use this bash script: https://gist.github.com/ViktorHofer/6d24f62abdcddb518b4966ead5ef3783. Feel free to use the comment section of the gist to improve the script for others.

@ghost ghost closed this Jan 25, 2021
@ghost
Copy link

ghost commented Jan 25, 2021

Draft Pull Request was automatically closed for inactivity. It can be manually reopened in the next 30 days if the work resumes.

@ghost ghost locked as resolved and limited conversation to collaborators Feb 24, 2021
This pull request was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants