-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARM64-SVE: gathervector #103159
ARM64-SVE: gathervector #103159
Conversation
Note regarding the
|
1 similar comment
Note regarding the
|
Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics |
This is working. However, 3 tests are commented out. These are the ones with public static unsafe Vector<float> GatherVector(Vector<float> mask, Vector<uint> addresses); I don't think this works for C#. Each address is a 32bit value. Sve is only for Arm64. @tannergooding : thoughts? |
@dotnet/arm64-contrib @kunalspathak |
Looks like two errors:
|
Scratch error 2 - those were due to running on a 256bit machine (and therefore issues around storing to the stack). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good overall. added some comments.
|
||
if (auxSize == EA_8BYTE) | ||
{ | ||
opt = varTypeIsUnsigned(auxType) ? INS_OPTS_SCALABLE_D_UXTW : INS_OPTS_SCALABLE_D_SXTW; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the summary docs, we are emitting
LD1D Zresult.D, Pg/Z, [Xbase, Zindices.D, LSL #3]
,
Correct
which does not take
xs
field, so we do not have to embedUXTW
orSXTW
bit, right?
Removing this, I get an asset (in emitarm64sve.cpp:6441
). Looks like emit is expecting this field. I'll have to debug it a little more. There are a lot of different loads and I don't want to break any of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so you are saying that although we pass the *XTW
flag, we are not using it in for this instruction, although the code path flows through the emitarm64sve.cpp
that you mentioned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like it's using the flag to make a decision to decide on the correct encoding. Then doesn't use it later. Probably just needs a fix in emit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably just needs a fix in emit.
Did you fix it in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressing modes now fixed. There are three variants for "address with vector of offsets":
Vector<64bit> GatherVector(Vector<64bit> mask, 64bit* address, Vector<64bit> indices)
-> LD1D Zresult.D, Pg/Z, [Xbase, Zindices.D, LSL #3]
Vector<32bit> GatherVector(Vector<32bit> mask, 32bit* address, Vector<signed32bit> indices)
-> LD1W Zresult.S, Pg/Z, [Xbase, Zindices.S, SXTW #2]
Vector<32bit> GatherVector(Vector<32bit> mask, 32bit* address, Vector<unsigned32bit> indices)
-> LD1W Zresult.S, Pg/Z, [Xbase, Zindices.S, UXTW #2]
Codegen now picks the correct ones
src/tests/Common/GenerateHWIntrinsicTests/GenerateHWIntrinsicTests_Arm.cs
Show resolved
Hide resolved
src/tests/JIT/HardwareIntrinsics/Arm/Shared/SveGatherVectorIndices.template
Show resolved
Hide resolved
src/tests/JIT/HardwareIntrinsics/Arm/Shared/SveGatherVectorIndices.template
Show resolved
Hide resolved
If there is an encoding that allows this parameter to be Having trouble following the SVE docs here since there are like 8+ different instructions called But it looks like
In all cases, it looks like |
This is for:
Which maps to Gather load unsigned words to vector (immediate index) with an immediate value of 0. Unless we can the address of a C# object to fit into a 32bit address I can't see this working. |
I don't see the issue. This doesn't have anything to do with objects, but rather truncated pointers. That is, While it isn't usable for an arbitrary |
Ok, that's fair, I wasn't sure it was possible to restrict memory in that way in C#. Do you have any pointers on how I might construct a blob of memory the first 4GB in C# for the test case? |
You'd need to use an API like It should be roughly equivalent to how you'd test the same functionality in C/C++ |
I don't want to spend too much time on writing test for this scenario, given that we have around 40% APIs to complete in .NET 9. For now, I will just comment out exposing these APIs and come back to it once we have some cycles. |
Agreed - gather vector is likely to be infrequently used. Use of variants with a 32bit address will be much rarer. I've removed them from the external API files, but left them commented out in internal files. That should make it easier to revive later. |
Stress results. |
...braries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Arm/Sve.PlatformNotSupported.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, after we put back the 32-bit address APIs with a comment pointing to #103297
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks!
/ba-g failures are because of #103354 |
No description provided.