Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor cleanup of the Vector64/128/256/512 implementations to improve fallbacks #103095

Merged
merged 5 commits into from
Jun 7, 2024

Conversation

tannergooding
Copy link
Member

Continuation of #102301, just cleaning up some of the logic to require less inlining and defer to the same size core API for the fallback implementation where trivially possible.

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jun 5, 2024
Vector256.Divide(left._upper, right._upper)
);
}
public static Vector512<T> Divide<T>(Vector512<T> left, Vector512<T> right) => left / right;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cases like this were where size regressions were coming from. This ensures that instead we hit the intrinsic path when we're accelerated and otherwise we get the same overall codegen as before.

@@ -1988,7 +2004,7 @@ public static Vector512<T> Min<T>(Vector512<T> left, Vector512<T> right)
/// <returns>The product of <paramref name="left" /> and <paramref name="right" />.</returns>
/// <exception cref="NotSupportedException">The type of <paramref name="left" /> and <paramref name="right" /> (<typeparamref name="T" />) is not supported.</exception>
[Intrinsic]
public static Vector512<T> Multiply<T>(T left, Vector512<T> right) => left * right;
public static Vector512<T> Multiply<T>(T left, Vector512<T> right) => right * left;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cases like this aren't strictly needed, but it avoids needing +1 inlining step when we aren't accelerated and defers to the core algorithm instead.

This overload is just convenience so that scalar * vector and vector * scalar both work, since the operation is commutative, so there's no optimization opportunity missing here either.

@tannergooding tannergooding added area-System.Runtime.Intrinsics and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Jun 5, 2024
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
See info in area-owners.md if you want to be subscribed.

@tannergooding
Copy link
Member Author

@MihuBot

@tannergooding
Copy link
Member Author

Size regressions in APIs like ISimdVector.Divide is because we're now emitting the actual div instruction instead of a call to Scalar<T>.Div.

For Sum, it's because the implementation was made to be correct. It was buggy for floating-point before and was non-deterministic between direct and indirect invocations. This in turn causes extra instructions to be emitted in a number of APIs.

There's a few places where the fallbacks involving AndNot are worse because the JIT gives up on recognizing x & ~y in this particular case (it currently gives up for reverse ops or any potential side effect flag).

The meaningful places that will execute when IsHardwareAccelerated == true have improved as expected

@tannergooding
Copy link
Member Author

CC. @EgorBo

Copy link
Member

@EgorBo EgorBo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tannergooding tannergooding merged commit e012fd4 into dotnet:main Jun 7, 2024
164 of 166 checks passed
@tannergooding tannergooding deleted the proto-102275 branch June 7, 2024 17:36
@github-actions github-actions bot locked and limited conversation to collaborators Jul 8, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants