Optimize "(vec & cns) == zero" on arm64 #102705
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
@EgorBot -arm64 -profiler

using BenchmarkDotNet.Attributes;
using System.Buffers;
using System.Text;
using BenchmarkDotNet.Running;

BenchmarkRunner.Run<Perf_Ascii>(args: args);

[DisassemblyDiagnoser(maxDepth: 5)]
public class Perf_Ascii
{
    byte[] _bytes = new byte[128];
    char[] _characters = new char[128];

    [Benchmark]
    public OperationStatus ToUtf16() => Ascii.ToUtf16(_bytes, _characters, out _);
}
Results on Arm64
See BDN_Artifacts.zip for details.
🔥 Profiler
Flame graphs: Main vs PR (interactive!)
Notes: For clean
// If op is "vec & cnsVec" where both u64 components in that cnsVec are the same (for both SIMD12 and
// SIMD16) then we'd better do this AND on top of TYP_LONG NI_AdvSimd_Extract in the end - it produces a
// more optimal codegen.
if (op->OperIsHWIntrinsic(NI_AdvSimd_And) && op->AsHWIntrinsic()->Op(2)->OperIs(GT_CNS_VEC))
This doesn't have to be a constant, right?
Any (x & y) == zero or (x & y) != zero can be optimized down to a tst (on both xarch and arm64).
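The bitwise identity behind the quoted comment can be sketched in scalar C++. This is only an illustration, not the JIT's actual lowering: `Vec128`, `and_is_zero_reference`, `and_is_zero_same_halves`, and `all_ascii` are hypothetical names, and a 128-bit vector is modeled as two 64-bit halves. When the constant has the same value `c` in both halves, `(lo & c) | (hi & c)` equals `(lo | hi) & c`, so the AND can be applied once to a single 64-bit value. The general `(x & y) == zero` form that the reviewer mentions is shown as the reference.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a 128-bit SIMD value: two 64-bit halves.
struct Vec128 {
    std::uint64_t lo;
    std::uint64_t hi;
};

// Reference semantics of "(x & y) == zero" for arbitrary x and y:
// the vector AND is zero only if both 64-bit halves of it are zero.
bool and_is_zero_reference(Vec128 x, Vec128 y) {
    return ((x.lo & y.lo) | (x.hi & y.hi)) == 0;
}

// Special case from the quoted comment: the constant holds the same
// 64-bit value c in both halves. Because (lo & c) | (hi & c) equals
// (lo | hi) & c, one 64-bit AND after combining the halves is enough.
bool and_is_zero_same_halves(Vec128 x, std::uint64_t c) {
    return ((x.lo | x.hi) & c) == 0;
}

// Example use (an assumption about the motivating pattern): 16 bytes are
// all ASCII when no byte has its top bit set, i.e. (vec & 0x80...80) == zero.
bool all_ascii(const unsigned char* bytes16) {
    Vec128 v;
    std::memcpy(&v.lo, bytes16, 8);
    std::memcpy(&v.hi, bytes16 + 8, 8);
    return and_is_zero_same_halves(v, 0x8080808080808080ULL);
}
```

For any `x`, `and_is_zero_same_halves(x, c)` agrees with `and_is_zero_reference(x, Vec128{c, c})`; the non-constant case needs the full two-operand form.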
Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it.
Blocks #105047.
Fixes the #100922 regression, which was introduced by #99982.
Main:
PR: