-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove a few unsafe bits in BCL #111643
base: main
Are you sure you want to change the base?
Remove a few unsafe bits in BCL #111643
Conversation
Tagging subscribers to this area: @dotnet/area-meta |
{ | ||
crc = ArmCrc.Arm64.ComputeCrc32(crc, | ||
Unsafe.ReadUnaligned<ulong>(ref Unsafe.Add(ref ptr, i))); | ||
crc = ArmCrc.Arm64.ComputeCrc32(crc, l); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems to generate almost the same codegen
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you measured any performance impact?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to be around noise <ns EgorBot/runtime-utils#239
Codegen diff:
; Assembly listing for method System.IO.Hashing.Crc32:UpdateScalarArm32(uint,System.ReadOnlySpan`1[ubyte]):uint
G_M20306_IG01:
push rbp
push rbx
push rax
lea rbp, [rsp+0x10]
;; size=8 bbWeight=1 PerfScore 3.50
G_M20306_IG02:
cmp edx, 4
jl SHORT G_M20306_IG04
;; size=5 bbWeight=1 PerfScore 1.25
G_M20306_IG03:
- mov rbx, rsi
mov eax, edx
and eax, -4
- jg SHORT G_M20306_IG08
cmp eax, edx
- ja SHORT G_M20306_IG07
+ ja SHORT G_M20306_IG08
+ mov ecx, eax
+ shr ecx, 2
+ mov rbx, rsi
+ test ecx, ecx
+ jg SHORT G_M20306_IG07
mov ecx, eax
add rsi, rcx
sub edx, eax
- ;; size=21 bbWeight=0.50 PerfScore 1.88
+ ;; size=28 bbWeight=0.50 PerfScore 2.38
G_M20306_IG04:
- test edx, edx
+ mov eax, edx
+ test eax, eax
jg SHORT G_M20306_IG06
mov eax, edi
- ;; size=6 bbWeight=1 PerfScore 1.50
+ ;; size=8 bbWeight=1 PerfScore 1.75
G_M20306_IG05:
add rsp, 8
pop rbx
pop rbp
ret
;; size=7 bbWeight=1 PerfScore 2.25
G_M20306_IG06:
movzx rsi, byte ptr [rsi]
mov rax, 0xD1FFAB1E ; code for System.Runtime.Intrinsics.Arm.Crc32:ComputeCrc32(uint,ubyte):uint
call [rax]System.Runtime.Intrinsics.Arm.Crc32:ComputeCrc32(uint,ubyte):uint
int3
;; size=17 bbWeight=0.50 PerfScore 2.75
G_M20306_IG07:
- mov rax, 0xD1FFAB1E ; code for System.ThrowHelper:ThrowArgumentOutOfRangeException()
- call [rax]System.ThrowHelper:ThrowArgumentOutOfRangeException()
- int3
- ;; size=13 bbWeight=0.50 PerfScore 1.75
-G_M20306_IG08:
mov esi, dword ptr [rbx]
mov rax, 0xD1FFAB1E ; code for System.Runtime.Intrinsics.Arm.Crc32:ComputeCrc32(uint,uint):uint
call [rax]System.Runtime.Intrinsics.Arm.Crc32:ComputeCrc32(uint,uint):uint
int3
;; size=15 bbWeight=0.50 PerfScore 2.75
+G_M20306_IG08:
+ mov rax, 0xD1FFAB1E ; code for System.ThrowHelper:ThrowArgumentOutOfRangeException()
+ call [rax]System.ThrowHelper:ThrowArgumentOutOfRangeException()
+ int3
+ ;; size=13 bbWeight=0.50 PerfScore 1.75
-; Total bytes of code 92, prolog size 8, PerfScore 17.62, instruction count 33, allocated bytes for code 92 (MethodHash=8ac9b0ad) for method System.IO.Hashing.Crc32:UpdateScalarArm32(uint,System.ReadOnlySpan`1[ubyte]):uint (FullOpts)
+; Total bytes of code 101, prolog size 8, PerfScore 18.38, instruction count 37, allocated bytes for code 101 (MethodHash=8ac9b0ad) for method System.IO.Hashing.Crc32:UpdateScalarArm32(uint,System.ReadOnlySpan`1[ubyte]):uint (FullOpts)
; ============================================================
Just a few suboptimal instructions before the main loop, something to improve in JIT I guess
src/libraries/System.Net.Primitives/src/System/Net/NetworkCredential.cs
Outdated
Show resolved
Hide resolved
PTAL @stephentoub |
Span<byte> destHi = destination.Slice(0, 8); | ||
Span<byte> destLo = destination.Slice(8, 8); | ||
BinaryPrimitives.WriteUInt64BigEndian(destLo, hash.Low64); | ||
BinaryPrimitives.WriteUInt64BigEndian(destHi, hash.High64); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, we've had issues in the past with XxHash with inlining not inlining things we expected it to inline because it was hitting inlining budgets. It'd be good to confirm this has no effect on the generated code / throughput.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Judging by the diffs it indeed creates problems for inliner 😞
@EgorBot -arm using System.IO.Hashing;
using BenchmarkDotNet.Attributes;
public class Benchmarks
{
[Benchmark]
[ArgumentsSource(nameof(TestData))]
public uint Crc32HashToUInt32(byte[] data) => Crc32.HashToUInt32(data);
public IEnumerable<byte[]> TestData()
{
yield return "1"u8.ToArray();
yield return "44gg"u8.ToArray();
yield return "egfger"u8.ToArray();
yield return "egfger45"u8.ToArray();
yield return "243е3е34пцу4п43п43"u8.ToArray();
yield return "g[perwkg[perjk[pgkr[ekgkerpgkerpgkperkgp[erk[gkre[kg[rekg[erkgpe[rgk"u8.ToArray();
}
} |
A small contribution to #94941