Use BigMul for 32x32=64 in decimal #93345

lilinus · 2023-10-11T15:10:59Z

No need to have a separate implementation in DecCalc

ghost · 2023-10-11T15:31:42Z

Tagging subscribers to this area: @dotnet/area-system-numerics
See info in area-owners.md if you want to be subscribed.

Issue Details

No need to have a separate implementation in DecCalc

Author:	lilinus
Assignees:	-
Labels:	`area-System.Numerics`, `community-contribution`, `needs-area-label`
Milestone:	-

tannergooding · 2023-10-11T15:37:46Z

src/libraries/System.Private.CoreLib/src/System/Decimal.DecCalc.cs

@@ -184,20 +184,7 @@ private static ulong UInt32x32To64(uint a, uint b)

            private static void UInt64x64To128(ulong a, ulong b, ref DecCalc result)
            {
-                ulong low = UInt32x32To64((uint)a, (uint)b); // lo partial prod


Can you delete UInt32x32To64 as well?

We could, yes, but:

Unfortunately, there is no BigMul method that takes two uint32 values.

It is used in a lot of places where ulongs are narrowed/cast to uints, and the code is pretty convoluted already. The helper shows intent quite nicely.

Did you measure the performance of this change? Last time I checked, a similar change caused a significant performance regression because the codegen for Bmi2.MultiplyNoFlags is suboptimal.

tannergooding · 2023-10-11T15:38:47Z

src/libraries/System.Private.CoreLib/src/System/Decimal.DecCalc.cs

@@ -184,20 +184,7 @@ private static ulong UInt32x32To64(uint a, uint b)

            private static void UInt64x64To128(ulong a, ulong b, ref DecCalc result)


This can be replaced, technically speaking, by Math.BigMul(ulong, ulong, out ulong) as well.

Then it can also be deleted.

Yeah, sure. Note that the overflow check and the insertion into the DecCalc result are still in the body below. Do we prefer to move those to where this is now called, or should I keep this method?

Probably better to keep the extra checks centralized here and just defer the algorithm to Math.BigMul.
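For readers following along, here is a minimal sketch of what "keep the checks here, defer the algorithm" could look like. `Result96` is a hypothetical stand-in for the DecCalc result (its real layout is not shown in this thread), and the 96-bit overflow rule is an assumption based on decimal's 96-bit mantissa. The only real API used is `Math.BigMul(ulong, ulong, out ulong)`, which returns the high 64 bits and writes the low 64 bits to `low`.

```csharp
using System;

// Hypothetical stand-in for the 96-bit result; not the actual DecCalc type.
public struct Result96
{
    public ulong Low64;
    public uint High;
}

public static class BigMulWrapperSketch
{
    // The overflow check and the store stay in one place;
    // only the 64x64=128 multiplication is delegated to Math.BigMul.
    public static void UInt64x64To128(ulong a, ulong b, ref Result96 result)
    {
        ulong high = Math.BigMul(a, b, out ulong low);

        // Assumption: anything above 96 bits counts as overflow.
        if (high > uint.MaxValue)
            throw new OverflowException("Product does not fit in 96 bits.");

        result.Low64 = low;
        result.High = (uint)high;
    }
}
```

Callers keep the same signature and behavior, so the change stays local to the method body.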

lilinus · 2023-10-12T13:33:24Z

@dotnet-policy-service agree

tannergooding · 2023-10-12T16:00:32Z

src/libraries/System.Private.CoreLib/src/System/Decimal.DecCalc.cs

@@ -381,7 +376,7 @@ private static uint Div96By64(ref Buf12 bufNum, ulong den)

                // Compute full remainder, rem = dividend - (quo * divisor).
                //
-                ulong prod = UInt32x32To64(quo, (uint)den); // quo * lo divisor
+                ulong prod = quo * (den & uint.MaxValue); // quo * lo divisor


This probably needs a comment on what it's doing.

But, notably, this may regress 32-bit platforms as it will now do a more expensive 64x64=64 multiplication, rather than doing the cheaper 32x32=64.

In general an internal Math.BigMul(uint a, uint b, out uint low) could be defined that uses Bmi2.MultiplyNoFlags, ArmBase.MultiplyHigh, and otherwise falls back to the naive algorithm of (ulong)a * b

Moved all the stuff into an internal ulong Math.BigMul(uint, uint) (since we already had a long Math.BigMul(int, int)). I tried using the x86 intrinsic on 32-bit. There doesn't seem to be an ArmBase.MultiplyHigh yet.
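A small sketch of the two forms discussed in this thread, under the assumption that the naive fallback is simply `(ulong)a * b`. The class and method names are hypothetical and are not the internal runtime helper. Both forms produce the same value; the difference raised above is that the masked form multiplies two 64-bit operands, while the widening form lets the JIT see a 32x32=64 multiplication.

```csharp
using System;

public static class UInt32BigMulSketch
{
    // Widening multiply: both inputs are known to be 32-bit,
    // so the full product always fits in 64 bits.
    public static ulong BigMul(uint a, uint b) => (ulong)a * b;

    // The form from the diff: a 64x64=64 multiply whose second operand
    // has been masked down to its low 32 bits.
    public static ulong MaskedMultiply(uint quo, ulong den) => quo * (den & uint.MaxValue);
}
```

For any `quo` and `den`, `UInt32BigMulSketch.BigMul(quo, (uint)den)` and `UInt32BigMulSketch.MaskedMultiply(quo, den)` should agree, which is why the choice between them is about codegen rather than correctness.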

tannergooding

LGTM. Thanks!

tannergooding · 2023-11-03T16:15:27Z

Closing and reopening to requeue CI and try to get the tests all passing. They look unrelated.

pentp · 2023-11-06T13:25:19Z

src/libraries/System.Private.CoreLib/src/System/Math.cs

+            if (Bmi2.IsSupported)
+            {
+                uint low;
+                uint high = Bmi2.MultiplyNoFlags(a, b, &low);


Did you measure the performance for this? It's very likely to be significantly worse than the previous codegen that the JIT automatically generated for uint*uint->ulong multiplication on x86.

It is indeed a lot worse, which is strange since it shouldn't be. mulx is the preferred instruction to do this on x86, at least for cases like this.

Guessing the JIT has various work that needs to be done to ensure it does the right thing.

Sorry, no, I didn't know how to run benchmarks. Probably we should just remove the intrinsic block for x86 in BigMul methods if they are slow? Same for 64 x 64 = 128

> Probably we should just remove the intrinsic block for x86 in BigMul methods if they are slow

Yes, preferably with an issue around it.

> Same for 64 x 64 = 128

No. There isn't a primitive type and no corresponding decomp or other work that makes the naive thing more efficient.

There are still more improvements that could be made around Bmi2.X64.MultiplyNoFlags, but nothing that makes it less efficient than the FOIL-based fallback.
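To make the comparison concrete, here is a sketch of what a FOIL-style (first, outer, inner, last) software fallback for 64x64=128 might look like. It is an illustration written for this thread, not necessarily the exact fallback in the runtime, and the class and method names are hypothetical. It needs four 32x32=64 multiplications plus carry handling, which is why a single widening instruction is hard to do worse than.

```csharp
using System;

public static class FoilBigMulSketch
{
    // Returns the high 64 bits of a * b and writes the low 64 bits to 'low'.
    public static ulong BigMul(ulong a, ulong b, out ulong low)
    {
        uint al = (uint)a;
        uint ah = (uint)(a >> 32);
        uint bl = (uint)b;
        uint bh = (uint)(b >> 32);

        // Last: low halves multiplied together.
        ulong mull = (ulong)al * bl;

        // Inner and outer cross terms, each absorbing a carry from below.
        // Neither sum can overflow 64 bits: (2^32-1)^2 + (2^32-1) < 2^64.
        ulong t = (ulong)ah * bl + (mull >> 32);
        ulong tl = (ulong)al * bh + (uint)t;

        low = (tl << 32) | (uint)mull;

        // First: high halves, plus the carries out of the cross terms.
        return (ulong)ah * bh + (t >> 32) + (tl >> 32);
    }
}
```

A quick sanity check is to compare its result with `Math.BigMul(ulong, ulong, out ulong)` for a few inputs, including `ulong.MaxValue * ulong.MaxValue`.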

There's an existing issue for MULX codegen quality: #11782
And a different related issue: #75594

And a 3rd issue that is very closely related to this DecCalc code: #58263

Use BigMul for 32x32=64 in decimal

3fa9604

ghost added the community-contribution label Oct 11, 2023

dotnet-issue-labeler bot added the needs-area-label label Oct 11, 2023

jkotas added the area-System.Numerics label Oct 11, 2023

tannergooding reviewed Oct 11, 2023

lewing removed the needs-area-label label Oct 11, 2023

build-analysis bot mentioned this pull request Oct 11, 2023

ComInterfaceGenerator.Unit.Tests crashes on net9.0-linux-Debug-x64-Mono_release-Ubuntu.2204.Amd64.Open #92070

Open

remove UInt32x32To64

fab4430

tannergooding reviewed Oct 12, 2023

build-analysis bot mentioned this pull request Oct 12, 2023

System.Net.Quic.Tests.MsQuicRemoteExecutorTests.SslKeyLogFile_IsCreatedAndFilled fails #93404

Closed

lilinus added 3 commits October 13, 2023 09:28

Revert "remove UInt32x32To64"

6f04db3

This reverts commit fab4430.

Add internal Math.BigMul(uint, uint)

0572988

Use x86 intrinsic in Math.BugMul(uint, uint) for 32 bit

495a8e5

Use unsigned Math.BigMul

8c3f5ca

This was referenced Oct 17, 2023

Test_EventSource_EtwManifestGeneration* tests failing in CI #48798

Closed

System.Data.OleDb.Tests timeout in net48 x86 Release leg #87783

Open

[mono][tvos] OOM in System.IO.Tests.MemoryStreamTests #92467

Closed

tannergooding approved these changes Nov 3, 2023

tannergooding closed this Nov 3, 2023

tannergooding reopened this Nov 3, 2023

This was referenced Nov 3, 2023

Timeout in System.Net.Quic.Functional.Tests #86019

Closed

CI error: System.Net.Quic.QuicException: The connection timed out from inactivity #91757

Closed

adamsitnik merged commit 4f26ec9 into dotnet:main Nov 6, 2023
173 of 175 checks passed

adamsitnik added this to the 9.0.0 milestone Nov 6, 2023

pentp reviewed Nov 6, 2023

lilinus deleted the decimal-calc-bigmul branch November 13, 2023 13:19

github-actions bot locked and limited conversation to collaborators Dec 14, 2023
