Updating BigInteger to implement IBinaryInteger and ISignedNumber #68964

Merged: 4 commits, May 11, 2022

tannergooding
Member

No description provided.

@ghost ghost assigned tannergooding May 6, 2022
@dotnet-issue-labeler

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@ghost

ghost commented May 6, 2022

Tagging subscribers to this area: @dotnet/area-system-numerics
See info in area-owners.md if you want to be subscribed.

Issue Details

null

Author: tannergooding
Assignees: tannergooding
Labels:

area-System.Numerics, new-api-needs-documentation

Milestone: -

for (int i = 0; i < value._bits.Length; i++)
{
uint part = value._bits[i];
result += uint.PopCount(part);
Member

Would it be worth reinterpreting _bits as ulongs and doing a PopCount on each of those (plus any additional uint that remains)?

Member Author

It would probably be better to use nuint (so 32-bit perf stays good), but there is a more general rewrite of BigInteger planned at some point and I think it would be better to do that then rather than complicate the current logic more.

  • The general goal of the rewrite is to make BigInteger use nuint[] (rather than uint[]) and make it two's complement, rather than one's complement representation
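
The trade-off in this thread can be sketched outside of BigInteger. What follows is a minimal illustration, not the PR's code: `PopCountPerUInt` mirrors the loop in the diff, while `PopCountPerULong` is a hypothetical helper that reinterprets pairs of limbs as `ulong` via `MemoryMarshal.Cast` and then handles a trailing odd `uint`. It uses `BitOperations.PopCount` instead of `uint.PopCount` so that it also compiles on runtimes older than .NET 7, and it assumes unaligned 64-bit reads are acceptable on the target platform.

```csharp
using System;
using System.Numerics;
using System.Runtime.InteropServices;

// Baseline: one PopCount call per 32-bit limb, as in the diff above.
static int PopCountPerUInt(ReadOnlySpan<uint> bits)
{
    int result = 0;
    foreach (uint part in bits)
    {
        result += BitOperations.PopCount(part);
    }
    return result;
}

// Hypothetical variant: count two limbs at a time, then the leftover limb.
static int PopCountPerULong(ReadOnlySpan<uint> bits)
{
    int pairedLength = bits.Length & ~1; // largest even prefix
    ReadOnlySpan<ulong> wide = MemoryMarshal.Cast<uint, ulong>(bits.Slice(0, pairedLength));

    int result = 0;
    foreach (ulong part in wide)
    {
        result += BitOperations.PopCount(part);
    }
    if (pairedLength != bits.Length)
    {
        result += BitOperations.PopCount(bits[bits.Length - 1]);
    }
    return result;
}

uint[] limbs = { 0xFFFF_FFFF, 0x0000_0001, 0x8000_0000 };
Console.WriteLine($"{PopCountPerUInt(limbs)} {PopCountPerULong(limbs)}");
```

Population count does not depend on byte order, so the reinterpretation gives the same total on little-endian and big-endian hosts.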

int byteCount = (value._bits is null) ? sizeof(int) : (value._bits.Length * 4);

// Normalize the rotate amount to drop full rotations
rotateAmount %= (int)(byteCount * 8L);
Member

Is this guaranteed to handle overflow (negative byteCount) correctly? Some very limited testing says yes, but I wondered if you knew if this was a solid guarantee.

Member Author

I believe so, although it's possible there is some edge case.

More concretely we have APIs that allow getting the underlying bytes into an array or span, and so long term I want to limit _bits.Length to be no more than Array.MaxLength bytes. This ensures you can always fit the underlying data into a span and 2^2_147_483_591 should be more than big enough for anyone.

The shifting and rotate APIs are then somewhat "oddities" in that they currently take int but theoretically we allow up to Array.MaxLength * 8 bits, which is currently up to 17,179,868,728 bits (requiring up to 34 bits).

If we limit the actual value to no more than int.MaxValue bits, so these APIs continue to make sense, then we allow BigInteger up to 268_435_455 bytes or approximately 256 MB. This allows an astronomically large number of approx: 7.15 * 10^80_807_123.

I'm not necessarily "hardset" on either of these limits, but I think they are practical and realistic for .NET. Elsewise we need "chunking" APIs and various APIs may throw or not function as expected in all scenarios.
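
The bit and byte figures quoted in this comment can be reproduced with a few lines of arithmetic. This is only a sanity-check sketch. The variable names are illustrative, and `Array.MaxLength` requires .NET 6 or later.

```csharp
using System;
using System.Numerics;

// Upper bound on bits if the data may occupy up to Array.MaxLength bytes.
long maxBitsFromArray = Array.MaxLength * 8L;

// Number of bits needed to represent that bit count as a shift/rotate amount.
int bitsNeeded = 64 - BitOperations.LeadingZeroCount((ulong)maxBitsFromArray);

// Byte budget if the value is instead capped at int.MaxValue bits.
int maxBytesFromIntBits = int.MaxValue / 8;
double approxMiB = maxBytesFromIntBits / (1024.0 * 1024.0);

Console.WriteLine($"{maxBitsFromArray} bits, needs {bitsNeeded} bits to index");
Console.WriteLine($"{maxBytesFromIntBits} bytes, about {approxMiB:F2} MiB");
```

Because 34 bits do not fit in an `int`, a shift or rotate amount typed as `int` cannot address every bit under the larger limit, which is the "oddity" described above.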

Member Author

Added handling to ensure that _bits.Length <= (Array.MaxLength / sizeof(uint)) and to throw an OverflowException otherwise.

Changed the modulus logic to then be rotateAmount = (int)(rotateAmount % (byteCount * 8L))
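
To illustrate why the order of the cast and the remainder matters, here is a small sketch with two hypothetical helpers. `NormalizeNarrow` follows the shape of the original line (the modulus is narrowed to `int` first), and `NormalizeWide` follows the revised shape (the remainder is computed in 64-bit arithmetic and only the result is narrowed). The names and the standalone form are assumptions for illustration; the real code operates on BigInteger internals.

```csharp
using System;

// Narrow form: byteCount * 8L is truncated to int before the remainder is taken.
// For byteCount >= 268_435_456 the truncated modulus is wrong (it can even be 0).
static int NormalizeNarrow(int rotateAmount, int byteCount)
    => rotateAmount % unchecked((int)(byteCount * 8L));

// Wide form: the remainder is taken against the full 64-bit bit count.
// The result always fits in int because its magnitude never exceeds |rotateAmount|.
static int NormalizeWide(int rotateAmount, int byteCount)
    => (int)(rotateAmount % (byteCount * 8L));

Console.WriteLine(NormalizeNarrow(70, 4)); // both forms agree for small sizes
Console.WriteLine(NormalizeWide(70, 4));
Console.WriteLine(NormalizeWide(5, 536_870_912)); // 2^29 bytes = 2^32 bits
```

With the length limit described above in place, the bit count stays within range either way. The wide form simply does not depend on that guarantee.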

}

/// <inheritdoc cref="IBinaryInteger{TSelf}.LeadingZeroCount(TSelf)" />
public static BigInteger LeadingZeroCount(BigInteger value)
Member

If you change BigInteger to use nint[] instead of int[] in a future PR (per your comment at https://github.com/dotnet/runtime/pull/68964/files#r866933097), you'll need to normalize the result of this method across platforms. Otherwise the results could differ on x86 vs x64.

Member Author

Yes. However, I believe that's fine/acceptable.

Various types, including user defined ones, may be variable sized. As such, LeadingZeroCount is logically a value between 0 and GetByteCount() * 8L (put another way, it is (GetByteCount() * 8L) - GetShortestBitLength()).

This is functionally similar to how nint, nuint, CLong, and CULong work as well and gives a well-defined contract even in the face of this. The value is then still usable within these limits/expectations.
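
The contract described here can be checked against fixed-size primitives with generic math. The sketch below assumes .NET 7 or later (for `IBinaryInteger<TSelf>`), and the helper names are hypothetical. It only exercises non-negative values of built-in types, where the identity is easy to verify; it makes no claim about BigInteger's internal behavior.

```csharp
using System;
using System.Numerics;

// Leading zeros derived from the type's storage size and shortest bit length.
static long LeadingZerosFromLengths<T>(T value) where T : IBinaryInteger<T>
    => (value.GetByteCount() * 8L) - value.GetShortestBitLength();

// Leading zeros as reported by the type itself.
static long LeadingZerosDirect<T>(T value) where T : IBinaryInteger<T>
    => long.CreateChecked(T.LeadingZeroCount(value));

Console.WriteLine(LeadingZerosFromLengths((byte)1));  // storage is 8 bits wide
Console.WriteLine(LeadingZerosFromLengths(1));        // storage is 32 bits wide
Console.WriteLine(LeadingZerosFromLengths((nint)1));  // depends on the platform
```

The `nint` case is the precedent mentioned above: the same value reports a different count on 32-bit and 64-bit processes, yet the result is still well defined relative to `GetByteCount()`.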

{
// When the value is positive, we simply need to copy all bits as little endian

ref byte address = ref MemoryMarshal.GetReference(destination);
Member

No need for GetReference in the common little-endian case. If the length check on line 3308 above succeeds, you know that MemoryMarshal.Cast<uint, byte>(bits).CopyTo(destination) will also succeed. The only time you'd need to bswap is in the big-endian case.

Once your two's-complement implementation comes online, you'll be able to reuse the same memcpy code there.

Member Author

I'm going to intentionally keep this simple for now and handle it as part of the rewrite.
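
For reference, the reviewer's suggestion might look roughly like the following. This is a sketch under stated assumptions, not the code that was merged: `TryWriteLimbsLittleEndian` is a hypothetical free-standing helper, it handles only a non-negative magnitude, and it assumes the limbs are stored least-significant first.

```csharp
using System;
using System.Buffers.Binary;
using System.Runtime.InteropServices;

static bool TryWriteLimbsLittleEndian(ReadOnlySpan<uint> bits, Span<byte> destination, out int bytesWritten)
{
    int byteCount = bits.Length * sizeof(uint);
    if (destination.Length < byteCount)
    {
        bytesWritten = 0;
        return false;
    }

    if (BitConverter.IsLittleEndian)
    {
        // Memory layout already matches the output: a single block copy suffices.
        MemoryMarshal.Cast<uint, byte>(bits).CopyTo(destination);
    }
    else
    {
        // Big-endian host: each limb must be byte-swapped as it is written.
        for (int i = 0; i < bits.Length; i++)
        {
            BinaryPrimitives.WriteUInt32LittleEndian(destination.Slice(i * sizeof(uint)), bits[i]);
        }
    }

    bytesWritten = byteCount;
    return true;
}

byte[] buffer = new byte[8];
bool ok = TryWriteLimbsLittleEndian(new uint[] { 0x04030201, 0x08070605 }, buffer, out int written);
Console.WriteLine($"{ok} {written} {BitConverter.ToString(buffer)}");
```

Both branches produce the same bytes. The branch only decides whether a swap is needed to get there.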

}

/// <inheritdoc cref="INumber{TSelf}.MaxMagnitude(TSelf, TSelf)" />
public static BigInteger MaxMagnitude(BigInteger x, BigInteger y)
Member

Perf: This method could end up allocating as part of the temporary -x and -y calculations, even though the results are discarded. Is this acceptable?

Member Author

BigInteger, in general, allocates all over the place today. I'm not overly concerned about this but I definitely could simplify it a bit more.
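
As a point of reference for the semantics under discussion, a MaxMagnitude-style comparison can be written against the public BigInteger surface as shown below. `MaxMagnitudeSketch` is a hypothetical helper, not the method from the PR. It creates temporary absolute values, and whether those temporaries allocate depends on BigInteger's internal representation, which this sketch does not try to characterize.

```csharp
using System;
using System.Numerics;

// Returns the argument with the larger absolute value.
// When magnitudes tie, the non-negative argument is preferred.
static BigInteger MaxMagnitudeSketch(BigInteger x, BigInteger y)
{
    BigInteger ax = BigInteger.Abs(x); // temporary
    BigInteger ay = BigInteger.Abs(y); // temporary

    if (ax > ay)
    {
        return x;
    }
    if (ax == ay)
    {
        return x.Sign < 0 ? y : x;
    }
    return y;
}

Console.WriteLine(MaxMagnitudeSketch(3, -5));
Console.WriteLine(MaxMagnitudeSketch(-5, 5));
```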

Contributor

@dakersnar dakersnar left a comment

Looks good! Only skimmed the tests, but they look robust.

@@ -603,6 +621,8 @@ private BigInteger(Span<uint> value)

public static BigInteger MinusOne { get { return s_bnMinusOneInt; } }

internal static int MaxLength => Array.MaxLength / sizeof(uint);
Contributor

Is this restriction designed so that we can hold every BigInteger in an array of bytes?

Member Author

Yes and based on the feedback from API review we are going to restrict this further such that MaxLength => Array.MaxLength / (sizeof(uint) * 8)

That is, it is expected that any BigInteger be no more than approx 2 billion bits.

Member Author

As per the summary I gave above (#68964 (comment))

This is up to 268_435_455 bytes or approximately 256 MB. This still allows an astronomically large number of approx: 7.15 * 10^80_807_123 at which point

Most conceivable code will fit well into 80_807_100 (80 million) digits provided and anything beyond is increasingly niche and starts having very complex considerations into how it impacts the GC, memory usage in general, etc.
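
The two candidate limits in this thread can be compared numerically. The sketch below is illustrative only: the variable names and the `EnsureWithinLimit` guard are hypothetical, and the guard merely mirrors the "throw an OverflowException otherwise" behavior described earlier in the review.

```csharp
using System;

// Limit from the diff: the limbs always fit in a byte array or span.
int limbLimitForBytes = Array.MaxLength / sizeof(uint);

// Stricter limit from API review: the total bit count fits in an int.
int limbLimitForBits = Array.MaxLength / (sizeof(uint) * 8);
long maxBits = limbLimitForBits * 32L;

// Hypothetical guard: reject any result that would exceed the chosen limit.
static void EnsureWithinLimit(int limbCount, int limit)
{
    if (limbCount > limit)
    {
        throw new OverflowException("BigInteger would exceed the maximum supported length.");
    }
}

Console.WriteLine($"{limbLimitForBytes} limbs vs {limbLimitForBits} limbs ({maxBits} bits)");
EnsureWithinLimit(1024, limbLimitForBits); // well within the limit, does not throw
```

Under the stricter limit the maximum bit count stays below `int.MaxValue`, so `int`-typed shift and rotate amounts can address every bit.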

@@ -1766,6 +1786,31 @@ private static BigInteger Subtract(ReadOnlySpan<uint> leftBits, int leftSign, Re
return new BigInteger(value);
}

public static implicit operator BigInteger(nint value)
Contributor

Is this operator keyword referring to the cast operation?

Member Author

Yes. Most operators are of the form static TResult operator ... where the ... is the operator name (+, -, *, /, %, etc; including the new checked variants checked +, etc).

Implicit and explicit conversion operators take a different shape of the form static implicit operator TResult and static explicit operator TResult (with the checked keyword still following the operator keyword, so static explicit operator checked TResult).
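
A compact illustration of these operator shapes follows. `Meters` is a hypothetical type invented for this example and has nothing to do with the PR. The `checked` operator forms require C# 11 or later, and a `checked` operator can only be declared alongside its regular counterpart.

```csharp
using System;

Meters m = 5;          // implicit conversion: int -> Meters
int whole = (int)m;    // explicit conversion: Meters -> int
Meters sum = m + m;    // regular operator +
Console.WriteLine($"{whole} {sum.Value}");

// Hypothetical type used only to show the declaration shapes.
readonly struct Meters
{
    public long Value { get; }
    public Meters(long value) => Value = value;

    // static TResult operator <name>(...)
    public static Meters operator +(Meters left, Meters right)
        => new Meters(unchecked(left.Value + right.Value));

    // static TResult operator checked <name>(...)
    public static Meters operator checked +(Meters left, Meters right)
        => new Meters(checked(left.Value + right.Value));

    // static implicit operator TResult(...)
    public static implicit operator Meters(int value) => new Meters(value);

    // static explicit operator TResult(...)
    public static explicit operator int(Meters value) => unchecked((int)value.Value);

    // static explicit operator checked TResult(...)
    public static explicit operator checked int(Meters value) => checked((int)value.Value);
}
```

The compiler selects the `checked` variants inside a `checked` context and the regular ones otherwise.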

//

/// <inheritdoc cref="IAdditionOperators{TSelf, TOther, TResult}.op_Addition(TSelf, TOther)" />
static BigInteger IAdditionOperators<BigInteger, BigInteger, BigInteger>.operator checked +(BigInteger left, BigInteger right) => left + right;
Contributor

Is this needed now that checked + has a DIM in IAdditionOperators with this behavior?

Member Author

It won't be once the DIM support is available for use by the language. As of right now, it's not in or usable, so we still need to define this ourselves.

@tannergooding tannergooding merged commit b4f46ac into dotnet:main May 11, 2022
@uweigand
Contributor

Hi @tannergooding, the new test is failing on big-endian s390x:
https://helixre8s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-heads-main-2b43123da88d4ef296/System.Runtime.Numerics.Tests/1/console.a4a405ef.log?helixlogtype=result

Seems like there could be again endian issues? In particular, it seems odd that TryWriteLittleEndian apparently byte-swaps positive numbers, but not negative numbers ...

@tannergooding
Member Author

This is fixed in #69391.

@ghost ghost locked as resolved and limited conversation to collaborators Jun 15, 2022
@tannergooding tannergooding deleted the generic-math-biginteger branch November 11, 2022 15:36