Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add carrying_add, borrowing_sub, widening_mul, carrying_mul methods to integers #85017

Merged
merged 1 commit into from
Aug 31, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions library/core/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,7 @@
#![feature(const_alloc_layout)]
#![feature(const_arguments_as_str)]
#![feature(const_assert_type)]
#![feature(const_bigint_helper_methods)]
#![feature(const_caller_location)]
#![feature(const_cell_into_inner)]
#![feature(const_discriminant)]
Expand Down
54 changes: 54 additions & 0 deletions library/core/src/num/int_macros.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1307,6 +1307,33 @@ macro_rules! int_impl {
(a as Self, b)
}

/// Calculates `self + rhs + carry` without the ability to overflow.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"ability to overflow" sounds like overflow is desirable and the lack of overflow is a limitation of this method. Is that what you meant?

Copy link
Contributor

@AaronKutch AaronKutch Sep 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say

Extended summation of `self + rhs + carry`. The booleans are interpreted as a single bit
integer of value 0 or 1. If unsigned overflow occurs, then the boolean in the tuple returns 1. This
output carry can be chained into the input carry of another carrying add, which allows
for arbitrarily large additions to be calculated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: Since this PR is merged, this discussion should go in the tracking issue (#85532).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait pardon me this has already been merged. I will take discussion elsewhere or open my own PR.

///
/// Performs "ternary addition" which takes in an extra bit to add, and may return an
/// additional bit of overflow. This allows for chaining together multiple additions
/// to create "big integers" which represent larger values.
///
/// # Examples
///
/// Basic usage
///
/// ```
/// #![feature(bigint_helper_methods)]
#[doc = concat!("assert_eq!(5", stringify!($SelfT), ".carrying_add(2, false), (7, false));")]
#[doc = concat!("assert_eq!(5", stringify!($SelfT), ".carrying_add(2, true), (8, false));")]
#[doc = concat!("assert_eq!(", stringify!($SelfT), "::MAX.carrying_add(1, false), (", stringify!($SelfT), "::MIN, false));")]
#[doc = concat!("assert_eq!(", stringify!($SelfT), "::MAX.carrying_add(1, true), (", stringify!($SelfT), "::MIN + 1, false));")]
/// ```
#[unstable(feature = "bigint_helper_methods", issue = "85532")]
#[rustc_const_unstable(feature = "const_bigint_helper_methods", issue = "85532")]
#[must_use = "this returns the result of the operation, \
without modifying the original"]
#[inline]
pub const fn carrying_add(self, rhs: Self, carry: bool) -> (Self, bool) {
let (sum, carry) = (self as $UnsignedT).carrying_add(rhs as $UnsignedT, carry);
(sum as $SelfT, carry)
}

/// Calculates `self` - `rhs`
///
/// Returns a tuple of the subtraction along with a boolean indicating whether an arithmetic overflow
Expand All @@ -1331,6 +1358,33 @@ macro_rules! int_impl {
(a as Self, b)
}

/// Calculates `self - rhs - borrow` without the ability to overflow.
///
/// Performs "ternary subtraction" which takes in an extra bit to subtract, and may return
/// an additional bit of overflow. This allows for chaining together multiple subtractions
/// to create "big integers" which represent larger values.
///
/// # Examples
///
/// Basic usage
///
/// ```
/// #![feature(bigint_helper_methods)]
#[doc = concat!("assert_eq!(5", stringify!($SelfT), ".borrowing_sub(2, false), (3, false));")]
#[doc = concat!("assert_eq!(5", stringify!($SelfT), ".borrowing_sub(2, true), (2, false));")]
#[doc = concat!("assert_eq!(", stringify!($SelfT), "::MIN.borrowing_sub(1, false), (", stringify!($SelfT), "::MAX, false));")]
#[doc = concat!("assert_eq!(", stringify!($SelfT), "::MIN.borrowing_sub(1, true), (", stringify!($SelfT), "::MAX - 1, false));")]
/// ```
#[unstable(feature = "bigint_helper_methods", issue = "85532")]
#[rustc_const_unstable(feature = "const_bigint_helper_methods", issue = "85532")]
#[must_use = "this returns the result of the operation, \
without modifying the original"]
#[inline]
pub const fn borrowing_sub(self, rhs: Self, borrow: bool) -> (Self, bool) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To a Rustacean, "borrowing" sounds like it has to do with references. So I am very confused by the name of this function. Maybe the docs should explain it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The word borrow here comes from the terminology for an full subtractor. I am thinking that maybe the borrowing_sub function should be removed altogether. The same effect that borrowing_sub has can be obtained from carrying_add by making the first carrying_add in the chain have a set carry bit, and then bitnot every rhs. This fact could be put in the documentation of carrying_add.

let (sum, borrow) = (self as $UnsignedT).borrowing_sub(rhs as $UnsignedT, borrow);
(sum as $SelfT, borrow)
}

/// Calculates the multiplication of `self` and `rhs`.
///
/// Returns a tuple of the multiplication along with a boolean indicating whether an arithmetic overflow
Expand Down
87 changes: 87 additions & 0 deletions library/core/src/num/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -89,27 +89,104 @@ depending on the target pointer size.
};
}

macro_rules! widening_impl {
($SelfT:ty, $WideT:ty, $BITS:literal) => {
/// Calculates the complete product `self * rhs` without the possibility to overflow.
///
/// This returns the low-order (wrapping) bits and the high-order (overflow) bits
/// of the result as two separate values, in that order.
///
/// # Examples
///
/// Basic usage:
///
/// Please note that this example is shared between integer types.
/// Which explains why `u32` is used here.
///
/// ```
/// #![feature(bigint_helper_methods)]
/// assert_eq!(5u32.widening_mul(2), (10, 0));
m-ou-se marked this conversation as resolved.
Show resolved Hide resolved
/// assert_eq!(1_000_000_000u32.widening_mul(10), (1410065408, 2));
/// ```
#[unstable(feature = "bigint_helper_methods", issue = "85532")]
#[rustc_const_unstable(feature = "const_bigint_helper_methods", issue = "85532")]
#[must_use = "this returns the result of the operation, \
without modifying the original"]
#[inline]
pub const fn widening_mul(self, rhs: Self) -> (Self, Self) {
// note: longer-term this should be done via an intrinsic,
// but for now we can deal without an impl for u128/i128
// SAFETY: overflow will be contained within the wider types
let wide = unsafe { (self as $WideT).unchecked_mul(rhs as $WideT) };
(wide as $SelfT, (wide >> $BITS) as $SelfT)
}
Comment on lines +116 to +122
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe widening_mul on signed integers should return the less-significant half as unsigned instead. For example consider i8, and −128 × 3 (0x80 * 0x03). Now, when widened to i16 (0xff80 * 0x0003), this gives −384 (0xfe80), which is then returned as (0x80, 0xfe). Now, 0xfe itself is −2, so think of this as meaning −512 because it is the higher word. The correct result is −512 + 128, so the 0x80 should be interpreted as +128 not −128, so it would be u8, not i8.

Note: I'm using unsigned for all hex values to not make it more confusing, even though technically 0xfe_i8 is an error and would be written as -0x02_i8.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, using unsigned for the lower half would make it possible to write something like real_result = (i16::from(high) << 8) | i16::from(low), or + instead of |, both work. With signed for the lower half, i16::from(low) would be sign extended and would actually need to be masked with something like (i16::from(low) & i16::from(u8::MAX)).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should add, these kinds of suggestions are exactly the kind that I would love to see discussed, and I would definitely suggest bringing it up in the original RFC for widening_mul. Right now I'm kind of inclined to keep this particular method as-is until it's merged and this discussion can happen before stabilisation, since the main goal is to get something out there for people to use at the moment.

I am fine with making this change before stabilisation though if anyone else would like to +1 the idea

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also FWIW, my justification for using the same type is because any big-integer implementation that uses these would be adding one half or the other to some part of the bigint anyway, and using a separate unsigned type for that would just require unnecessary casting. I could easily be swayed, though!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have looked into the internal representation of two bigint implementations: GMP and num-bigint. Both store the digits as unsigned and use an extra flag to indicate whether the big number is positive or negative, so I don't know how an actual bignum implementation that stores digits as signed would use these, if there is such a bignum implementation.

Copy link
Contributor Author

@clarfonthey clarfonthey May 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I did notice that most bigint implementations do use a sign flag instead of two's complement. However, in my case, I was working on an implementation of a variable-length integer encoding where integers would still be stored in two's complement as it would make sign extension to get a larger, single unit (e.g. u64 or u128) easier. It probably does make sense for most bigint implementations to use a sign flag and unsigned int units, but that doesn't mean that two's complement bigints are completely out of the question.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the ordering of the return tuple: I guess it makes sense to be consistent with other methods who's carry is the last element of the return tuple, and as such have the most significant word on the right. Personally, I prefer the most significant word on the left; then one can write nice idiomatic code e.g. to compare x * y with a * b with x.widening_mul(y).cmp(&a.widening_mul(b)).

Copy link
Member

@scottmcm scottmcm Jun 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely think that add_with_carry matching the order of overflowing_add is the most important part.

But interestingly, the least-significant word on the left has a nice interpretation on (currently more common) little-endian machines: the return is just the number for wide_mul (assuming the obvious, though unspecified, tuple layout): https://rust.godbolt.org/z/sG9E4G8vY

(Though apparently we no longer encode (u32, u32) as i64 in LLVM, which is why the Godbolt demo uses an old rustc version, so maybe that matters less these days.)


/// Calculates the "full multiplication" `self * rhs + carry`
/// without the possibility to overflow.
///
/// This returns the low-order (wrapping) bits and the high-order (overflow) bits
/// of the result as two separate values, in that order.
///
/// Performs "long multiplication" which takes in an extra amount to add, and may return an
/// additional amount of overflow. This allows for chaining together multiple
/// multiplications to create "big integers" which represent larger values.
///
/// # Examples
///
/// Basic usage:
///
/// Please note that this example is shared between integer types.
/// Which explains why `u32` is used here.
///
/// ```
/// #![feature(bigint_helper_methods)]
/// assert_eq!(5u32.carrying_mul(2, 0), (10, 0));
/// assert_eq!(5u32.carrying_mul(2, 10), (20, 0));
/// assert_eq!(1_000_000_000u32.carrying_mul(10, 0), (1410065408, 2));
/// assert_eq!(1_000_000_000u32.carrying_mul(10, 10), (1410065418, 2));
/// ```
#[unstable(feature = "bigint_helper_methods", issue = "85532")]
#[rustc_const_unstable(feature = "bigint_helper_methods", issue = "85532")]
#[must_use = "this returns the result of the operation, \
without modifying the original"]
#[inline]
pub const fn carrying_mul(self, rhs: Self, carry: Self) -> (Self, Self) {
// note: longer-term this should be done via an intrinsic,
// but for now we can deal without an impl for u128/i128
// SAFETY: overflow will be contained within the wider types
let wide = unsafe {
(self as $WideT).unchecked_mul(rhs as $WideT).unchecked_add(carry as $WideT)
};
(wide as $SelfT, (wide >> $BITS) as $SelfT)
}
};
}

#[lang = "i8"]
impl i8 {
widening_impl! { i8, i16, 8 }
int_impl! { i8, i8, u8, 8, 7, -128, 127, 2, "-0x7e", "0xa", "0x12", "0x12", "0x48",
"[0x12]", "[0x12]", "", "" }
}

#[lang = "i16"]
impl i16 {
widening_impl! { i16, i32, 16 }
int_impl! { i16, i16, u16, 16, 15, -32768, 32767, 4, "-0x5ffd", "0x3a", "0x1234", "0x3412",
"0x2c48", "[0x34, 0x12]", "[0x12, 0x34]", "", "" }
}

#[lang = "i32"]
impl i32 {
widening_impl! { i32, i64, 32 }
int_impl! { i32, i32, u32, 32, 31, -2147483648, 2147483647, 8, "0x10000b3", "0xb301",
"0x12345678", "0x78563412", "0x1e6a2c48", "[0x78, 0x56, 0x34, 0x12]",
"[0x12, 0x34, 0x56, 0x78]", "", "" }
}

#[lang = "i64"]
impl i64 {
widening_impl! { i64, i128, 64 }
int_impl! { i64, i64, u64, 64, 63, -9223372036854775808, 9223372036854775807, 12,
"0xaa00000000006e1", "0x6e10aa", "0x1234567890123456", "0x5634129078563412",
"0x6a2c48091e6a2c48", "[0x56, 0x34, 0x12, 0x90, 0x78, 0x56, 0x34, 0x12]",
Expand All @@ -131,6 +208,7 @@ impl i128 {
#[cfg(target_pointer_width = "16")]
#[lang = "isize"]
impl isize {
widening_impl! { isize, i32, 16 }
int_impl! { isize, i16, usize, 16, 15, -32768, 32767, 4, "-0x5ffd", "0x3a", "0x1234",
"0x3412", "0x2c48", "[0x34, 0x12]", "[0x12, 0x34]",
usize_isize_to_xe_bytes_doc!(), usize_isize_from_xe_bytes_doc!() }
Expand All @@ -139,6 +217,7 @@ impl isize {
#[cfg(target_pointer_width = "32")]
#[lang = "isize"]
impl isize {
widening_impl! { isize, i64, 32 }
int_impl! { isize, i32, usize, 32, 31, -2147483648, 2147483647, 8, "0x10000b3", "0xb301",
"0x12345678", "0x78563412", "0x1e6a2c48", "[0x78, 0x56, 0x34, 0x12]",
"[0x12, 0x34, 0x56, 0x78]",
Expand All @@ -148,6 +227,7 @@ impl isize {
#[cfg(target_pointer_width = "64")]
#[lang = "isize"]
impl isize {
widening_impl! { isize, i128, 64 }
int_impl! { isize, i64, usize, 64, 63, -9223372036854775808, 9223372036854775807,
12, "0xaa00000000006e1", "0x6e10aa", "0x1234567890123456", "0x5634129078563412",
"0x6a2c48091e6a2c48", "[0x56, 0x34, 0x12, 0x90, 0x78, 0x56, 0x34, 0x12]",
Expand All @@ -160,6 +240,7 @@ const ASCII_CASE_MASK: u8 = 0b0010_0000;

#[lang = "u8"]
impl u8 {
widening_impl! { u8, u16, 8 }
uint_impl! { u8, u8, 8, 255, 2, "0x82", "0xa", "0x12", "0x12", "0x48", "[0x12]",
"[0x12]", "", "" }

Expand Down Expand Up @@ -693,18 +774,21 @@ impl u8 {

#[lang = "u16"]
impl u16 {
widening_impl! { u16, u32, 16 }
uint_impl! { u16, u16, 16, 65535, 4, "0xa003", "0x3a", "0x1234", "0x3412", "0x2c48",
"[0x34, 0x12]", "[0x12, 0x34]", "", "" }
}

#[lang = "u32"]
impl u32 {
widening_impl! { u32, u64, 32 }
uint_impl! { u32, u32, 32, 4294967295, 8, "0x10000b3", "0xb301", "0x12345678",
"0x78563412", "0x1e6a2c48", "[0x78, 0x56, 0x34, 0x12]", "[0x12, 0x34, 0x56, 0x78]", "", "" }
}

#[lang = "u64"]
impl u64 {
widening_impl! { u64, u128, 64 }
uint_impl! { u64, u64, 64, 18446744073709551615, 12, "0xaa00000000006e1", "0x6e10aa",
"0x1234567890123456", "0x5634129078563412", "0x6a2c48091e6a2c48",
"[0x56, 0x34, 0x12, 0x90, 0x78, 0x56, 0x34, 0x12]",
Expand All @@ -727,13 +811,15 @@ impl u128 {
#[cfg(target_pointer_width = "16")]
#[lang = "usize"]
impl usize {
widening_impl! { usize, u32, 16 }
uint_impl! { usize, u16, 16, 65535, 4, "0xa003", "0x3a", "0x1234", "0x3412", "0x2c48",
"[0x34, 0x12]", "[0x12, 0x34]",
usize_isize_to_xe_bytes_doc!(), usize_isize_from_xe_bytes_doc!() }
}
#[cfg(target_pointer_width = "32")]
#[lang = "usize"]
impl usize {
widening_impl! { usize, u64, 32 }
uint_impl! { usize, u32, 32, 4294967295, 8, "0x10000b3", "0xb301", "0x12345678",
"0x78563412", "0x1e6a2c48", "[0x78, 0x56, 0x34, 0x12]", "[0x12, 0x34, 0x56, 0x78]",
usize_isize_to_xe_bytes_doc!(), usize_isize_from_xe_bytes_doc!() }
Expand All @@ -742,6 +828,7 @@ impl usize {
#[cfg(target_pointer_width = "64")]
#[lang = "usize"]
impl usize {
widening_impl! { usize, u128, 64 }
uint_impl! { usize, u64, 64, 18446744073709551615, 12, "0xaa00000000006e1", "0x6e10aa",
"0x1234567890123456", "0x5634129078563412", "0x6a2c48091e6a2c48",
"[0x56, 0x34, 0x12, 0x90, 0x78, 0x56, 0x34, 0x12]",
Expand Down
60 changes: 60 additions & 0 deletions library/core/src/num/uint_macros.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1378,6 +1378,36 @@ macro_rules! uint_impl {
(a as Self, b)
}

/// Calculates `self + rhs + carry` without the ability to overflow.
///
/// Performs "ternary addition" which takes in an extra bit to add, and may return an
/// additional bit of overflow. This allows for chaining together multiple additions
/// to create "big integers" which represent larger values.
///
/// # Examples
///
/// Basic usage
///
/// ```
/// #![feature(bigint_helper_methods)]
#[doc = concat!("assert_eq!(5", stringify!($SelfT), ".carrying_add(2, false), (7, false));")]
#[doc = concat!("assert_eq!(5", stringify!($SelfT), ".carrying_add(2, true), (8, false));")]
#[doc = concat!("assert_eq!(", stringify!($SelfT), "::MAX.carrying_add(1, false), (0, true));")]
#[doc = concat!("assert_eq!(", stringify!($SelfT), "::MAX.carrying_add(1, true), (1, true));")]
/// ```
#[unstable(feature = "bigint_helper_methods", issue = "85532")]
#[rustc_const_unstable(feature = "const_bigint_helper_methods", issue = "85532")]
#[must_use = "this returns the result of the operation, \
without modifying the original"]
#[inline]
pub const fn carrying_add(self, rhs: Self, carry: bool) -> (Self, bool) {
// note: longer-term this should be done via an intrinsic, but this has been shown
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This says tidy won't allow the PR to be merged with it, though

// to generate optimal code for now, and LLVM doesn't have an equivalent intrinsic
let (a, b) = self.overflowing_add(rhs);
let (c, d) = a.overflowing_add(carry as $SelfT);
(c, b | d)
}

/// Calculates `self` - `rhs`
///
/// Returns a tuple of the subtraction along with a boolean indicating
Expand All @@ -1403,6 +1433,36 @@ macro_rules! uint_impl {
(a as Self, b)
}

/// Calculates `self - rhs - borrow` without the ability to overflow.
///
/// Performs "ternary subtraction" which takes in an extra bit to subtract, and may return
/// an additional bit of overflow. This allows for chaining together multiple subtractions
/// to create "big integers" which represent larger values.
///
/// # Examples
///
/// Basic usage
///
/// ```
/// #![feature(bigint_helper_methods)]
#[doc = concat!("assert_eq!(5", stringify!($SelfT), ".borrowing_sub(2, false), (3, false));")]
#[doc = concat!("assert_eq!(5", stringify!($SelfT), ".borrowing_sub(2, true), (2, false));")]
#[doc = concat!("assert_eq!(0", stringify!($SelfT), ".borrowing_sub(1, false), (", stringify!($SelfT), "::MAX, true));")]
#[doc = concat!("assert_eq!(0", stringify!($SelfT), ".borrowing_sub(1, true), (", stringify!($SelfT), "::MAX - 1, true));")]
/// ```
#[unstable(feature = "bigint_helper_methods", issue = "85532")]
#[rustc_const_unstable(feature = "const_bigint_helper_methods", issue = "85532")]
#[must_use = "this returns the result of the operation, \
without modifying the original"]
#[inline]
pub const fn borrowing_sub(self, rhs: Self, borrow: bool) -> (Self, bool) {
// note: longer-term this should be done via an intrinsic, but this has been shown
// to generate optimal code for now, and LLVM doesn't have an equivalent intrinsic
let (a, b) = self.overflowing_sub(rhs);
let (c, d) = a.overflowing_sub(borrow as $SelfT);
(c, b | d)
}

/// Calculates the multiplication of `self` and `rhs`.
///
/// Returns a tuple of the multiplication along with a boolean
Expand Down