Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add carrying_add, borrowing_sub, widening_mul, carrying_mul methods to integers #85017

Merged
merged 1 commit into from
Aug 31, 2021

Conversation

clarfonthey
Copy link
Contributor

@clarfonthey clarfonthey commented May 7, 2021

(Tracking issue: #85532)

This comes in part from my own attempts to make (crude) big integer implementations, and also due to the stalled discussion in RFC 2417. My understanding is that changes like these are best offered directly as code and then an RFC can be opened if there needs to be more discussion before stabilisation. Since all of these methods are unstable from the start, I figured I might as well offer them now.

I tried looking into intrinsics, messed around with a few different implementations, and ultimately concluded that these are "good enough" implementations for now to at least put up some code and maybe start bikeshedding on a proper API for these.

For the carrying_add and borrowing_sub, I tried looking into potential architecture-specific code and realised that even using the LLVM intrinsics for addcarry and subborrow on x86 specifically, I was getting exactly the same assembly as the naive implementation using overflowing_add and overflowing_sub, although the LLVM IR did differ because of the architecture-specific code. Longer-term I think that they would be best suited to specific intrinsics as that would make optimisations easier (instructions like add-carry tend to use implicit flags, and thus can only be optimised if they're done one-after-another, and thus it would make the most sense to have compact intrinsics that can be merged together easily).

For widening_mul and carrying_mul, for now at least, I simply cast to the larger type and perform arithmetic that way, since we currently have no intrinsic that would work better for 128-bit integers. In the future, I also think that some form of intrinsic would work best to cover that case, but for now at least, I think that they're "good enough" for now.

The main reasoning for offering these directly to the standard library even though they're relatively niche optimisations is to help ensure that the code generated for them is optimal. Plus, these operations alone aren't enough to create big integer implementations, although they could help simplify the code required to do so and make it a bit more accessible for the average implementor.

That said, I 100% understand if any or all of these methods are not desired simply because of how niche they are. Up to you. 🤷🏻

@rust-highfive
Copy link
Collaborator

r? @m-ou-se

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 7, 2021
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@oli-obk
Copy link
Contributor

oli-obk commented May 7, 2021

cc @rust-lang/wg-const-eval some new inherent const fns on integer types. The impl for them already exists (https://github.com/rust-lang/rust/blob/master/compiler/rustc_mir/src/interpret/intrinsics.rs#L274-L280).

@clarfonthey could you split out the constification into a separate commit and add tests that run these functions during the evaluation of a constant or static item?

@leonardo-m
Copy link

these operations alone aren't enough to create big integer implementations,

What's missing?

($SelfT:ty, $WideT:ty, $BITS:literal) => {
/// Calculates the full multiplication of `self` and `rhs`.
///
/// This returns the low-order (insignificant) bits and the high-order (significant) bits

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "insignificant" the right word?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, honestly-- open to suggestions. My main goal was to get an implementation out there and to work out the specifics on stuff like this after merging, though, since it has to exist in order for people to use it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the terminology "least significant" and "most significant" bits? Wikipedia.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct -- I don't know why I thought insignificant was the right term.

@leonardo-m
Copy link

Another u64 unsafe method (here it uses this for self) I find useful for related purposes:

#![feature(asm)]
#[allow(unused_assignments, unused_variables)]
#[cfg(any(target_arch = "x86_64"))]
pub unsafe fn mul_mod(this: u64, b: u64, m: u64) -> u64 {
    let mut q: u64;
    let mut r: u64; // r = a b mod m

    asm!(
        "mul {b}; div {m};",
        b = in(reg) b, m = in(reg) m,
        inout("rax") this => q, out("rdx") r,
        options(nostack, nomem)
    );
    r
}

@tspiteri
Copy link
Contributor

tspiteri commented May 7, 2021

I am very wary of using the "carry" terminology for signed integers. Usually for two's complement carry is basically what I would use for an adder when I'm thinking about unsigned only.

For example for something like a < b, the x86 assembly is different for unsigned and signed.

  • For unsigned, a < b evaluates to C where C is the carry flag. That is if a - b generates a carry, then a < b. For example 0 - 1 sets C.
  • But for signed, a < b evaluates to O != S, where O is the overflow flag and S is the sign flag. That is, a < b if either
    • a - b sets the overflow flag and clears the sign flag, for example -2 - MAX sets O and clears S; or
    • a - b clears the overflow flag and sets the sign flag, for example 0 - 1 clears O and sets S.

@nagisa nagisa added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label May 7, 2021
Comment on lines +114 to +122
pub const fn widening_mul(self, rhs: Self) -> (Self, Self) {
// note: longer-term this should be done via an intrinsic,
// but for now we can deal without an impl for u128/i128
// SAFETY: overflow will be contained within the wider types
let wide = unsafe { (self as $WideT).unchecked_mul(rhs as $WideT) };
(wide as $SelfT, (wide >> $BITS) as $SelfT)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe widening_mul on signed integers should return the less-significant half as unsigned instead. For example consider i8, and −128 × 3 (0x80 * 0x03). Now, when widened to i16 (0xff80 * 0x0003), this gives −384 (0xfe80), which is then returned as (0x80, 0xfe). Now, 0xfe itself is −2, so think of this as meaning −512 because it is the higher word. The correct result is −512 + 128, so the 0x80 should be interpreted as +128 not −128, so it would be u8, not i8.

Note: I'm using unsigned for all hex values to not make it more confusing, even though technically 0xfe_i8 is an error and would be written as -0x02_i8.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, using unsigned for the lower half would make it possible to write something like real_result = (i16::from(high) << 8) | i16::from(low), or + instead of |, both work. With signed for the lower half, i16::from(low) would be sign extended and would actually need to be masked with something like (i16::from(low) & i16::from(u8::MAX)).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should add, these kinds of suggestions are exactly the kind that I would love to see discussed, and I would definitely suggest bringing it up in the original RFC for widening_mul. Right now I'm kind of inclined to keep this particular method as-is until it's merged and this discussion can happen before stabilisation, since the main goal is to get something out there for people to use at the moment.

I am fine with making this change before stabilisation though if anyone else would like to +1 the idea

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also FWIW, my justification for using the same type is because any big-integer implementation that uses these would be adding one half or the other to some part of the bigint anyway, and using a separate unsigned type for that would just require unnecessary casting. I could easily be swayed, though!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have looked into the internal representation of two bigint implementations: GMP and num-bigint. Both store the digits as unsigned and use an extra flag to indicate whether the big number is positive or negative, so I don't know how an actual bignum implementation that stores digits as signed would use these, if there is such a bignum implementation.

Copy link
Contributor Author

@clarfonthey clarfonthey May 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I did notice that most bigint implementations do use a sign flag instead of two's complement. However, in my case, I was working on an implementation of a variable-length integer encoding where integers would still be stored in two's complement as it would make sign extension to get a larger, single unit (e.g. u64 or u128) easier. It probably does make sense for most bigint implementations to use a sign flag and unsigned int units, but that doesn't mean that two's complement bigints are completely out of the question.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the ordering of the return tuple: I guess it makes sense to be consistent with other methods who's carry is the last element of the return tuple, and as such have the most significant word on the right. Personally, I prefer the most significant word on the left; then one can write nice idiomatic code e.g. to compare x * y with a * b with x.widening_mul(y).cmp(&a.widening_mul(b)).

Copy link
Member

@scottmcm scottmcm Jun 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely think that add_with_carry matching the order of overflowing_add is the most important part.

But interestingly, the least-significant word on the left has a nice interpretation on (currently more common) little-endian machines: the return is just the number for wide_mul (assuming the obvious, though unspecified, tuple layout): https://rust.godbolt.org/z/sG9E4G8vY

(Though apparently we no longer encode (u32, u32) as i64 in LLVM, which is why the Godbolt demo uses an old rustc version, so maybe that matters less these days.)

@scottmcm
Copy link
Member

scottmcm commented May 8, 2021

I think it would make sense to have methods for this stuff -- the correct incantations today requires knowing the patterns that LLVM accepts, which is knowledge that shouldn't be needed and cranelift might later need something different anyway.

Specifically which signatures and names should be picked I'm unsure. I do think the versions with carries are important, though -- I have an example in rust-lang/rfcs#2417 (comment) of how the most obvious use of wide multiplication is awkward without it. (And it would probably be fine for that to be the only version -- optimizing away a constant-zero carry ought to be completely trivial for any backend. Basically it just means providing a full adder, rather than giving them only half adders and requiring that they figure out how to combine them efficiently.)

@clarfonthey
Copy link
Contributor Author

cc @rust-lang/wg-const-eval some new inherent const fns on integer types. The impl for them already exists (https://github.com/rust-lang/rust/blob/master/compiler/rustc_mir/src/interpret/intrinsics.rs#L274-L280).

@clarfonthey could you split out the constification into a separate commit and add tests that run these functions during the evaluation of a constant or static item?

In hindsight I should have separated these out anyway, so I put them in a separate PR that can be merged before this one. We can move the discussion into #85096.

these operations alone aren't enough to create big integer implementations,

What's missing?

A lot, actually. While the carrying add/sub implementations do make it relatively trivial to implement addition, subtraction, and negation for big integers, multiplication requires either a naïve O(N²) algorithm or other algorithms that do the multiplication properly. These are very critical building-blocks, but alone, they aren't a bigint implementation, only the biggest piece that depends heavily on how it's compiled for performance.

Another u64 unsafe method (here it uses this for self) I find useful for related purposes:

#![feature(asm)]
#[allow(unused_assignments, unused_variables)]
#[cfg(any(target_arch = "x86_64"))]
pub unsafe fn mul_mod(this: u64, b: u64, m: u64) -> u64 {
    let mut q: u64;
    let mut r: u64; // r = a b mod m

    asm!(
        "mul {b}; div {m};",
        b = in(reg) b, m = in(reg) m,
        inout("rax") this => q, out("rdx") r,
        options(nostack, nomem)
    );
    r
}

I definitely agree that modular-arithmetic methods would be useful, but I find that kind of out-of-scope for this particular change, and I'm also not sure how much the assembly is needed for performance.

I think it would make sense to have methods for this stuff -- the correct incantations today requires knowing the patterns that LLVM accepts, which is knowledge that shouldn't be needed and cranelift might later need something different anyway.

Specifically which signatures and names should be picked I'm unsure. I do think the versions with carries are important, though -- I have an example in rust-lang/rfcs#2417 (comment) of how the most obvious use of wide multiplication is awkward without it. (And it would probably be fine for that to be the only version -- optimizing away a constant-zero carry ought to be completely trivial for any backend. Basically it just means providing a full adder, rather than giving them only half adders and requiring that they figure out how to combine them efficiently.)

100% agree with you there; I'm mostly adding the widening_mul impl for completeness and parity with the existing wrapping methods. I was actually working on making a crate for these kinds of methods, but after toying around with stuff, realised that they definitely would be reasonable to include in the standard library.

@clarfonthey
Copy link
Contributor Author

I am very wary of using the "carry" terminology for signed integers. Usually for two's complement carry is basically what I would use for an adder when I'm thinking about unsigned only.

For example for something like a < b, the x86 assembly is different for unsigned and signed.

* For unsigned, `a < b` evaluates to `C` where `C` is the carry flag. That is if `a - b` generates a carry, then `a < b`. For example `0 - 1` sets `C`.

* But for signed, `a < b` evaluates to `O != S`, where `O` is the overflow flag and `S` is the sign flag. That is, `a < b` if either
  
  * `a - b` sets the overflow flag and clears the sign flag, for example `-2 - MAX` sets `O` and clears `S`; or
  * `a - b` clears the overflow flag and sets the sign flag, for example `0 - 1` clears `O` and sets `S`.

Do you have an alternative term you think might be more accurate? Two's complement is still relevant for big integers even if it's not always used, and I think these methods would still be useful in that context.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@crlf0710 crlf0710 added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 26, 2021
@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 12, 2021
@bors
Copy link
Contributor

bors commented Jul 28, 2021

☔ The latest upstream changes (presumably #87540) made this pull request unmergeable. Please resolve the merge conflicts.

@JohnCSimon JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 15, 2021
@JohnCSimon
Copy link
Member

Ping from triage:
@clarfonthey - Can you please resolve the merge conflicts?

@clarfonthey
Copy link
Contributor Author

Will try to remember to do this this evening.

@clarfonthey
Copy link
Contributor Author

@rustbot label -S-waiting-on-author +S-waiting-on-review

Rebased, tests pass locally so it should be good on CI too.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 18, 2021
@clarfonthey
Copy link
Contributor Author

ping @m-ou-se, this PR has been hanging for a while and I'm not sure if it slid out of your radar, or if it might be better for someone else to review it. :)

@m-ou-se
Copy link
Member

m-ou-se commented Aug 30, 2021

Sorry! This looks good now.

Let's try them out and discuss on the tracking issue to see if we can resolve the questions there. :)

@bors r+

@bors
Copy link
Contributor

bors commented Aug 30, 2021

📌 Commit cc15047 has been approved by m-ou-se

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 30, 2021
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 31, 2021
Rollup of 10 pull requests

Successful merges:

 - rust-lang#85017 (Add carrying_add, borrowing_sub, widening_mul, carrying_mul methods to integers)
 - rust-lang#86362 (Avoid cloning LocalDecls)
 - rust-lang#88391 (Fix json tuple struct enum variant )
 - rust-lang#88399 (Disallow the aapcs CC on Aarch64)
 - rust-lang#88418 (Allow `~const` bounds on trait assoc functions)
 - rust-lang#88445 (Clean up the lowering of AST items)
 - rust-lang#88495 (Add `TcpStream::set_linger` and `TcpStream::linger`)
 - rust-lang#88501 (Use right span in prelude collision suggestions with macros. )
 - rust-lang#88504 (Keep turbofish in prelude collision lint.)
 - rust-lang#88524 (Remove unnecessary `mut` from udp doctests)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit e7a247d into rust-lang:master Aug 31, 2021
@rustbot rustbot added this to the 1.56.0 milestone Aug 31, 2021
@@ -1307,6 +1307,33 @@ macro_rules! int_impl {
(a as Self, b)
}

/// Calculates `self + rhs + carry` without the ability to overflow.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"ability to overflow" sounds like overflow is desirable and the lack of overflow is a limitation of this method. Is that what you meant?

Copy link
Contributor

@AaronKutch AaronKutch Sep 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say

Extended summation of `self + rhs + carry`. The booleans are interpreted as a single bit
integer of value 0 or 1. If unsigned overflow occurs, then the boolean in the tuple returns 1. This
output carry can be chained into the input carry of another carrying add, which allows
for arbitrarily large additions to be calculated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: Since this PR is merged, this discussion should go in the tracking issue (#85532).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait pardon me this has already been merged. I will take discussion elsewhere or open my own PR.

#[must_use = "this returns the result of the operation, \
without modifying the original"]
#[inline]
pub const fn borrowing_sub(self, rhs: Self, borrow: bool) -> (Self, bool) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To a Rustacean, "borrowing" sounds like it has to do with references. So I am very confused by the name of this function. Maybe the docs should explain it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The word borrow here comes from the terminology for an full subtractor. I am thinking that maybe the borrowing_sub function should be removed altogether. The same effect that borrowing_sub has can be obtained from carrying_add by making the first carrying_add in the chain have a set carry bit, and then bitnot every rhs. This fact could be put in the documentation of carrying_add.

@clarfonthey clarfonthey deleted the carrying_widening branch September 11, 2021 01:52
@anisse
Copy link

anisse commented Nov 3, 2021

Hi,

I'm still not sure I understand this fully, but I have this test failing with the current code:

assert_eq!(i8::MAX.carrying_add(i8::MIN, true), (0, false));

Playground: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=a161cf42872e087303d1435648e6eceb

Is the test correct ? A carry did happen in an intermediate calculation, but should the overall calculation report a carry ?

@clarfonthey
Copy link
Contributor Author

So, the carrying behaviour for signed integers is a bit disputed; this should definitely go in the tracking issue.

@scottmcm
Copy link
Member

scottmcm commented Nov 3, 2021

@anisse A closed PR isn't a good place to have a conversation about stuff -- most people won't look at them -- I would suggest you open a new issue as a place to have discussion about that particular question (ccing the tracking issue #85532 from it).

@anisse
Copy link

anisse commented Nov 3, 2021

@scottmcm thanks, I'll open a new issue and cc the tracking issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.