Euclidean modulo #2169

varkor · 2017-10-09T22:28:17Z

Proposal to add Euclidean modulo & division functionality for integers and floating-point numbers, to address common issues with taking remainders involving negative numbers.

Internals discussion here.

Rendered.
Tracking issue.

mcarton · 2017-10-10T20:14:41Z

While I like the proposal, I really don't like the _e suffix.

Centril · 2017-10-11T01:54:22Z

@mcarton What part of it? That it has a short suffix, or that it has a suffix and not a prefix? Both?

Centril · 2017-10-11T01:55:47Z

How common is usage of these operations? Would it warrant adding another operator to the language?

varkor · 2017-10-11T12:25:54Z

@Centril: I couldn't think of a good way to measure the use of these operations, as they can be implemented in various different ways. The rationale behind advocating new methods, rather than new operators, was that we could see how often they were used in Rust (say, 12 months down the line), to judge whether they were used commonly enough to be worth considering as operators.

Centril · 2017-10-11T13:03:27Z

@varkor this makes sense to me. However, many users will shy away from functionality until it has been stabilized, thus usage metrics for unstable stuff might not be fully representative.

mcarton · 2017-10-11T21:19:40Z

> @mcarton What part of it? That it has a short suffix, or that it has a suffix and not a prefix? Both?

The shortness.

scottmcm · 2017-10-11T21:24:06Z

If the output of these are always positive, should they return unsigned types instead?

Similarly, it feels like something here should enable (-1i8).modulo(200u8) => 199u8.
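To make the mixed-type case concrete, here is a minimal sketch of what such an operation could compute. The helper name `modulo_i8_u8` is hypothetical and is not part of the RFC; it assumes a current stable toolchain, where the Euclidean remainder is exposed as `rem_euclid`, and it widens both operands to `i16` so that every `i8` and every `u8` value is representable.

```rust
/// Hypothetical helper (not part of the RFC): reduce a signed `i8` modulo an
/// unsigned `u8`, returning a value in `0..m`. Panics if `m == 0`.
fn modulo_i8_u8(a: i8, m: u8) -> u8 {
    // Widen to i16 so that both operands fit in one signed type.
    let r = i16::from(a).rem_euclid(i16::from(m));
    // The Euclidean remainder lies in 0..m, so it always fits back into a u8.
    r as u8
}

fn main() {
    // The example from the comment: -1 modulo 200.
    println!("{}", modulo_i8_u8(-1, 200));
}
```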

varkor · 2017-10-11T23:00:57Z

@scottmcm: I followed the precedent set by the abs method for returning an integer of the same type, rather than the unsigned variant, as I imagined there was previous rationale for this decision.

Regarding inter-type modulo, the mod_e method as specified in the proposal behaves similarly to the built-in % operator, which also does not allow operations like -1i8 % 200u8. Again, the focus here was on consistency: it feels like this issue is something better addressed by a separate discussion (e.g. the internals thread on implicit widening).

est31 · 2017-10-12T17:27:00Z

As someone who has previously hacked Euclidean modulo as ((a % b) + b) % b (that's branchless 😎; proof that they are the same, at least for positive b), I can only support this RFC! I in fact implemented the variant for floats, so it'd be great if the RFC could include them :)
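As a rough illustration of the double-modulo trick, the sketch below compares it with the Euclidean remainder over a small range. The function name `double_mod` is made up for this example, and `rem_euclid` is the name the method carries on current stable Rust rather than the name proposed in this thread.

```rust
/// The double-modulo trick from the comment above, written for `i32`.
fn double_mod(a: i32, b: i32) -> i32 {
    ((a % b) + b) % b
}

fn main() {
    // For positive divisors the trick agrees with the Euclidean remainder.
    for a in -20i32..=20 {
        for b in 1i32..=7 {
            assert_eq!(double_mod(a, b), a.rem_euclid(b));
        }
    }
    // For a negative divisor the two results differ.
    println!("{} vs {}", double_mod(7, -3), 7i32.rem_euclid(-3));
}
```

The agreement only covers positive divisors and inputs small enough that the intermediate addition does not overflow.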

DerSaidin · 2017-10-18T12:38:34Z

Alternative name possibilities?

mod_e
emod
e_mod
euc_mod
eucmod

tspiteri · 2017-10-18T12:42:20Z

Maybe floor_div and floor_rem can be used: whereas div and rem truncate the quotient towards zero, the floor_ counterparts round the quotient down.

kennytm · 2017-10-18T12:51:50Z

text/0000-euclidean-modulo.md

+// Comparison of the behaviour of Rust's truncating division
+// and remainder, vs Euclidean division & modulo.
+(-8 / 3,      -8 % 3)       // (-2, -2)
+(-8.div_e(3), -8.mod_e(3))  // (-3,  1)


. has higher precedence than unary minus. This will be evaluated as -( 8.mod_e(3) ) i.e. -2. You want (-8).mod_e(3).

Good catch, thanks!
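The precedence issue can be reproduced with a small sketch. It uses `div_euclid` and `rem_euclid`, the names available on current stable Rust, in place of the `div_e` and `mod_e` spelling in the diff; the function name `comparison` is only for illustration.

```rust
/// Returns (quotient, remainder) pairs for -8 and 3 under three spellings.
fn comparison() -> [(i32, i32); 3] {
    [
        // Truncating operators: unary minus binds tighter than `/` and `%`.
        (-8i32 / 3, -8i32 % 3),
        // Method call binds tighter than unary minus: this negates 8.div_euclid(3).
        (-8i32.div_euclid(3), -8i32.rem_euclid(3)),
        // Parenthesised receiver: the intended Euclidean result.
        ((-8i32).div_euclid(3), (-8i32).rem_euclid(3)),
    ]
}

fn main() {
    for pair in comparison() {
        println!("{:?}", pair);
    }
}
```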

varkor · 2017-10-18T18:19:00Z

@est31: I ran some benchmarks before making the RFC comparing the implementation for Euclidean modulo as suggested in the RFC with the branchless variant, and the RFC version turned out to be ~5x faster. It could be that the branch prediction is making the benchmarks nonrepresentative, so if you have any results to the contrary, please do share them!

@tspiteri: Flooring division and modulo is subtly different (see this paper for more details — essentially Euclidean modulo always returns a nonnegative value), so avoiding the terminology floor would be preferable, to avoid confusion.
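A small sketch may help show where flooring and Euclidean division part ways. The helper `floor_div_rem` is a hypothetical illustration written for this thread, not an existing standard-library method; the Euclidean results come from `div_euclid` and `rem_euclid` as found on current stable Rust.

```rust
/// Hypothetical flooring division: the quotient is rounded towards negative
/// infinity, so the remainder takes the sign of the divisor.
fn floor_div_rem(a: i32, b: i32) -> (i32, i32) {
    let (q, r) = (a / b, a % b);
    // Adjust the truncated result when the remainder and divisor differ in sign.
    if r != 0 && ((r < 0) != (b < 0)) {
        (q - 1, r + b)
    } else {
        (q, r)
    }
}

fn main() {
    // Positive divisor: flooring and Euclidean agree.
    println!("{:?} {:?}", floor_div_rem(-7, 3), ((-7i32).div_euclid(3), (-7i32).rem_euclid(3)));
    // Negative divisor: flooring yields a negative remainder, Euclidean does not.
    println!("{:?} {:?}", floor_div_rem(7, -3), (7i32.div_euclid(-3), 7i32.rem_euclid(-3)));
}
```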

varkor · 2017-10-18T18:19:16Z

Regarding the name: I'm happy with a change of name, provided it doesn't become cumbersome — I'd personally prefer the operation name to come first for aiding discoverability (for example, if other methods of rounding are added in the future) via autocomplete, etc. Perhaps div_euc and mod_euc strike the right balance between descriptiveness and succinctness?

est31 · 2017-10-18T18:32:54Z

@varkor I haven't run any benchmarks and was just guessing that a branchless version is faster. The code path is definitely not hot so I didn't care much.

tspiteri · 2017-10-18T19:10:26Z

@varkor Yes, flooring and Euclidean division are different; to use floor_div and floor_rem flooring division would of course need to be used. I much prefer flooring over Euclidean; in Wikipedia's list of modulo operators in various programming languages, flooring division is supported by 71 languages including Common Lisp, Clojure, Haskell, Java, Mathematica, MATLAB, Python and Ruby, while Euclidean division is supported by 8 languages including Maple and Scheme. Also, for example the GMP bignum library supports truncating and flooring division, while not supporting Euclidean division. The paper you linked to makes some arguments for Euclidean over flooring, but I think the arguments are simply a matter of taste, while the similarity with other software is a concrete advantage of flooring over Euclidean.

tspiteri · 2017-10-18T20:38:09Z

@est31 Your version can overflow. (50i8 % 100i8) + 100i8 overflows.
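The overflow can be made visible with checked arithmetic. This is only a sketch: `double_mod_checked` is a hypothetical name, and `rem_euclid` is the method name on current stable Rust.

```rust
/// The double-modulo trick on `i8`, with every step checked for overflow.
/// Returns `None` if any intermediate step overflows or `b` is zero.
fn double_mod_checked(a: i8, b: i8) -> Option<i8> {
    a.checked_rem(b)?.checked_add(b)?.checked_rem(b)
}

fn main() {
    // 50 % 100 = 50, and 50 + 100 does not fit in an i8.
    println!("{:?}", double_mod_checked(50, 100));
    // The Euclidean remainder itself is representable.
    println!("{}", 50i8.rem_euclid(100));
}
```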

fanzier · 2017-10-18T23:13:34Z

@tspiteri I agree that it makes sense to consider what other languages do but it shouldn't be the only argument -- Rust does a lot of things better/differently than other languages. I don't see how differences to other software would be a disadvantage for Rust's modulo function, given that the modulo operator % is already the same as in C, Java etc.

You can also view the bandwagon argument from the opposite side: In mathematics, Euclidean modulo is the standard that everyone uses, presumably because it has the most regular properties. I also don't think the arguments in the paper are a matter of taste: the code examples really are more uniform if Euclidean division is used. That is anecdotal evidence but I don't think we have any better kind of evidence.

That said, I don't care very much whether it's flooring or Euclidean modulo. I just think the discussion should include more than personal preference and the bandwagon argument. (Also, I'm kind of afraid this proposal will die because of bikeshedding about this.)

scottmcm · 2017-10-19T00:52:49Z

> In mathematics, Euclidean modulo is the standard that everyone uses

Can you elaborate on usage of negative-divisor modulo in mathematics? I'm only familiar with strictly-positive ones, and for that flooring and euclidean are the same.

fstirlitz · 2017-10-19T20:29:35Z

Rambling a bit here:

Parroting mathematical usage can just as well be considered 'the bandwagon argument': π is the standard that everyone uses, mostly for the sake of tradition, even though τ has better mathematical properties.

Euclidean division is defined the way it is because it aligns well with Euclid's division theorem for integers (which requires the remainder to satisfy 0 ≤ r < |d|). This theorem in turn is formulated this way presumably because this is the most concise way to make the division result unique.

But this property cannot be maintained in general Euclidean domains (the 'proper' structure to study division-with-remainder), where a total order relation may fail to exist, nor may there be any way to make the result 'naturally' unique (Gaussian integers, polynomials). From this more general perspective, floor division is just as good a definition of division-with-remainder as the 'Euclidean' division. I see no properly mathematical reason to prefer one over the other.

And I don't recall any programming problem where a division-with-remainder by a negative divisor is meaningful, so from this point of view, there also seems to be no reason to prefer either. (Well, I think I can imagine a problem where the choice of definition may conceivably matter, but I still don't see how either is more advantageous).

That perspective, however, would suggest also adding a ceiling division operator (it's the answer to the question 'how many buckets of capacity n do you need to contain N items?') and a combined division-modulo operation (which answers the problem 'given a scalar offset into a two-dimensional array, what are its corresponding two-dimensional coordinates?'). Admittedly the former can be expressed with (N + n - 1) / n, but that behaves badly in the presence of overflow; while the latter can probably be taken care of by the optimiser, but you may not want to rely on that, and it's still neater to write let (y, x) = offset.divmod(stride) than to compute the quotient and remainder separately.
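Both operations can be sketched with the existing operators. All three function names below (`ceil_div_naive`, `ceil_div`, `div_mod`) are hypothetical and chosen for this example only; the sketch assumes unsigned operands and a non-zero capacity or stride.

```rust
/// The (N + n - 1) / n formula, with the addition checked: `None` on overflow.
/// Assumes `cap > 0`.
fn ceil_div_naive(items: u32, cap: u32) -> Option<u32> {
    items.checked_add(cap - 1).map(|sum| sum / cap)
}

/// Ceiling division that never overflows: round up only if there is a remainder.
fn ceil_div(items: u32, cap: u32) -> u32 {
    items / cap + u32::from(items % cap != 0)
}

/// Combined quotient and remainder, e.g. (row, column) for a flat offset.
fn div_mod(offset: usize, stride: usize) -> (usize, usize) {
    (offset / stride, offset % stride)
}

fn main() {
    println!("{:?}", ceil_div_naive(u32::MAX, 2)); // the naive formula overflows here
    println!("{}", ceil_div(u32::MAX, 2));
    let (y, x) = div_mod(14, 4);
    println!("row {y}, column {x}");
}
```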

tmccombs · 2017-10-21T16:50:13Z

+1 for a combined division-modulo operation.

varkor · 2017-10-23T20:44:51Z

I've updated the method names in the proposal with a more descriptive _euc suffix, which is clearer, whilst still remaining relatively concise.

Additionally, I've extended the RFC slightly to include Euclidean division and modulo methods for f32 and f64, as there was support in this thread for a complete implementation (and the floating-point exclusion did seem like a gap in the RFC before).
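For the floating-point case, a brief sketch of the difference from the `%` operator follows. It assumes the names `div_euclid` and `rem_euclid` from current stable Rust, and the inputs are chosen to be exactly representable so that rounding does not affect the output; `float_comparison` is a name made up for this example.

```rust
/// Returns (truncating remainder, Euclidean quotient, Euclidean remainder).
fn float_comparison(a: f64, b: f64) -> (f64, f64, f64) {
    (a % b, a.div_euclid(b), a.rem_euclid(b))
}

fn main() {
    // -7.5 = 2.0 * (-4.0) + 0.5, so the Euclidean remainder is non-negative.
    println!("{:?}", float_comparison(-7.5, 2.0));
}
```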

Regarding the discussion about flooring versus Euclidean division/modulo that arose: I think @fstirlitz summed it up well; in practice, there will very rarely be a difference between the results of flooring and Euclidean modulo/division — taking the negative modulo of a number is a very uncommon operation: it's just a matter of deciding what the behaviour in this particular edge case should be. My feeling was, and the general feeling on the internals thread seemed to be, that Euclidean modulo was the least surprising of the two (and mathematical consistency is a bonus).

I'd be hesitant to add a combined division-modulo method in this RFC — it seems like a separate, though tangentially related, issue — and it'd be better not to crowd this RFC with new methods.

varkor · 2018-02-13T17:16:24Z

Following on from @est31's point: I don't see the advantage to spelling the name euclidean out fully (when other methods like mul/div, etc. are shortened). Anyone who knows what Euclidean modulo is will know what euc stands for. On the other hand, anyone who doesn't know won't be enlightened by reading the full name — they'll likely just head for the docs, which are equally elucidating for any name. Considering that we expect mod_euc to be reasonably common (especially in particular domains like mathematics or game development), making it significantly longer seems disadvantageous.

I'd like to know what other people think about changing the return type before making any changes, though. It'd be really nice to finalise this RFC at last.

tspiteri · 2018-02-13T17:40:25Z

About the return type, it depends on whether this is meant to look like the other arithmetic operators, or like other inherent functions. For operators on primitives I would expect the output to be like self, whereas for a non-operator-like inherent function a different return type is not something new, see for example i32::count_ones.

tspiteri · 2018-02-13T18:30:14Z

And another thing, returning unsigned values suggests to me that this is only intended for indexing, not as a normal division operation where operating on signed integers should return a signed integer. But in that case it would make more sense to call it something like cyclic_index.

varkor · 2018-02-13T20:41:45Z

@tspiteri makes a good point. I think that considering div_euc will return signed values, mod_euc should have the same type (like an operator) — because otherwise the identity x = n * x.div_euc(n) + x.mod_euc(n) can't be expressed without casting, which just seems wrong, as the two functions are intrinsically linked.

It's unfortunate that the signature will mean casting is required in some cases, but the consistency seems more important here.
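The identity and the range of the remainder can be checked with a short sketch. It relies on the same-type signatures of `div_euclid` and `rem_euclid` on current stable Rust, so no casts are needed; the helper name `identity_holds` is hypothetical.

```rust
/// Checks x == n * q + r together with 0 <= r < |n| for the Euclidean pair.
fn identity_holds(x: i32, n: i32) -> bool {
    let (q, r) = (x.div_euclid(n), x.rem_euclid(n));
    x == n * q + r && 0 <= r && r < n.abs()
}

fn main() {
    // Exhaustively check a small range, skipping the zero divisor.
    for x in -50i32..=50 {
        for n in (-7i32..=7).filter(|&n| n != 0) {
            assert!(identity_holds(x, n));
        }
    }
    println!("identity holds on the sampled range");
}
```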

varkor · 2018-02-21T12:53:36Z

@sfackler: is it possible to follow up regarding the naming convention — taking into account the points others have made, is there strong motivation for requiring euclidean over euc? I feel most people would prefer shorter names (given that they're unambiguous) over longer ones, for common operations.

After that has been resolved, are there any other issues stalling this RFC from an FCP?

sfackler · 2018-02-27T18:02:05Z

@est31 something something foolish consistency something something :P

Yeah, we can figure out the naming as part of stabilization.

@rfcbot fcp merge

rfcbot · 2018-02-27T18:02:06Z

Team member @sfackler has proposed to merge this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

rfcbot · 2018-02-28T05:11:21Z

🔔 This is now entering its final comment period, as per the review above. 🔔

scottmcm · 2018-02-28T08:24:57Z

> because otherwise the identity x = n * x.div_euc(n) + x.mod_euc(n) can't be expressed without casting

But it's also adding new methods, so maybe that just means the set of added methods isn't sufficient yet. Conveniently, there's even naming and semantic precedent for it in f32::mul_add.

You can have x.div_mod(n) == (a, b) → a.mul_add(n, b) == x. And those two methods are exactly the ones you'd need for @fstirlitz's coordinate sectoring use case above.

(Their types being div_mod : (iN, uN) -> (iN, uN) and mul_add : (iN, uN, uN) -> iN)

varkor · 2018-02-28T11:57:10Z

> (Their types being div_mod : (iN, uN) -> (iN, uN) and mul_add : (iN, uN, uN) -> iN)

This signature (and similarly if mod_euc were modified in the same way) is hardly better than the current signature: you can't index with a uN unless N is size, so you'd still have to cast in every other case. (And one can't just return usizes, because this may be smaller than the input types.) On top of that, it comes at the cost of consistency.

On the other hand, the current signature is very amenable to modification in the future when implicit widening is introduced, which should eliminate the problems entirely.

(Additionally, for consistency purposes, such a signature for i/uN::mul_add is undesirable: the current signature is mul_add: (fN, fN, fN) -> fN, and having a different signature for integers seems prone to confusion.)
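To illustrate the casting that the same-type signature implies when the result is used as an index, here is a minimal sketch. The helper `cyclic_index` borrows its name from the earlier suggestion in this thread but is otherwise hypothetical; it assumes `len` is non-zero and fits in an `i64`.

```rust
/// Wraps a possibly negative position into `0..len`.
/// Assumes `len > 0` and that `len` fits in an `i64`.
fn cyclic_index(i: i64, len: usize) -> usize {
    // One cast in, one cast out: the remainder is non-negative and below `len`.
    i.rem_euclid(len as i64) as usize
}

fn main() {
    let ring = ['a', 'b', 'c', 'd', 'e'];
    // Index -1 wraps around to the last element.
    println!("{}", ring[cyclic_index(-1, ring.len())]);
}
```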

varkor · 2018-02-28T11:59:52Z

Regarding div_mod specifically: I think there is good reason to add such a method (or family of methods), but think it's outside the scope of this RFC; it's been long enough that I'd rather start with what is currently proposed, and introduce a combined method later (as though it's related, it's orthogonal to the motivations here). (Maybe such a method would even be minor enough to forgo the RFC process?)

bstrie · 2018-03-08T02:00:33Z

Hold up, why is rfcbot only listing four members of the libs team? The libs team has eight people on it.

sfackler · 2018-03-08T05:27:34Z

We subdivided the libs team a bit.

rfcbot · 2018-03-10T05:21:20Z

The final comment period is now complete.

alexcrichton · 2018-03-15T15:26:53Z

Alright! FCP has now elapsed and it looks like there were no major things brought up, so I'm going to merge!

Tracking issue

Centril · 2018-03-16T00:22:46Z

Updated Rendered link + added tracking issue to top post.

varkor added 4 commits October 9, 2017 20:00

Add an initial proposal for Euclidean modulo

e84b3b9

Improve the readability of the proposal

b287ad9

Consider floating-point implementations

1b3bf49

Fix link

ffb409d

scottmcm added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Oct 10, 2017

kennytm reviewed Oct 18, 2017

Fixed an operator precedence issue

7e3bc84

varkor added 2 commits October 23, 2017 21:10

Change the name of the Euclidean division/modulo operators

cb8bf0a

Rename `div_e` and `mod_e` to `div_euc` and `mod_euc`, respectively. This is more descriptive whilst still being relatively concise.

Add floating-point Euclidean division and modulo

1650689

rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substantial objections are raised. labels Feb 27, 2018

rfcbot removed the proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. label Feb 28, 2018

alexcrichton mentioned this pull request Mar 15, 2018

Tracking issue for RFC 2169: Euclidean Modulo rust-lang/rust#49048

Closed

2 tasks

RFC 2169: Euclidean Modulo

0559673

alexcrichton merged commit dfad2c9 into rust-lang:master Mar 15, 2018

varkor deleted the euclidean-modulo branch March 15, 2018 15:43

clamydo mentioned this pull request May 1, 2018

Document round-off error in .mod_euc()-method, see issue #50179 rust-lang/rust#50342

Merged

scottmcm mentioned this pull request May 17, 2018

Add compensated_add for floats rust-lang/rust#50774

Closed

fstirlitz mentioned this pull request Jul 28, 2018

Conversions: FromLossy and TryFromLossy traits #2484

Open

Centril added the A-arithmetic Arithmetic related proposals & ideas label Nov 23, 2018

est31 mentioned this pull request Jul 27, 2019

Stablize Euclidean Modulo (feature euclidean_division) rust-lang/rust#61884

Merged

fstirlitz mentioned this pull request Apr 23, 2022

Tracking Issue for int_roundings rust-lang/rust#88581

Open

laurmaedje mentioned this pull request Nov 15, 2023

Implement euclidean division and remainder typst/typst#2678

Merged
