Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integer division behavior with negative result is not unified #746

Open
generalmimon opened this issue May 6, 2020 · 8 comments
Open

Integer division behavior with negative result is not unified #746

generalmimon opened this issue May 6, 2020 · 8 comments

Comments

@generalmimon
Copy link
Member

The integer division operation currently doesn't behave the same way in all languages, if the quotient (division result) is a negative number. Here's a summary table (-20 / 12 = -1.666...; 69 / -20 = -3.45):

Language -20 / 12 69 / -20
C++ -1 -3
C# -1 -3
Go -1 -3
Java -1 -3
JavaScript -2 -4
Lua -2 -4
Nim -1 -3
Perl -1 -3
PHP -1 -3
Python -2 -4
Ruby -2 -4
Rust -1 -3

The User Guide says this:

If both operands are integer, result of arithmetic operation is integer, otherwise it is floating point number. For example, that means that 7 / 2 is 3, and 7 / 2.0 is 3.5.

But that says nothing about how the division on integers should behave on negative results. There are two options: true integer division always rounding to the closest lower number (math.floor(a / b)) and a float division, whose result will be converted (truncated) to integer (which means trucating towards 0). The target languages now use both these options, so it would be nice to unify it to ensure same results for different KS targets.

But I'm actually 100% not sure which approach should be used. Probably the one with integer conversion makes more sense, because it expresses the intention better. The KS expression language doesn't want to yield a float when the operands are integers, so it's natural to calculate the result as float and then cast it to integer. That would imply that the behavior of JavaScript, Lua, Python and Ruby is incorrect.

Is that right? Strictly speaking, it's a BC break with 0.8, but as @GreyCat said here that we've already done that, so I guess it's OK?

@dgelessus
Copy link
Contributor

dgelessus commented May 7, 2020

Related: The behavior of x % y (modulo) with a negative x also differs between languages. There are two common behaviors for this (that I know of), and each of them makes the most sense in combination with one of the two integer division behaviors:

  1. The modulo is calculated as if x was positive, then the result is negated. This makes more sense if integer division rounds to zero. This is what (for example) Java does.
  2. The modulo "wraps around" or "counts backwards" so that it is always positive (can't think of a good formal description for this behavior). This makes more sense if integer division rounds down. This is what (for example) Python does.

No matter which integer division behavior we choose, we should also at the same time define the behavior of x % y for negative x, so that division and modulo behave consistently.

To visualize this, here's a table with the integers in the range -6 ≤ i < 6 divided by and modulo 3, with each of the two division and modulo behaviors. Note that trunc division and modulo behavior 1 match nicely, as do floor division and modulo behavior 2, but the other two combinations do not.

Table to compare division and modulo behaviors
i trunc(i / 3) i % 3
behavior 1
floor(i / 3) i % 3
behavior 2
-6 -2 0 -2 0
-5 -1 -2 -2 1
-4 -1 -1 -2 2
-3 -1 0 -1 0
-2 0 -2 -1 1
-1 0 -1 -1 2
0 0 0 0 0
1 0 1 0 1
2 0 2 0 2
3 1 0 1 0
4 1 1 1 1
5 1 2 1 2

@dgelessus
Copy link
Contributor

Personally I prefer the floor division and "wraparound" modulo behavior. When you divide a range of integers by a constant divisor, the results fall nicely into groups that are all the same size (unlike with truncating division, where the group with result 0 is larger than the other groups). I also think it makes more sense to always have modulo return a number in the range 0 ≤ res < y, rather than having it change its sign depending on the input.

Though maybe I'm biased here, because I'm primarily a Python user and this is Python's behavior for integer division and modulo. :)

@GreyCat
Copy link
Member

GreyCat commented May 9, 2020

@dgelessus Actually, % operation in KS already gets special treatment and is always guaranteed to be modulo (and return non-negative result). E.g. -5 % 8 is 3. See expr_mod.ksy.

@generalmimon Totally agree that / seems to need the same treatment as we currently do to % to ensure uniform calculations everywhere, and totally agree that it makes sense to stick to majority behavior, i.e. implementing / wrappers for JavaScript, Lua, Python and Ruby.

We can probably start with expr_div.ksy, akin to expr_mod.ksy, to make the problem obvious.

@dgelessus
Copy link
Contributor

totally agree that it makes sense to stick to majority behavior, i.e. implementing / wrappers for JavaScript, Lua, Python and Ruby.

I don't think that's a good idea if we've already decided that % will always return a non-negative result. As explained in #746 (comment), if you want integer / and % to behave consistently, you have to use either floor division combined with never-negative modulo, or truncating division combined with possibly-negative modulo. Combining truncating division with never-negative modulo doesn't work well (see the table in my above comment).

@generalmimon
Copy link
Member Author

generalmimon commented May 9, 2020

@generalmimon Totally agree that / seems to need the same treatment as we currently do to % to ensure uniform calculations everywhere, and totally agree that it makes sense to stick to majority behavior, i.e. implementing / wrappers for JavaScript, Lua, Python and Ruby.

Looking at the table created by @dgelessus, maybe I'm more inclined to the floor division (always rounding towards negative infinity). The reason is that it plays nicely with the modulo % operator we have, as you can see in the table. @dgelessus actually summed it up nicely:

Note that trunc division and modulo behavior 1 match nicely, as do floor division and modulo behavior 2, but the other two combinations do not.

We're using the modulo behavior 2, so it doesn't really make sense to have truncated integer division (trunc(i / 3)), it would be unnatural and hard to use along with the modulo operation.

@GreyCat Have you some specific reason why prefer the truncated division, other than that it is used by the majority of languages? Again, we don't have to reflect the majority behavior if it doesn't suit for us (this sentence is becoming my mantra 😄). The languages that truncate the integer division have their own reasons for doing so, and they mostly use the modulo behavior 1, so it makes sense for them, but it doesn't for us.

@dgelessus
Copy link
Contributor

By the way @generalmimon I'm curious where you got the division behavior for Lua, JavaScript, Perl and PHP from. In all of these languages, dividing two integers performs a float division instead, so you cannot perform an integer division without explicitly truncating or rounding the result. If you go by the behavior of their modulo operators, JavaScript and PHP say -20 % 12 == -8 (which would imply truncating integer division), and Lua and Perl say -20 % 12 == 4 (which would imply floor integer division).

@generalmimon
Copy link
Member Author

generalmimon commented May 9, 2020

@dgelessus I'm talking about the actual behavior of the KSC-generated code (i.e. the result that users get when they e.g. create a value instance with value: -5 / 3), not how it's implemented in each language. For example, JavaScript indeed doesn't have integer division, so KSC yields Math.floor(-5 / 3). That's why JavaScript falls into the floor division category.

@GreyCat
Copy link
Member

GreyCat commented May 10, 2020

@dgelessus @generalmimon Thanks for explanation! Totally agree that it's better to choose most consistent behavior.

"% returning non-negative" was chosen initially for purposes of various alignment / padding calculations. Having things like size: (some_expression_calculating_padding) % 8 that are guaranteed to be always non-negative seems to simplify things a lot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants