Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

numerical equality with = and == #859

Closed
ikappaki opened this issue Feb 3, 2024 · 3 comments · Fixed by #868
Closed

numerical equality with = and == #859

ikappaki opened this issue Feb 3, 2024 · 3 comments · Fixed by #868
Labels
lang Issue pertaining to Basilisp language modules question Further information is requested

Comments

@ikappaki
Copy link
Contributor

ikappaki commented Feb 3, 2024

Clojure has this concept with numerical equality that = returns true iff its arguments are numerically the same and fall under the same category:

https://clojure.org/guides/equality

Clojure’s = is true when called with two immutable scalar values, if:
    Both arguments are nil, true, false, the same character, or the same string (i.e. the same sequence of characters).
    Both arguments are symbols, or both keywords, with equal namespaces and names.
    Both arguments are numbers in the same 'category', and numerically the same, where category is one of:
        integer or ratio
        floating point (float or double)
        [BigDecimal](https://docs.oracle.com/javase/8/docs/api/java/math/BigDecimal.html).

That means that if we compare an integer or a ratio with the floating point number in Clojure that have the same numericall value, it will return false:

user> (= 1 1/1)
true
user> (= 1 1.0)
false

while in Basilisp

user> (= 1 1/1)
true
user> (= 1 1.0)
true

It appears to me this is a deliberate decision in the Clojure design, rather than a Java runtime decision, for example

jshell> 1 == 1.0
$5 ==> true

Clojure also offers the less restricive == core fn, which soley compares numbers for numerical equality without considering the category (same as basilisp's =)

user> (== 1 1.0)
true

Is the Clojure behavior something we want to support in Basilisp or shall we define == the same as =? I am inclining towards the first, but not sure what the implementation tradeoffs might be.

Thanks

@chrisrink10 chrisrink10 added issue-type:bug Something isn't working lang Issue pertaining to Basilisp language modules and removed issue-type:bug Something isn't working labels Feb 5, 2024
@chrisrink10
Copy link
Member

There have been multiple previous PRs and issues dealing with the issue of equality (see #556 and #585 for more details) and I've always come up with more questions than answers.

I believe there are some areas not mentioned in your issue that are of more direct concern (namely inconsistencies in hashes for sequential collections):

basilisp.user=> (hash ["a" 5 :c])
-4585753059381760820

basilisp.user=> (hash (seq ["a" 5 :c]))
-2291626405611039373

basilisp.user=> (hash '("a" 5 :c))
-2291626405611039373

However, I'm not personally convinced there's any need to match Clojure's behavior with respect to numeric equality specifically. As you pointed out, it was certainly done deliberately rather than unintentionally. That being said, you'll note even ClojureScript does not define equality on numbers in this way.

I think without a clear bug (e.g. some behavior of some other functionality does not work as intended because of the decision compare all numerics for equality across "kinds") I would not like to change this behavior in Basilisp.

@chrisrink10 chrisrink10 added the question Further information is requested label Feb 5, 2024
@ikappaki
Copy link
Contributor Author

ikappaki commented Feb 7, 2024

I think without a clear bug (e.g. some behavior of some other functionality does not work as intended because of the decision compare all numerics for equality across "kinds") I would not like to change this behavior in Basilisp.

There is a test case that I've encountered in our discourse of updating basilisp.core\rem fn, although now that I think of it that was in core_test.py (using python's assert) rather than basilisp's (is (=...), but the point is till valid I believe

The first rem update was to use python's math.remainder operator, which turned out is suitable of doing IEEE754 floating point operations and is always returning a decimal floating types, such as 1.0, which is undesirable for rem. The test asserts the result with integer values, e.g. assert 1 == core.rem(5,2). This in python won't fail, even though we would have wanted it to, becauserem should return an integer not a float, and we are comparing integers with floats here. I remember it struck me a the time while investigating the arithmetic division compatibility with clojure, that this was not caught up.

The second contrived case is the comparison of integer with floating points that are "close" to 1, e.g.

basilisp.user=> (= 1 1.00000000000000009)
true

Mathematically this is not correct, and it is only due to the technical limitations of IEEE floats we consider this by convention to be acceptable. Clojure gives users a choice in this instance by design (it seems).

My only concern going with this approach is how performant the implemention be if we were to redefine = to have the above semantics rather than the using built in = operator.

If we are leaving = as is as per suggestion, I am going to document this difference with Clojure. Also, what should we do with ==, shall I define as an inline/alias of = or leave it undefined as it currently is?

Thanks

@chrisrink10
Copy link
Member

In brief testing this was the cutoff point where equality was no longer the same between the integer 1 and a float with a small decimal value.

>>> 1 == 1.0000000000000001
True
>>> 1 == 1.000000000000001
False

While I'm sure there are cases where this is problematic, the difference seems more academic than practical in the vast majority of cases. I'm sure that's why Rich chose to do it for Clojure. But since it was not adopted for CLJS, there is obviously some tradeoff worth making here.

As you also allude to, the performance impacts to all = comparisons just to support this specific notion of equality for certain numerics will be significant because we are not able to do multiple dispatch based on type.

I'm not saying this is my absolute final decision and that I would never amend it, but at this moment in time it does not feel like a project worth doing.

If we are leaving = as is as per suggestion, I am going to document this difference with Clojure. Also, what should we do with ==, shall I define as an inline/alias of = or leave it undefined as it currently is?

As to this point, I am ok with simply defining == as an alias of = for the time being.

chrisrink10 pushed a commit that referenced this issue Feb 11, 2024
Hi,

could you please consider patch to add `==` as an alias for `=`. Fixes
#859.

I've also updated the doc to describe the difference of int vs float
comparison in basilisp vs Clojure.

Thanks

---------

Co-authored-by: ikappaki <ikappaki@users.noreply.github.com>
@chrisrink10 chrisrink10 added this to the Release v0.1.0 milestone Feb 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lang Issue pertaining to Basilisp language modules question Further information is requested
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants