Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Python hashing / identity traits for Formula, Variable, Expression more strict #8491

Closed
EricCousineau-TRI opened this issue Apr 1, 2018 · 2 comments
Assignees

Comments

@EricCousineau-TRI
Copy link
Contributor

EricCousineau-TRI commented Apr 1, 2018

Relates #8315

At present, we overload __hash__ and __eq__ - returning Formula, a non-bool value - for Variable and Expression. With the implicit behavior of __nonzero__ for Formula, a non-null Formula will always return true, meaning that some of Python's identity mechanisms will produce unexpected results.

The main issue is that comparison in some unittests may not be fully checked, as unittest's assertEqual for two objects may be comparing by __eq__ and not __hash__, since it will check a == b, and then return (a == b).__nonzero__(), which is not the check that the author would intend.

We could throw on __nonzero__ for Formula; however, that will prevent Variable, Expression, etc. from being used in containers like dict, because, in CPython (Python 2.7.12) it does a comparison with __eq__ after finding the item according to its hash (which I guess is to avoid hash collisions):
https://github.com/python/cpython/blob/1fae982b9b6fff5a987a69856b91339e5d023838/Objects/dictobject.c#L342

Potential solutions:

  1. Disable Formula.__nonzero__, and require users use a wrapping dictionary that has keys which define __eq__ to compare based on the object's hash.
  2. Adopt sympys style: ensure that == and != return boolean value comparison, and have other operators (< <= > >=) return Formulas
  3. Do nothing and try to ensure that we hand curate any existing symbolic tests

(I had tried to see if defining __cmp__ could buy us something, but then re-read the docs which states that rich comparison (__eq__, __lt__, etc.) will take precedence over __cmp__ if they are defined.)

My vote is for (1), as it is relatively simple, and we can massage the interface to pybind11 to make it relatively seamless. Aside from this caveat in dict, I'm not sure of any other situations where this will affect us.
(2) doesn't buy us much of anything w.r.t. #8315, as we still need other logical operators to play well in this framework.
(3) would not catch future errors.

@soonho-tri Can I ask what your opinion is on this, and if there are other solutions that you can think of?

@soonho-tri
Copy link
Member

I think (1) is the best among the options. We also want to remove the explicit bool conversion of symbolic::Formula from the C++ side which is analogous to this proposal on the Python side.

@EricCousineau-TRI
Copy link
Contributor Author

Hup, forgot to link issues: Relates #8417

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants