Make Python hashing / identity traits for `Formula`, `Variable`, `Expression` more strict #8491

EricCousineau-TRI · 2018-04-01T08:52:51Z

Relates #8315

At present, we overload __hash__ and __eq__ for Variable and Expression, with __eq__ returning a Formula, which is a non-bool value. Under the implicit (default) behavior of __nonzero__ for Formula, any non-null Formula evaluates as true, meaning that some of Python's identity mechanisms will produce unexpected results.

The main issue is that comparisons in some unit tests may not actually be checked: unittest's assertEqual compares two objects with __eq__, not __hash__. It evaluates a == b and then effectively tests (a == b).__nonzero__(), which is always true here and is not the check that the author would intend.
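To make the failure mode concrete, here is a minimal sketch. The `Formula` and `Variable` classes below are hypothetical stand-ins written for illustration, not the pydrake bindings; they only assume that `__eq__` returns a `Formula` and that every `Formula` is truthy. `__bool__` is the Python 3 spelling of `__nonzero__`, so both are defined.

```python
import unittest


class Formula(object):
    """Hypothetical stand-in for a symbolic formula (not the pydrake class)."""

    def __init__(self, text):
        self.text = text

    def __bool__(self):
        # Mimics the implicit behavior: every Formula object is truthy.
        return True

    __nonzero__ = __bool__  # Python 2 spelling of the same hook.


class Variable(object):
    """Hypothetical stand-in for a symbolic variable."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        # Returns a Formula describing the comparison, not a bool.
        return Formula("{} == {}".format(self.name, other.name))


case = unittest.TestCase()
# x and y are different variables, yet this does not raise AssertionError:
# assertEqual only truth-tests the Formula returned by __eq__.
case.assertEqual(Variable("x"), Variable("y"))
```

Under these assumptions the assertion passes for any pair of variables, so a test written this way checks nothing.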

We could throw on __nonzero__ for Formula; however, that would prevent Variable, Expression, etc. from being used as keys in containers like dict, because CPython (Python 2.7.12) compares with __eq__ after finding a candidate item by its hash (which I guess is to handle hash collisions):
https://github.com/python/cpython/blob/1fae982b9b6fff5a987a69856b91339e5d023838/Objects/dictobject.c#L342
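The dict caveat can be reproduced with a small sketch. Again, `Formula` and `Variable` here are hypothetical stand-ins rather than the pydrake classes, and the choice of `ValueError` is an assumption made only for this example.

```python
class Formula(object):
    """Hypothetical stand-in whose truth test is disabled."""

    def __bool__(self):
        raise ValueError("Formula has no truth value")

    __nonzero__ = __bool__  # Python 2 spelling of the same hook.


class Variable(object):
    """Hypothetical stand-in for a symbolic variable."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        return Formula()


x = Variable("x")
table = {x: 1}

# Looking up the very same object works: CPython checks identity first
# and never calls __eq__.
same_object_lookup = table[x]

# A distinct object with an equal hash forces the dict to call __eq__ and
# truth-test the result, which now raises.
try:
    table[Variable("x")]
    lookup_error = None
except ValueError as error:
    lookup_error = error
```

So a throwing `__nonzero__` is only a problem when the lookup key is a different object from the stored key but hashes the same, which is the common case for symbolic variables created independently.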

Potential solutions:

1. Disable Formula.__nonzero__, and require that users use a wrapping dictionary whose keys define __eq__ to compare based on the object's hash.
2. Adopt SymPy's style: ensure that == and != return boolean values, and have the other operators (<, <=, >, >=) return Formulas.
3. Do nothing, and try to ensure that we hand-curate any existing symbolic tests.

(I had tried to see if defining __cmp__ could buy us something, but then re-read the docs, which state that the rich comparison methods (__eq__, __lt__, etc.) take precedence over __cmp__ if they are defined.)

My vote is for (1), as it is relatively simple, and we can massage the interface to pybind11 to make it relatively seamless. Aside from this caveat in dict, I'm not sure of any other situations where this will affect us.
(2) doesn't buy us much of anything w.r.t. #8315, as we still need other logical operators to play well in this framework.
(3) would not catch future errors.
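As a rough sketch of what option (1) could look like, assuming a throwing truth test on `Formula`: the names `HashKey` and `HashDict` are hypothetical, invented for this example, and are not taken from pydrake or pybind11.

```python
class Formula(object):
    """Hypothetical stand-in whose truth test is disabled."""

    def __bool__(self):
        raise ValueError("Formula has no truth value")

    __nonzero__ = __bool__  # Python 2 spelling of the same hook.


class Variable(object):
    """Hypothetical stand-in for a symbolic variable."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        return Formula()


class HashKey(object):
    """Hypothetical key wrapper: equality is decided by comparing hashes."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        # Returns a real bool, so dict never truth-tests a Formula.
        return isinstance(other, HashKey) and hash(self) == hash(other)

    def __ne__(self, other):
        return not self.__eq__(other)


class HashDict(object):
    """Hypothetical minimal mapping that wraps every key in HashKey."""

    def __init__(self):
        self._items = {}

    def __setitem__(self, key, value):
        self._items[HashKey(key)] = value

    def __getitem__(self, key):
        return self._items[HashKey(key)]

    def __contains__(self, key):
        return HashKey(key) in self._items

    def __len__(self):
        return len(self._items)


table = HashDict()
table[Variable("x")] = 1
value = table[Variable("x")]      # Found via the hash; no ValueError.
has_y = Variable("y") in table    # Different hash, so not found.
```

One trade-off of comparing by hash alone is that two distinct objects whose hashes collide would be treated as the same key, so this approach relies on the wrapped type having a well-distributed hash.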

@soonho-tri Can I ask what your opinion is on this, and if there are other solutions that you can think of?

soonho-tri · 2018-04-02T13:04:48Z

I think (1) is the best among the options. We also want to remove the explicit bool conversion of symbolic::Formula from the C++ side which is analogous to this proposal on the Python side.

EricCousineau-TRI · 2018-04-08T16:23:34Z

Hup, forgot to link issues: Relates #8417

EricCousineau-TRI added the configuration: python label Apr 1, 2018

EricCousineau-TRI self-assigned this Apr 1, 2018

EricCousineau-TRI mentioned this issue Apr 3, 2018

pydrake symbolic: Disable Formula.__nonzero__ #8500

Merged

soonho-tri closed this as completed in #8500 Apr 11, 2018
