Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in compound unit dimensionality/unit reduction after caching? #876

Closed
jthielen opened this issue Sep 11, 2019 · 5 comments · Fixed by #877
Closed

Bug in compound unit dimensionality/unit reduction after caching? #876

jthielen opened this issue Sep 11, 2019 · 5 comments · Fixed by #877

Comments

@jthielen
Copy link
Contributor

I came across a bizarre bug that popped up on the master branch recently:

import pint
ureg = pint.UnitRegistry()
print(ureg('joule').to_base_units())
print(ureg('joule * second ** 2 / kilogram / meter').to_base_units())

gives the following traceback

DimensionalityError                       Traceback (most recent call last)
<ipython-input-18-62589ad7e050> in <module>
----> 1 print((1 * ureg.joule * ureg.second ** 2 / ureg.kilogram / ureg.meter).to_base_units())

~/dev/pint/pint/quantity.py in to_base_units(self)
    478         _, other = self._REGISTRY._get_base_units(self._units)
    479 
--> 480         magnitude = self._convert_magnitude_not_inplace(other)
    481 
    482         return self.__class__(magnitude, other)

~/dev/pint/pint/quantity.py in _convert_magnitude_not_inplace(self, other, *contexts, **ctx_kwargs)
    406                 return self._REGISTRY.convert(self._magnitude, self._units, other)
    407 
--> 408         return self._REGISTRY.convert(self._magnitude, self._units, other)
    409 
    410     def _convert_magnitude(self, other, *contexts, **ctx_kwargs):

~/dev/pint/pint/registry.py in convert(self, value, src, dst, inplace)
    707             return value
    708 
--> 709         return self._convert(value, src, dst, inplace)
    710 
    711     def _convert(self, value, src, dst, inplace=False, check_dimensionality=True):

~/dev/pint/pint/registry.py in _convert(self, value, src, dst, inplace)
   1237                 value, src = src._magnitude, src._units
   1238 
-> 1239         return super(ContextRegistry, self)._convert(value, src, dst, inplace)
   1240 
   1241     def _get_compatible_units(self, input_units, group_or_system):

~/dev/pint/pint/registry.py in _convert(self, value, src, dst, inplace)
    990 
    991         if not (src_offset_unit or dst_offset_unit):
--> 992             return super(NonMultiplicativeRegistry, self)._convert(value, src, dst, inplace)
    993 
    994         src_dim = self._get_dimensionality(src)

~/dev/pint/pint/registry.py in _convert(self, value, src, dst, inplace, check_dimensionality)
    729             # then the conversion cannot be performed.
    730             if src_dim != dst_dim:
--> 731                 raise DimensionalityError(src, dst, src_dim, dst_dim)
    732 
    733         # Here src and dst have only multiplicative units left. Thus we can

DimensionalityError: Cannot convert from 'joule * second ** 2 / kilogram / meter' ([length]) to 'dimensionless' (dimensionless)

but it works fine after removing the joule line.

After digging through the commit history, it seems to lead back to a9a97ba, since any point in the history I checked including that commit fails with this error, and any without it returns the expected result of 1.0 meter for the second line. Glancing through the changes made there, it makes sense that it is tied to this seemingly cache-related issue. That being said, I have no clue what is particularly going wrong here or what exactly in a9a97ba could have caused this.

In case it helps with troubleshooting...

...the error no longer arises when any of the compound pieces of the last unit are removed.

...it still arises with the following:

import pint
ureg = pint.UnitRegistry()
print((1 * ureg.joule).to_base_units())
print((1 * ureg.joule * ureg.second ** 2 / ureg.kilogram / ureg.meter).to_base_units())

...but does not arise with the following:

import pint
ureg = pint.UnitRegistry()
print((1 * ureg.joule).to_base_units())
print((1 * ureg.joule * ureg.second ** 2 / ureg.kilogram / ureg.meter).to('m'))

Another oddity is that, despite what the last non-failing example may suggest, this first arose in MetPy through a function that does not use .to_base_units, but rather just .to. However, after spending a decent amount of time on it, I can't seem to come up with a short example that replicates the failure with .to.

ping @crusaderky, in case you may have any insight here with the seemingly problematic commit from #864.

@crusaderky
Copy link
Contributor

I'm looking into it

@crusaderky
Copy link
Contributor

Huh, what the...

After my change, UnitsContainer.__eq__ relies on the output of the hash function, whereas before it didn't.
The problem is that hash(-1) is the same as hash(-2), and my jaw just dropped:

>>> hash(1)
1
>>> hash(2)
2
>>> hash(-1)
-2
>>> hash(-2)
-2

@crusaderky
Copy link
Contributor

So the fact that hash(-1) returns -2 is an implementation detail of CPython, because they use -1 for error codes:
https://github.com/python/cpython/blob/e42b705188271da108de42b55d9344642170aa2b/Python/pyhash.c#L131-L143
(plus more cases in the same file)

However, this is apparently working as intended. From https://docs.python.org/3/reference/datamodel.html#object.__hash__:

> The only required property is that objects which compare equal have the same hash value

and it turns out that, on dict and set insertion, when two objects have the same hash, their __eq__ methods are called (I didn't know):

class C:
    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        print(f"{self}.__eq__({other})")
        return self.x == other.x

    def __hash__(self):
        print(f"{self}.__hash__")
        return hash(self.x)

    def __repr__(self):
        return f"C({self.x})"

set([C(-2), C(-1), C(0)])

output:

set([C(-2), C(-1), C(0)])
C(-2).__hash__()
C(-1).__hash__()
C(-2).__eq__(C(-1))
C(-2).__eq__(C(-1))
C(-2).__eq__(C(-1))
C(-2).__eq__(C(-1))
C(-2).__eq__(C(-1))
C(-2).__eq__(C(-1))
C(-2).__eq__(C(-1))
C(-2).__eq__(C(-1))
C(-2).__eq__(C(-1))
C(-2).__eq__(C(-1))
C(-2).__eq__(C(-1))
C(-2).__eq__(C(-1))
C(-2).__eq__(C(-1))
C(0).__hash__()
Out[3]: {C(-1), C(-2), C(0)}

@crusaderky
Copy link
Contributor

Related: https://bugs.python.org/issue38105

@hgrecco
Copy link
Owner

hgrecco commented Sep 11, 2019

This is very weird bug. Thanks for being to fast to report it, track it and fix it.

@bors bors bot closed this as completed in d804d93 Sep 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants