Numba's njit doesn't support non-default behaviors #2762

Closed
agoose77 opened this issue Oct 19, 2023 · 4 comments · Fixed by #2770

@agoose77
Collaborator

agoose77 commented Oct 19, 2023

Version of Awkward Array

main

Description and code to reproduce

import numba
import awkward as ak

SOME_ATTRS = {"FOO": "BAR"}
builder = ak.ArrayBuilder(behavior=SOME_ATTRS)


@numba.njit
def func(array):
    return array


assert func(builder).behavior is SOME_ATTRS
@agoose77 agoose77 added the bug The problem described is something that must be fixed label Oct 19, 2023
@ianna ianna self-assigned this Oct 20, 2023
@jpivarski
Member

Fairly recently, the internal Numba type string for an array with behaviors was changed from including a string of the entire behavior dict (with repr) to just a hash (much faster).

On the way out, is it reconstituting the array by trying to turn that string back into a Python object? Because the hash is not reversible. Could that be related to the issue?

(I think round-tripping behaviors through Numba used to work, and I'm surprised that there aren't tests for it.)

@agoose77
Collaborator Author

I haven't looked at the code in detail (I opened the issue to keep an eye on it!), but I think the issue is that dicts aren't hashable, so the hashing fails.

We should definitely add a test for this as part of the fix.

@ianna
Collaborator

ianna commented Oct 24, 2023

> Fairly recently, the internal Numba type string for an array with behaviors was changed from including a string of the entire behavior dict (with repr) to just a hash (much faster).
>
> On the way out, is it reconstituting the array by trying to turn that string back into a Python object? Because the hash is not reversible. Could that be related to the issue?
>
> (I think round-tripping behaviors through Numba used to work, and I'm surprised that there aren't tests for it.)

Yes, that's what I observe. A Python 'dict' is unhashable. The TypeError occurs while boxing the ArrayBuilder (on the way out of the compiled function), specifically in:

  File "/Users/yana/Projects/PR2763/awkward/src/awkward/_connect/numba/builder.py", line 78, in box_ArrayBuilder
    c.pyapi.serialize_object(arraybuildertype.behavior)
  File "/Users/yana/opt/anaconda3/envs/numba_py311/lib/python3.11/site-packages/numba/core/pythonapi.py", line 1410, in serialize_object
    gv = self.module.__serialized[obj]
         ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^
TypeError: unhashable type: 'dict'
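
The same error can be reproduced without Numba or Awkward, since indexing a mapping has to hash the key first. This is only a minimal sketch: `cache` is a hypothetical stand-in for the mapping that `serialize_object` indexes in the traceback, not Numba's actual data structure.

```python
# `cache` is a hypothetical stand-in for the mapping indexed in the
# traceback above; it is not Numba's real internal structure.
cache = {}
behavior = {"FOO": "BAR"}

try:
    cache[behavior]  # the lookup hashes the key before searching
    error_message = None
except TypeError as err:
    error_message = str(err)

# The message mentions that the 'dict' type is unhashable.
print(error_message)
```

The lookup fails before the mapping is even searched, so it does not matter whether the behavior was stored earlier.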

I think we need to consider using the Numba Dict type here.

@jpivarski
Copy link
Member

Numba dicts are for lowering; the problem here is hashing.

>>> import numba as nb
>>> d = nb.typed.Dict()
>>> d["one"] = 1
>>> d["two"] = 2
>>> d["three"] = 3
>>> d
DictType[unicode_type,int64]<iv=None>({one: 1, two: 2, three: 3})
>>> hash(d)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'Dict'

Assuming that all of the values are hashable, we could convert them to and from tuples, since the key order in modern dicts is stable (insertion order is an implementation detail of CPython 3.6 and guaranteed by the language from Python 3.7).

>>> d = {"one": 1, "two": 2, "three": 3}
>>> tuple_d = tuple(d.items())
>>> hash(tuple_d)
-1150126290131628349
>>> dict(tuple_d)
{'one': 1, 'two': 2, 'three': 3}
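
The tuple round trip above could be wrapped in a pair of helpers. This is a sketch only, assuming every value in the behavior dict is hashable; `freeze_behavior` and `thaw_behavior` are hypothetical names, not functions in Awkward Array or Numba.

```python
def freeze_behavior(behavior):
    """Turn a dict into a hashable tuple of (key, value) pairs.

    Hypothetical helper; assumes all values are themselves hashable.
    """
    if behavior is None:
        return None
    frozen = tuple(behavior.items())
    hash(frozen)  # raises TypeError right away if any value is unhashable
    return frozen


def thaw_behavior(frozen):
    """Rebuild a dict from the tuple produced by freeze_behavior."""
    if frozen is None:
        return None
    return dict(frozen)


behavior = {"one": 1, "two": 2, "three": 3}
frozen = freeze_behavior(behavior)
restored = thaw_behavior(frozen)

print(restored == behavior)  # True: same keys, values, and order
print(restored is behavior)  # False: a new dict object is built
```

Note that the round trip yields an equal dict, not the same object, so an identity check like the `is` assertion in the original report would still fail with this approach alone.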
