Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

awkward_form() context arg is not optional #614

Closed
btovar opened this issue Jun 21, 2022 · 5 comments · Fixed by #618
Closed

awkward_form() context arg is not optional #614

btovar opened this issue Jun 21, 2022 · 5 comments · Fixed by #618
Labels
bug The problem described is something that must be fixed

Comments

@btovar
Copy link
Contributor

btovar commented Jun 21, 2022

pr #611 consolidated most arguments to awkward_form() into a single dictionary context. From the wording of the documentation:

context (dict): Context for the Form-generation; defaults are
                ``{"index_format": "i64", "header": False, "tobject_header": True, "breadcrumbs": ()}``.
                See below for context argument descriptions.

this seems to imply that context is optional, and that no other argument is required. If this is not the case, are callers of the function expected to provide {}, None, or as above {"index_format": ...} ?

From the current callers perspective that use the defaults, it would be easiest if context is optional, as no changes downstream would be needed.

(This issue became apparent with release 4.2.4.)

@btovar btovar added the bug (unverified) The problem described would be a bug, but needs to be triaged label Jun 21, 2022
@jpivarski
Copy link
Member

Those arguments are required; they were consolidated into a context because more will be added and we want to be able to do that without touching 400 awkward_form calls throughout the codebase.

Are you using this function? I had thought of it as an internal step in the process of executing array/arrays/iterate. Is there a particular class from which you call it, so that that class can be made to insert the defaults?

If any of the awkward_form methods are being called, we could make context=None by default everywhere and have that be a signal to inject centrally maintained defaults.


To directly answer your question, the required argument is the explicit dict given in the docstring, though that's not a friendly user interface (wasn't intended to be...).

@jpivarski jpivarski added bug The problem described is something that must be fixed and removed bug (unverified) The problem described would be a bug, but needs to be triaged labels Jun 21, 2022
@btovar
Copy link
Contributor Author

btovar commented Jun 21, 2022

Sounds good. We don't use that method directly, but it is used by coffea, e.g., here:

https://github.com/andrewhennessee/coffea/blob/44396eab7e2ed580fae982768edf3fb8ea27d3f1/coffea/nanoevents/mapping/uproot.py#L94

I'll file an issue/pr with them. Thanks!

@jpivarski
Copy link
Member

PR #618 reverted the Interpretation.awkward_form argument list, which was merged into main-v4 (and main), and main-v4 was used to release Uproot 4.3.0, the first in the Uproot 4 bug-fix line. From our end, it seems to have worked; let me know if updating to the latest Uproot doesn't fix this problem.

@nsmith-
Copy link
Member

nsmith- commented Jun 22, 2022

We use some other private methods near this one:
https://github.com/CoffeaTeam/coffea/blob/master/coffea/nanoevents/mapping/uproot.py#L100-L105
Perhaps this should change as well?

@jpivarski
Copy link
Member

I don't know of a reason to change those. The awkward_form methods (specifically) are

  • numerous: definitions and references appeared in 400 places throughout the codebase
  • have a lot of arguments, and whenever we want to do a new thing with them, another argument had to be added (that's why we got index_format, header, tobject_header distinct from header, etc.). Every time an argument is added, it means making changes at all 400 locations, so I did it one last time, turning all of the arguments into a single context dict. Now when we want to add functionality, we add keys to the dict. This is already how the read methods work.

The "fixing" and "removing Uproot parameters" functions don't have that problem. In fact, these two steps in the process are probably going to be removed: they take data generated by a slow Python crawl, call ak.from_iter on them, and then fix the data types for things like Python intnp.int64np.int32 if the TBranch was 32-bit. That won't be necessary when the Forth code generates Awkward Arrays with the right types to begin with.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug The problem described is something that must be fixed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants