optionally pass typenames to Awkward Array #1369

Open
ianna opened this issue Jan 31, 2025 · 2 comments
Labels
feature New feature or request

Comments

@ianna
Collaborator

ianna commented Jan 31, 2025

events = uproot.dask(
    to_open,
    full_paths=True,
    open_files=False,
    ak_add_doc=True,
    filter_branch=_remove_not_interpretable,
    steps_per_file=steps_per_file,
    known_base_form=known_base_form,
    decompression_executor=decompression_executor,
    interpretation_executor=interpretation_executor,
    **uproot_options,
    form_mapping=self._schema,
)

This call ends up with a list of branches. We would also like to get the C++ types of those branches.

@jpivarski's proposal:

The typenames could, optionally, be passed through to the Awkward Array's __doc__.

Having the __doc__ be exactly equal to the TTree title is important for NanoAOD users, since the NanoAOD titles are one-line tool-tips that are valuable to have as mouse-overs in IDEs and such. The C++ typename could be added as a different parameter from __doc__, but only __doc__ shows up in tool-tips and Python's help function. The __doc__ could be formatted as f"{ttree.title} ({tbranch.typename})" or similar, but we'd want that to be opt-in.
Perhaps the ak_add_doc argument could be expanded from a boolean to an enum or something, to handle the different cases of what one might want.
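To make the two placements discussed above concrete, here is a minimal sketch of how the per-branch parameters might be built. The function name `branch_parameters`, the flag `typename_in_doc`, and the `"typename"` parameter key are all hypothetical, chosen only for illustration; none of them are part of Uproot or Awkward Array.

```python
from typing import Dict


def branch_parameters(
    title: str, typename: str, typename_in_doc: bool = False
) -> Dict[str, str]:
    """Build a parameters dict for one branch (illustrative only).

    typename_in_doc=False: __doc__ stays exactly equal to the title, and the
        C++ typename travels in a separate (hypothetical) "typename" key.
    typename_in_doc=True: the typename is appended to __doc__, opt-in.
    """
    if typename_in_doc:
        return {"__doc__": f"{title} ({typename})"}
    return {"__doc__": title, "typename": typename}


# Default: the tool-tip text is untouched.
print(branch_parameters("pt of the muon", "float"))
# Opt-in: the typename becomes visible in the tool-tip.
print(branch_parameters("pt of the muon", "float", typename_in_doc=True))
```

With the default, IDE mouse-overs would still show the bare title; only the opt-in path changes what `help` and tool-tips display.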

@prayagyadav

Hi @ianna, thanks for posting this. I have opened PR #1375 for this feature request. I have not tried the "enum" approach that @jpivarski suggested (I was not sure how to do that); instead, I simply added ak_add_typename as an optional bool argument, analogous to ak_add_doc. Please let me know whether this is a suitable addition. Thanks.

@jpivarski
Member

All I meant by "enums" was to define some strings to represent different choices. They don't have to use Python's enum module, though that would be a more formal way to do it. (Python enums aren't as fundamentally distinct from basic strings as C++'s enums are, in terms of type-safe compile-time checks and performance. In Python, you can get the type-safety of options for A, B, and C by declaring the type annotation as "A" | "B" | "C", and type-checking is optional, anyway.)
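As a rough illustration of "strings as choices" versus the enum module: in Python's typing, a fixed set of string options is usually spelled with `typing.Literal`. The option names below ("title", "typename", "title_and_typename") and the helper `check_mode` are made up for this sketch and are not an agreed-upon interface.

```python
from enum import Enum
from typing import Literal, get_args

# Plain strings, constrained only by an (optional) static type check.
DocMode = Literal["title", "typename", "title_and_typename"]


# The more formal alternative: an Enum whose members are also strings.
class DocModeEnum(str, Enum):
    TITLE = "title"
    TYPENAME = "typename"
    TITLE_AND_TYPENAME = "title_and_typename"


def check_mode(mode: str) -> str:
    """Runtime validation, since static type-checking is optional."""
    if mode not in get_args(DocMode):
        raise ValueError(f"unrecognized mode: {mode!r}")
    return mode


print(check_mode("title"))
# Because of the str mixin, enum members compare equal to plain strings.
print(DocModeEnum.TITLE == "title")
```

Either spelling lets callers pass a bare string; the Enum only adds a named namespace for the choices.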

The one hard constraint is that the old value of ak_add_doc = False should continue to mean "no __doc__" and the old value of ak_add_doc = True should continue to mean "set __doc__ to the TBranch title", even as new cases are added with ak_add_doc = "..." (some string). We want to make sure that code using the old interface continues to work.
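The backward-compatibility constraint could be sketched as a small resolver like the one below. This is only an assumption about how such logic might look: the function `resolve_doc` and the string values "typename" and "title_and_typename" are hypothetical, and only the handling of False and True comes from the constraint stated above.

```python
from typing import Optional, Union


def resolve_doc(
    ak_add_doc: Union[bool, str], title: str, typename: str
) -> Optional[str]:
    """Return the __doc__ text for a branch, or None for "no __doc__"."""
    # Old interface: these two cases must keep their existing meaning.
    if ak_add_doc is False:
        return None
    if ak_add_doc is True:
        return title
    # New interface: hypothetical string options added alongside the bools.
    if ak_add_doc == "typename":
        return typename
    if ak_add_doc == "title_and_typename":
        return f"{title} ({typename})"
    raise ValueError(f"unrecognized ak_add_doc value: {ak_add_doc!r}")


print(resolve_doc(True, "pt of the muon", "float"))
print(resolve_doc("title_and_typename", "pt of the muon", "float"))
```

Checking the booleans with `is` before looking at strings keeps the old True/False behavior fixed no matter which string cases are added later.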
