feat: support reading of custom-length RNTuple floats and suppressed columns #1347

Merged: 9 commits, Dec 19, 2024

@ariostas ariostas commented Dec 6, 2024

This PR implements the reading of Real32Trunc and Real32Quant fields, which have a variable number of bits in the ranges 10-31 and 1-32, respectively.
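For readers unfamiliar with the two encodings, here is a minimal sketch of how such values can be turned back into 32-bit floats. It is not the code from this PR: the function names are hypothetical, the input is assumed to be already unpacked into one right-aligned unsigned integer per element, and the linear mapping used for Real32Quant is an assumption about how the quantized range is laid out.

```python
import numpy as np


def decode_real32trunc(raw, nbits):
    """Rebuild float32 values from their top `nbits` bits (hypothetical helper).

    Each entry of `raw` is assumed to hold the sign bit, the 8 exponent bits
    and the leading mantissa bits of an IEEE-754 float32, right-aligned.
    """
    if not 10 <= nbits <= 31:
        raise ValueError("Real32Trunc uses between 10 and 31 bits")
    raw = np.atleast_1d(np.asarray(raw, dtype=np.uint32))
    # Shift the kept bits back to the top; the dropped mantissa bits become 0.
    return (raw << np.uint32(32 - nbits)).view(np.float32)


def decode_real32quant(raw, nbits, vmin, vmax):
    """Map `nbits`-bit integers linearly onto [vmin, vmax] (assumed layout)."""
    if not 1 <= nbits <= 32:
        raise ValueError("Real32Quant uses between 1 and 32 bits")
    raw = np.atleast_1d(np.asarray(raw, dtype=np.uint32))
    # Step between two neighbouring quantized values.
    scale = (vmax - vmin) / float(2**nbits - 1)
    return (vmin + raw.astype(np.float64) * scale).astype(np.float32)


# 1.75 is 0x3FE00000; with 10 bits only one mantissa bit survives,
# so the stored integer 0xFF decodes to 1.5.
print(decode_real32trunc([0x3FE00000 >> 22], 10))

# The smallest and largest 8-bit codes map to the two ends of the range.
print(decode_real32quant([0, 255], 8, -1.0, 1.0))
```

Unpacking the bit-packed pages into those integers is left out here, since it depends on the on-disk layout.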

It also adds support for suppressed columns, at least in simple cases.

@ariostas
Collaborator Author

It ended up only taking a few lines to support suppressed columns. I'll improve the reading of floats and add some tests.

@ariostas changed the title from "feat: support reading of custom-length RNTuple floats" to "feat: support reading of custom-length RNTuple floats and suppressed columns" on Dec 13, 2024
@ariostas ariostas marked this pull request as ready for review December 13, 2024 19:46
@ariostas
Collaborator Author

This PR is ready for review, but it needs scikit-hep/scikit-hep-testdata#167 for the tests to pass.

Also, since there is still an issue with Dask, we'll just have to wait until that is resolved.

@ariostas
Collaborator Author

ariostas commented Dec 19, 2024

@jpivarski I fixed the issue with NumPy 1 and improved the tests. I still had to use np.isclose for some of the comparisons because of small differences, probably from ROOT rounding things a little differently.

@ariostas ariostas requested a review from jpivarski December 19, 2024 18:31
Member

@jpivarski jpivarski left a comment

It's fine for comparisons with ROOT to involve isclose with tight tolerances (around the scale of numerical precision). What I meant was that if you know you'll be truncating at the Nth digit, it's better to test against an expected value that's truncated at the Nth digit, with isclose picking up any small errors, rather than using isclose to check for differences with respect to the original value. My reasoning was just that the Nth digit might be a huge error and leaving a window wide open like that could fail to identify some errors.
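A small sketch of the difference between the two testing styles, under stated assumptions: `truncate_float32` is a hypothetical helper (not part of the test suite), the "faulty reader" is simulated by keeping one bit too few, and the tolerances are illustrative only.

```python
import numpy as np


def truncate_float32(values, nbits):
    """Zero all but the top `nbits` bits of each float32 (hypothetical helper)."""
    bits = np.atleast_1d(np.asarray(values, dtype=np.float32)).view(np.uint32)
    mask = np.uint32((0xFFFFFFFF << (32 - nbits)) & 0xFFFFFFFF)
    return (bits & mask).view(np.float32)


nbits = 16  # sign bit + 8 exponent bits + 7 mantissa bits
original = np.array([1.0 + 2.0**-7 + 2.0**-10], dtype=np.float32)

# Expected value: the original truncated at the same bit as the stored data.
expected = truncate_float32(original, nbits)

# Simulated faulty reader that silently keeps one bit too few.
faulty = truncate_float32(original, nbits - 1)

# Loose comparison against the untruncated value: the window has to be as
# wide as the truncation error, so the faulty result slips through.
missed = bool(np.isclose(faulty, original, rtol=1e-2, atol=0.0).all())

# Tight comparison against the truncated expectation: the fault is caught.
caught = not bool(np.isclose(faulty, expected, rtol=1e-6, atol=0.0).all())

print(missed, caught)
```

The tight comparison still leaves room for rounding differences at the scale of numerical precision, which is where `isclose` is doing useful work.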

As a whole, this PR looks great and I'd say it's ready to be merged. This is the PR that brings RNTuple-reading up to 100% coverage, right?

@ariostas
Collaborator Author

> This is the PR that brings RNTuple-reading up to 100% coverage, right?

Yes, as far as I can tell, after this PR we'll have 100% coverage of the current spec.

@ariostas ariostas merged commit 4ee1a2d into main Dec 19, 2024
26 checks passed
@ariostas ariostas deleted the ariostas/rntuple_floats branch December 19, 2024 19:51