fix(python): Don't allow passing missing data to generalized ufuncs #16198
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #16198      +/-   ##
==========================================
- Coverage   80.99%   80.91%   -0.08%
==========================================
  Files        1392     1394       +2
  Lines      178930   179574     +644
  Branches     2904     2913       +9
==========================================
+ Hits       144925   145310     +385
- Misses      33502    33759     +257
- Partials      503      505       +2
Implementation looks good to me, just got some quite minor comments
Thanks @itamarst!
Thank you @MarcoGorelli. If you have commit access you'll have to merge it too, since I can't.
yup - just leaving it open another day in case anyone has comments, then I'll ship it
Outside of generalized ufuncs, there's no other case where missing values are a concern, right?
For normal value-by-value ufuncs, if the memory behind a missing value is e.g. an invalid float64, couldn't that be a problem too? You're potentially feeding random garbage to the function for missing data.
…ola-rs#16198) Co-authored-by: Itamar Turner-Trauring <itamar@pythonspeed.com>
Fixes #14811
This is a more constrained PR, just addressing one specific issue: passing missing data to generalized ufuncs can silently lead to incorrect results.
Generalized ufuncs are a NumPy feature (https://numpy.org/neps/nep-0020-gufunc-signature-enhancement.html), but the easiest way to implement them is using Numba, which is what the tests do.
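To illustrate why missing data matters more for generalized ufuncs than for element-wise ones, here is a minimal pure-Python sketch. It assumes a nullable column can be modelled as a raw value buffer plus a validity mask; this is a simplification for illustration, not Polars' internals or API, and the names `double_elementwise`, `running_sum`, and `remask` are hypothetical.

```python
# Toy model of a nullable column: a raw buffer plus a validity mask.
# Index 1 is "missing"; 999.0 stands in for whatever bytes back that slot.
values = [1.0, 999.0, 3.0]
valid = [True, False, True]


def double_elementwise(buf):
    """Element-wise op: each output depends only on the matching input."""
    return [2.0 * x for x in buf]


def running_sum(buf):
    """Whole-array op, similar in spirit to a gufunc with signature
    "(n)->(n)": each output depends on all earlier inputs."""
    out, total = [], 0.0
    for x in buf:
        total += x
        out.append(total)
    return out


def remask(result, mask):
    """Reapply the validity mask, turning masked slots back into None."""
    return [r if ok else None for r, ok in zip(result, mask)]


# Element-wise: the filler only affects its own slot, which is masked out.
elementwise = remask(double_elementwise(values), valid)

# Whole-array: the filler leaks into later, valid slots.
cumulative = remask(running_sum(values), valid)

print(elementwise)  # [2.0, None, 6.0]
print(cumulative)   # [1.0, None, 1003.0]
```

In the element-wise case the filler value is computed on but then hidden again by the mask. In the whole-array case the last slot is valid yet holds 1003.0 rather than 4.0, which is the kind of silently incorrect result this PR guards against by rejecting missing data up front.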