Implement some reductions for string Series #31757
Closed
c87ce47
Add reduction tests
5365fa1
Implement a few reductions
460030f
Fix test
6497e16
Add whatsnew
c052feb
skipna
4312b4f
Hardcode expected output
b67cc56
Merge branch 'master' into reduce-string
214995f
Skip tests
95c83fb
Merge branch 'master' into reduce-string
4082ae3
No else
12a2323
Modify nanminmax to better handle object dtype
4a61b81
Delegate to min and max methods
c6465e5
Fixup nanops
5c1a5fb
Import and clean up
ca99650
Update tests
18800e1
Update whatsnew
3eddb73
Limit to values.ndim == 1
0c1a2a3
Sort index in test
5714c06
Box in Series
5e01c3f
Add Series min / max test
0e4a4e2
Merge branch 'master' into reduce-string
ed01138
Move tests
1128950
Pass in mask and patch _maybe_get_mask
7c0f05d
Update whatsnew
d84f5bd
Merge branch 'master' into reduce-string
7db2052
Try to make mypy happy
e19bbe1
Update doc/source/whatsnew/v1.0.2.rst
dsaxton df99b4f
Add comment
ba20705
Merge branch 'master' into reduce-string
d69b587
Merge remote-tracking branch 'upstream/master' into reduce-string
dsaxton
The masking should be done inside the methods themselves; _reduce just dispatches.
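To illustrate the split being suggested here, below is a minimal pure-Python sketch, not the pandas implementation. ToyStringArray, its _minmax helper, and the use of None as the missing-value marker are all hypothetical stand-ins (the real StringArray uses pd.NA and NumPy-backed storage). The point is only that _reduce forwards to a named method, while each method owns its missing-value handling.

```python
from typing import Callable, List, Optional


class ToyStringArray:
    """Hypothetical stand-in for a string array; None marks a missing value."""

    def __init__(self, values: List[Optional[str]]) -> None:
        self._values = list(values)

    def _reduce(self, name: str, skipna: bool = True) -> Optional[str]:
        # Dispatch only: no missing-value logic lives here.
        if name in ("min", "max"):
            return getattr(self, name)(skipna=skipna)
        raise TypeError(f"cannot perform reduction '{name}' on strings")

    def min(self, skipna: bool = True) -> Optional[str]:
        return self._minmax(min, skipna)

    def max(self, skipna: bool = True) -> Optional[str]:
        return self._minmax(max, skipna)

    def _minmax(self, func: Callable, skipna: bool) -> Optional[str]:
        # The masking happens inside the method, as the review asks.
        valid = [v for v in self._values if v is not None]
        has_missing = len(valid) != len(self._values)
        if (has_missing and not skipna) or not valid:
            return None  # propagate "missing" as the result
        return func(valid)


arr = ToyStringArray(["b", None, "a"])
smallest = arr._reduce("min")
largest = arr._reduce("max")
propagated = arr._reduce("min", skipna=False)
```

With this layout, supporting another reduction means adding a method and one name to the dispatch check; _reduce itself never needs to know how missing values are represented.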
Should we implement these methods for StringArray in that case? The NA handling for PandasArray seems to be broken for string inputs, so it may have to be handled within each method.
Yes, I would say don't worry about PandasArray too much (since PandasArray does not use pd.NA); just implement the methods here on StringArray.
I think the NA handling wasn't working because of an apparently long-standing bug in nanops.nanminmax, which I think we can fix here: #18588. Basically, we fill NA with infinite values when taking the min or max, but this doesn't make sense for object dtypes, and an error is raised even if skipna is True.
If we fix that by explicitly masking the missing values instead, I believe we can just use this function directly in StringArray methods.
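The difference between the two strategies can be sketched in plain Python. This is an illustration of the idea only, not the nanops code: minmax_by_filling and minmax_by_masking are hypothetical names, and None stands in for a missing value. Filling with infinity works for numbers but fails for strings because a str cannot be compared with a float, whereas dropping the masked entries first avoids the comparison entirely.

```python
import math
from typing import Callable, List, Optional


def minmax_by_filling(values: List[Optional[object]], func: Callable):
    """Replace missing entries with +/-inf, then reduce (fails for strings)."""
    fill = math.inf if func is min else -math.inf
    filled = [fill if v is None else v for v in values]
    return func(filled)


def minmax_by_masking(values: List[Optional[object]], func: Callable,
                      skipna: bool = True):
    """Drop missing entries via an explicit mask, then reduce."""
    mask = [v is None for v in values]
    if any(mask) and not skipna:
        return None  # a missing value propagates when skipna is False
    kept = [v for v, is_missing in zip(values, mask) if not is_missing]
    return func(kept) if kept else None


numbers = [1.0, None, 3.0]
strings = ["b", None, "a"]

numeric_min = minmax_by_filling(numbers, min)  # inf never wins a min

try:
    minmax_by_filling(strings, min)
    fill_error = None
except TypeError as exc:
    # Comparing a str with float('inf') is not supported.
    fill_error = exc

masked_min = minmax_by_masking(strings, min)
masked_max = minmax_by_masking(strings, max)
```

The masking version needs no dtype-specific fill value, which is why it can serve object and string data as well as numeric data.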