Preserve Extension type on cross section #22785
Merged
10 commits
e8b37da ENH: is_homogenous (TomAugspurger)
0197e0c BUG: Preserve dtype on homogeneous EA xs (TomAugspurger)
62326ae asarray test (TomAugspurger)
f008c38 Fixed asarray (TomAugspurger)
88c6126 Merge remote-tracking branch 'upstream/master' into ea-xs (TomAugspurger)
78798cf is_homogeneous -> is_homogeneous_type (TomAugspurger)
b051424 lint (TomAugspurger)
d6a2479 lint (TomAugspurger)
f796138 Merge remote-tracking branch 'upstream/master' into ea-xs (TomAugspurger)
78dd81e Moved whatsnew to correct section (TomAugspurger)
@@ -12,9 +12,6 @@
 from pandas.util._validators import validate_bool_kwarg
 from pandas.compat import range, map, zip

-from pandas.core.dtypes.dtypes import (
-    ExtensionDtype,
-    PandasExtensionDtype)
 from pandas.core.dtypes.common import (
     _NS_DTYPE,
     is_datetimelike_v_numeric,
@@ -791,6 +788,11 @@ def _interleave(self):
         """
         dtype = _interleaved_dtype(self.blocks)

+        if is_extension_array_dtype(dtype):
+            # TODO: https://github.com/pandas-dev/pandas/issues/22791
+            # Give EAs some input on what happens here. Sparse needs this.
+            dtype = 'object'
+
         result = np.empty(self.shape, dtype=dtype)

         if result.shape[0] == 0:
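To make the effect of the `_interleave` fallback concrete, here is a minimal sketch that uses only public pandas API. It assumes a reasonably recent pandas version and uses categorical columns as a stand-in for any extension dtype; the variable names (`cat_type`, `df`, `values`) are illustrative and do not come from this PR.

```python
import pandas as pd

# Assumption: categoricals stand in for "any extension array dtype".
cat_type = pd.CategoricalDtype(categories=["a", "b", "c"])
df = pd.DataFrame({
    "A": pd.Categorical(["a", "b"], dtype=cat_type),
    "B": pd.Categorical(["c", "a"], dtype=cat_type),
})

# Interleaving every column into one 2-D NumPy array cannot keep the
# extension dtype, so the values end up in an object-dtype ndarray.
values = df.to_numpy()
print(values.dtype, values.shape)
```

The object fallback only concerns the 2-D interleaved array; the individual columns of `df` keep their categorical dtype.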
@@ -906,14 +908,25 @@ def fast_xs(self, loc):

         # unique
         dtype = _interleaved_dtype(self.blocks)
+
         n = len(items)
-        result = np.empty(n, dtype=dtype)
+        if is_extension_array_dtype(dtype):
+            # we'll eventually construct an ExtensionArray.
+            result = np.empty(n, dtype=object)
+        else:
+            result = np.empty(n, dtype=dtype)
+
         for blk in self.blocks:
             # Such assignment may incorrectly coerce NaT to None
             # result[blk.mgr_locs] = blk._slice((slice(None), loc))
             for i, rl in enumerate(blk.mgr_locs):
                 result[rl] = blk._try_coerce_result(blk.iget((i, loc)))

+        if is_extension_array_dtype(dtype):
+            result = dtype.construct_array_type()._from_sequence(
+                result, dtype=dtype
+            )
+
         return result

     def consolidate(self):

Review comment on the `result = dtype.construct_array_type()._from_sequence(` line:
is this guaranteed to be 1d at this point?
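The user-visible behaviour that this `fast_xs` change targets can be sketched with public pandas API. This is an illustration under assumptions, not the PR's own test: it assumes a pandas version that includes this change, and the names `cat_type`, `df`, `row`, and `mixed` are made up for the example.

```python
import pandas as pd

# Assumption: categoricals stand in for "any extension array dtype".
cat_type = pd.CategoricalDtype(categories=["a", "b", "c"])
df = pd.DataFrame({
    "A": pd.Categorical(["a", "b"], dtype=cat_type),
    "B": pd.Categorical(["c", "a"], dtype=cat_type),
})

# Selecting one row crosses all columns. Because they share one extension
# dtype, the resulting Series keeps it instead of being coerced to object.
row = df.iloc[0]
print(row.dtype)

# A frame that mixes an extension dtype with a NumPy dtype has no common
# extension dtype, so its rows still come back as object.
mixed = df.assign(C=[1, 2])
print(mixed.iloc[0].dtype)
```

Only the homogeneous case changes; mixed-dtype frames follow the same object path as before.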
@@ -1855,16 +1868,22 @@ def _shape_compat(x):


 def _interleaved_dtype(blocks):
-    if not len(blocks):
-        return None
+    # type: (List[Block]) -> Optional[Union[np.dtype, ExtensionDtype]]
+    """Find the common dtype for `blocks`.

-    dtype = find_common_type([b.dtype for b in blocks])
+    Parameters
+    ----------
+    blocks : List[Block]

-    # only numpy compat
-    if isinstance(dtype, (PandasExtensionDtype, ExtensionDtype)):
-        dtype = np.object
+    Returns
+    -------
+    dtype : Optional[Union[np.dtype, ExtensionDtype]]
+        None is returned when `blocks` is empty.
+    """
+    if not len(blocks):
+        return None

-    return dtype
+    return find_common_type([b.dtype for b in blocks])


 def _consolidate(blocks):
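The shape of the refactored helper (return `None` for no blocks, otherwise return the common dtype unchanged) can be sketched with NumPy alone. Everything here is a hypothetical stand-in: `interleaved_dtype_sketch` takes dtypes rather than blocks, and `np.result_type` replaces pandas' internal `find_common_type`, so it only understands NumPy dtypes and may promote differently from pandas.

```python
from typing import List, Optional

import numpy as np


def interleaved_dtype_sketch(dtypes: List[np.dtype]) -> Optional[np.dtype]:
    """Return a dtype that can hold values of every dtype in `dtypes`.

    Hypothetical stand-in for the helper in the diff; NumPy dtypes only.
    """
    if not len(dtypes):
        # Mirrors the diff: with no blocks there is no dtype to report.
        return None
    # The result is returned as-is; callers decide whether to fall back
    # to object, which is the point of the refactor.
    return np.result_type(*dtypes)


print(interleaved_dtype_sketch([]))
print(interleaved_dtype_sketch([np.dtype("int64"), np.dtype("float64")]))
print(interleaved_dtype_sketch([np.dtype("int64"), np.dtype("object")]))
```

Moving the object fallback out of the helper lets each caller, such as `_interleave` and `fast_xs` in the hunks above, choose its own handling for extension dtypes.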
Do people find this confusing? I could instead use `list.append` for EAs and insert into `result` for other dtypes. I chose this implementation because I assume it's slightly faster for wide dataframes with a numpy type, compared to building a list and then calling `np.asarray(result)` at the end.
This implementation looks good to me
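The two strategies weighed in this thread can be sketched side by side. This is a simplified illustration with hypothetical function names, operating on a flat list of values rather than on pandas blocks, and it makes no claim about which is faster in practice.

```python
import numpy as np


def fill_preallocated(values, dtype):
    """Write each value straight into a pre-sized result buffer."""
    result = np.empty(len(values), dtype=dtype)
    for i, value in enumerate(values):
        result[i] = value
    return result


def collect_then_convert(values, dtype):
    """Append to a Python list, then convert once at the end."""
    collected = []
    for value in values:
        collected.append(value)
    return np.asarray(collected, dtype=dtype)


row = [1.5, 2.0, 3.25]
a = fill_preallocated(row, np.float64)
b = collect_then_convert(row, np.float64)
print(np.array_equal(a, b))
```

Both produce the same array; the merged code keeps the preallocated buffer for every dtype and only switches the buffer to object when an extension array will be built from it afterwards.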