
REF: Internal / External values #19558

Merged: 42 commits, Feb 13, 2018
(The diff below shows changes from 30 commits.)

Commits
41f09d8
REF/Clean: Internal / External values
TomAugspurger Feb 3, 2018
29cfd7c
Move to index base
TomAugspurger Feb 6, 2018
3185f4e
Cleanup unique handling
TomAugspurger Feb 7, 2018
5a59591
Merge remote-tracking branch 'upstream/master' into index-values
TomAugspurger Feb 7, 2018
476f75d
Simplify object concat
TomAugspurger Feb 7, 2018
b15ee5a
Use values for intersection
TomAugspurger Feb 7, 2018
659073f
hmm
TomAugspurger Feb 7, 2018
7accb67
Merge remote-tracking branch 'upstream/master' into index-values
TomAugspurger Feb 8, 2018
9b8d2a5
Additional testing
TomAugspurger Feb 8, 2018
9fbac29
More tests
TomAugspurger Feb 8, 2018
55305dc
ndarray_values
TomAugspurger Feb 8, 2018
0e63708
API: Default ExtensionArray.astype
TomAugspurger Feb 8, 2018
fbbbc8a
Simplify concat_as_object
TomAugspurger Feb 8, 2018
46a0a49
Py2 compat
TomAugspurger Feb 8, 2018
2c4445a
Set-ops ugliness
TomAugspurger Feb 8, 2018
5612cda
better docstrings
TomAugspurger Feb 8, 2018
b012c19
tolist
TomAugspurger Feb 8, 2018
d49e6aa
linting
TomAugspurger Feb 8, 2018
d7d31ee
Moved dtypes
TomAugspurger Feb 9, 2018
7b89f1b
clean
TomAugspurger Feb 9, 2018
b0dbffd
cleanup
TomAugspurger Feb 9, 2018
66b936f
NumPy compat
TomAugspurger Feb 9, 2018
32ee0ef
Use base _values for CategoricalIndex
TomAugspurger Feb 9, 2018
a9882e2
Update dev docs
TomAugspurger Feb 9, 2018
f53652a
Merge remote-tracking branch 'upstream/master' into index-values
TomAugspurger Feb 9, 2018
2425621
cleanup
TomAugspurger Feb 9, 2018
512fb89
Merge remote-tracking branch 'upstream/master' into index-values
TomAugspurger Feb 9, 2018
170d0c7
Linting
TomAugspurger Feb 9, 2018
402620f
Precision in tests
TomAugspurger Feb 9, 2018
d9e8dd6
Merge remote-tracking branch 'upstream/master' into index-values
TomAugspurger Feb 9, 2018
815d202
Push _ndarray_values to ExtensionArray
TomAugspurger Feb 11, 2018
a727b21
Clean up tolist
TomAugspurger Feb 11, 2018
f368c29
Move test locations
TomAugspurger Feb 11, 2018
d74c5c9
Fixed test
TomAugspurger Feb 12, 2018
8104ee5
REF: Update per comments
TomAugspurger Feb 12, 2018
f8e29b9
lint
TomAugspurger Feb 12, 2018
0cd9faa
REF: Use _values for size and shape
TomAugspurger Feb 12, 2018
8fcdb70
PERF: Implement size, shape for IntervalIndex
TomAugspurger Feb 12, 2018
34a6a22
PERF: Avoid materializing values for PeriodIndex shape, size
TomAugspurger Feb 12, 2018
c233c28
Merge remote-tracking branch 'upstream/master' into index-values
TomAugspurger Feb 13, 2018
d6e8051
Cleanup
TomAugspurger Feb 13, 2018
3af8a21
Override nbytes
TomAugspurger Feb 13, 2018
19 changes: 19 additions & 0 deletions doc/source/internals.rst
@@ -89,6 +89,25 @@
not check (or care) whether the levels themselves are sorted. Fortunately, the
constructors ``from_tuples`` and ``from_arrays`` ensure that this is true, but
if you compute the levels and labels yourself, please be careful.
constructors ``from_tuples`` and ``from_arrays`` ensure that this is true, but
if you compute the levels and labels yourself, please be careful.

Values
~~~~~~

Comment (Contributor): you could add section tags

Pandas extends NumPy's type system with custom types, like ``Categorical`` or
datetimes with a timezone, so we have multiple notions of "values". For 1-D
containers (``Index`` classes and ``Series``) we have the following convention:

* ``cls._ndarray_values`` is *always* a NumPy ``ndarray``. Ideally,
``_ndarray_values`` is cheap to compute. For example, for a ``Categorical``,
this returns the codes, not the array of objects.
* ``cls._values`` is the "best possible" array. This could be an
``ndarray``, ``ExtensionArray``, or an ``Index`` subclass (note: we're in the
process of removing the index subclasses here so that it's always an
``ndarray`` or ``ExtensionArray``).

So, for example, ``Series[category]._values`` is a ``Categorical``, while
``Series[category]._ndarray_values`` is the underlying codes.
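
To make the convention concrete, a minimal sketch (these are private
attributes, shown only for illustration):

.. code-block:: python

   import pandas as pd

   ser = pd.Series(["a", "b", "a"], dtype="category")

   ser._values          # Categorical(['a', 'b', 'a'], categories=['a', 'b'])
   ser._ndarray_values  # array([0, 1, 0], dtype=int8) -- just the codes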
Comment (Member):

I think this section does not belong in the same document as "how to subclass", as this part is really internals for contributors to pandas.
So I would maybe split this in two, external vs. internal details, where the future documentation on how to use and define ExtensionArrays would also go into the external part (maybe "Extending Pandas" would be a good title, with information about the different strategies: accessors, extension arrays, subclassing, ...).

But that's for a separate PR (I can start with that), so for here the above is fine with me.

Comment (Contributor Author):

internal-internals.rst and external-internals.rst ;)

I think that "How to subclass", and the eventual "extending pandas with custom array types" would be better in developer.rst, which is labeled as "This section will focus on downstream applications of pandas.".

Comment (Member):

Ah, yes, I didn't see that the accessor documentation is actually already there (although I personally don't find the parquet section a great fit there, as it is not something you typically need to know when extending pandas; I can start by moving it to the bottom of the file :-))



.. _ref-subclassing-pandas:

Subclassing pandas Data Structures
34 changes: 26 additions & 8 deletions pandas/core/base.py
@@ -13,6 +13,7 @@
is_list_like,
is_scalar,
is_datetimelike,
is_categorical_dtype,
is_extension_type)

from pandas.util._validators import validate_bool_kwarg
@@ -710,7 +711,7 @@ def transpose(self, *args, **kwargs):
@property
def shape(self):
""" return a tuple of the shape of the underlying data """
return self._values.shape
return self._ndarray_values.shape

@property
def ndim(self):
@@ -738,22 +739,22 @@ def data(self):
@property
def itemsize(self):
""" return the size of the dtype of the item of the underlying data """
return self._values.itemsize
return self._ndarray_values.itemsize

@property
def nbytes(self):
""" return the number of bytes in the underlying data """
return self._values.nbytes
return self._ndarray_values.nbytes
Comment (Member):

Should nbytes be called on self._values instead? (As we require an nbytes implementation on the ExtensionArray, and numpy arrays already have it.)

Comment (Contributor Author):

I think this caused issues for CI, but re-running tests now with this change.
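A minimal sketch of why this choice matters for a categorical (private
attributes; exact byte counts depend on the data):

    import pandas as pd

    s = pd.Series(pd.Categorical(["a", "b", "a"] * 1000))
    s._ndarray_values.nbytes  # nbytes of the int8 codes only (3000 here)
    s._values.nbytes          # Categorical.nbytes: codes plus categories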


@property
def strides(self):
""" return the strides of the underlying data """
return self._values.strides
return self._ndarray_values.strides

@property
def size(self):
""" return the number of elements in the underlying data """
return self._values.size
return self._ndarray_values.size

@property
def flags(self):
@@ -768,8 +769,21 @@ def base(self):
return self.values.base

@property
def _values(self):
""" the internal implementation """
def _ndarray_values(self):
"""The data as an ndarray, possibly losing information.

The expectation is that this is cheap to compute.

- categorical -> codes

See '_values' for more.
"""
# type: () -> np.ndarray
from pandas.core.dtypes.common import is_categorical_dtype
Comment (Contributor):

this does NOT belong here. you already have an EA subclass for Categorical that can simply override this.

Comment (Member):

This code returns _ndarray_values for both Series and Index, and for Series there is in general no subclass that can override it.
Of course we could put this logic in the Blocks, but that would not apply to Index either, so I am not sure it is better.

This raises the question for me, though, of what this will return for external extension types. Since it is .values, I suppose this is a materialized array?

Comment (Contributor Author):

I've moved it all the way to ExtensionArray, with the default being just returning np.array(self), same as before. It's not part of the interface though. I think this is what @jorisvandenbossche suggested here.


if is_categorical_dtype(self):
Comment (Contributor):

Why do we have this at all? E.g. np.array(ea) should just do the right thing, no? Why are we adding something else? Note, I actually don't mind that we have an additional property that we use consistently; the point is more why __array__ does not just return _ndarray_values.

Comment (Member):

Because __array__ returns the materialized array (so e.g. an array of strings if you have string categories), not the codes.

But I agree it points to something we should think about organizing better, as e.g. periods will also be a special case here in the future. So maybe we need a property on our own extension arrays that gives back this ndarray? (Not necessarily part of the external interface for extension arrays.)
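
A small sketch of the distinction being discussed (shown only for illustration):

    import numpy as np
    import pandas as pd

    cat = pd.Categorical(["a", "b", "a"])
    np.asarray(cat)  # array(['a', 'b', 'a'], dtype=object) -- materialized values
    cat.codes        # array([0, 1, 0], dtype=int8) -- cheap integer codes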

return self._values.codes

return self.values

@property
@@ -819,8 +833,10 @@ def tolist(self):

if is_datetimelike(self):
Comment (Contributor):

this should be overridden in EA rather than dispatched via if/else here; IOW it should be part of the interface, or be defined as list(.values)

Comment (Member):

I am not sure we need to add a tolist to the interface; I think in general it can rely on __iter__/__getitem__ (so list(self._values)).
The only problem in that case is that we still need to distinguish between plain numeric types (where the above would return numpy scalars, not python scalars) and other types where it already gives the correct result.
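
A quick illustration of that scalar distinction (private attribute shown only
for illustration):

    import pandas as pd

    s = pd.Series([1, 2, 3])
    type(list(s._values)[0])  # <class 'numpy.int64'> -- a numpy scalar
    type(s.tolist()[0])       # <class 'int'> -- ndarray.tolist() gives python scalars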

return [com._maybe_box_datetimelike(x) for x in self._values]
elif is_categorical_dtype(self):
return self.values.tolist()
else:
return self._values.tolist()
return self._ndarray_values.tolist()

def __iter__(self):
"""
@@ -978,7 +994,9 @@ def value_counts(self, normalize=False, sort=True, ascending=False,
def unique(self):
values = self._values

# TODO: Make unique part of the ExtensionArray interface.
if hasattr(values, 'unique'):
result = values.unique()
else:
from pandas.core.algorithms import unique1d
2 changes: 1 addition & 1 deletion pandas/core/dtypes/cast.py
@@ -927,7 +927,7 @@ def try_timedelta(v):
# will try first with a string & object conversion
from pandas import to_timedelta
try:
return to_timedelta(v)._values.reshape(shape)
return to_timedelta(v)._ndarray_values.reshape(shape)
except Exception:
return v.reshape(shape)

8 changes: 5 additions & 3 deletions pandas/core/dtypes/concat.py
@@ -480,20 +480,22 @@ def _concat_datetimetz(to_concat, name=None):

def _concat_index_same_dtype(indexes, klass=None):
klass = klass if klass is not None else indexes[0].__class__
return klass(np.concatenate([x._values for x in indexes]))
return klass(np.concatenate([x._ndarray_values for x in indexes]))
Comment (Member):

This one is only used for numeric indices, so _values or _ndarray_values should not matter.
I was mainly thinking that for those cases where _ndarray_values actually differs from _values, like a categorical, the above code will not work anyway, as klass(codes) is not enough to reconstruct the categorical. So _values seems safer to me.



def _concat_index_asobject(to_concat, name=None):
"""
concat all inputs as object. DatetimeIndex, TimedeltaIndex and
PeriodIndex are converted to object dtype before concatenation
"""
from pandas import Index
from pandas.core.arrays import ExtensionArray

klasses = ABCDatetimeIndex, ABCTimedeltaIndex, ABCPeriodIndex
klasses = (ABCDatetimeIndex, ABCTimedeltaIndex, ABCPeriodIndex,
ExtensionArray)
to_concat = [x.astype(object) if isinstance(x, klasses) else x
for x in to_concat]

from pandas import Index
self = to_concat[0]
attribs = self._get_attributes_dict()
attribs['name'] = name
108 changes: 83 additions & 25 deletions pandas/core/indexes/base.py
@@ -31,12 +31,14 @@
is_object_dtype,
is_categorical_dtype,
is_interval_dtype,
is_period_dtype,
is_bool,
is_bool_dtype,
is_signed_integer_dtype,
is_unsigned_integer_dtype,
is_integer_dtype, is_float_dtype,
is_datetime64_any_dtype,
is_datetime64tz_dtype,
is_timedelta64_dtype,
needs_i8_conversion,
is_iterator, is_list_like,
@@ -412,7 +414,7 @@ def _simple_new(cls, values, name=None, dtype=None, **kwargs):
values = np.array(values, copy=False)
if is_object_dtype(values):
values = cls(values, name=name, dtype=dtype,
**kwargs)._values
**kwargs)._ndarray_values

result = object.__new__(cls)
result._data = values
@@ -594,6 +596,40 @@ def values(self):
""" return the underlying data as an ndarray """
return self._data.view(np.ndarray)

@property
def _values(self):
# type: () -> Union[ExtensionArray, Index]
# TODO: remove index types as they become is extension arrays
Comment (Contributor):

remove 'is'

"""The best array representation.

This is an ndarray, ExtensionArray, or Index subclass. This differs
from ``_ndarray_values``, which always returns an ndarray.

Both ``_values`` and ``_ndarray_values`` are consistent between
``Series`` and ``Index``.

It may differ from the public '.values' method.

index             | values          | _values      | _ndarray_values |
----------------- | --------------- | ------------ | --------------- |
CategoricalIndex  | Categorical     | Categorical  | codes           |
DatetimeIndex[tz] | ndarray[M8ns]   | DTI[tz]      | ndarray[M8ns]   |

For the following, the ``._values`` is currently ``ndarray[object]``,
but will soon be an ``ExtensionArray``

index | values | _values | _ndarray_values |
----------------- | --------------- | ------------ | --------------- |
PeriodIndex | ndarray[object] | ndarray[obj] | ndarray[int] |
Comment (Member):

For the _values, shouldn't this be PeriodIndex for now? Similar to how DatetimeTZ now has DTI[tz]; in the future it would become PeriodArray.

(but I am not sure where this is currently actually used)

Comment (Contributor Author):

No, I don't think so, since Series[Period]._values returns an ndarray of objects. Series[datetime-with-TZ]._values is the only special one, and I'm only changing Index._values to match Series._values for now.

Comment (Member):

Yes, but that is because we don't have a PeriodBlock backed by a PeriodIndex, as is the case for DatetimeTZ, so at the moment you cannot really have Period values in a Series.
I am not saying this should change now, only that I think it might be more logical. It may well be that we would first need to add a working "PeriodBlock" (an ExtensionBlock backed by PeriodIndex/PeriodArray) to make this work.

Comment (Contributor Author):

I'm confused about what you're proposing then. Is it just that ._values returning a PeriodIndex would be more consistent with ._values returning a DatetimeIndex?

What we have now "works", but is internally inconsistent. I'm updating Index._values to be consistent with Series._values, and changing each use of .values / ._values / ._ndarray_values to the correct one for that use.

Comment (Member):

"Is it just that ._values being a PeriodIndex would be more consistent with ._values returning a DatetimeIndex?"

Yes. So regardless of what ._values currently does or doesn't return for Index/Series: just from looking at your table, returning PeriodIndex would make more sense IMO, as it is (at this moment) the closest thing to an array-like that preserves the information.
But I don't know what implications it would have for Series ops if ._values started returning a PeriodIndex instead of an object array of Periods, so it was really just a question, not a request for change.

Comment (Contributor):

In theory these should return an Index type from ._values for PI and II, and it might work, but it would be disruptive without a Block type to hold it. This is why, for example, we have the is_period_arraylike methods to detect an array of Periods. The holder ATM is an ndarray[object].

IntervalIndex | ndarray[object] | ndarray[obj] | ndarray[object] |

See Also
--------
values
_ndarray_values
"""
return self.values
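
To make the table concrete, a minimal sketch (private attributes; behaviour as
of this PR, where the tz-aware _values still returns an Index subclass):

    import pandas as pd

    dti = pd.DatetimeIndex(["2018-01-01", "2018-01-02"], tz="US/Eastern")
    dti.values           # ndarray[datetime64[ns]] in UTC -- tz information dropped
    dti._values          # DatetimeIndex[tz] -- the "best possible" representation
    dti._ndarray_values  # ndarray[datetime64[ns]], same as .values here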

def get_values(self):
""" return the underlying data as an ndarray """
return self.values
@@ -664,7 +700,7 @@ def ravel(self, order='C'):
--------
numpy.ndarray.ravel
"""
return self._values.ravel(order=order)
return self._ndarray_values.ravel(order=order)

# construction helpers
@classmethod
@@ -1597,7 +1633,7 @@ def _constructor(self):
@cache_readonly
def _engine(self):
# property, for now, slow to look up
return self._engine_type(lambda: self._values, len(self))
return self._engine_type(lambda: self._ndarray_values, len(self))

def _validate_index_level(self, level):
"""
@@ -2228,27 +2264,37 @@ def union(self, other):
other = other.astype('O')
return this.union(other)

# TODO: setops-refactor, clean all this up
Comment (Contributor):

this if/then all needs to be completely removed

Comment (Contributor Author):

Can't be removed until setops are fixed more generally. Like I said, I have a PR started, but it's a bigger issue than what's on the critical path for ExtensionArray.

Take CategoricalIndex.intersection: we relied on CategoricalIndex._outer_indexer(self._values, other._values) raising a TypeError to get the correct result. If I pass CategoricalIndex._ndarray_values, we no longer get a TypeError, but we get the wrong result.

There's no way to avoid the if statements until set ops are completely refactored.
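
A minimal illustration of how comparing codes goes wrong (hypothetical example;
private attribute shown only for illustration):

    import pandas as pd

    ci1 = pd.CategoricalIndex(["a", "b", "c"])
    ci2 = pd.CategoricalIndex(["b", "c", "d"])
    ci1._ndarray_values  # array([0, 1, 2], dtype=int8)
    ci2._ndarray_values  # array([0, 1, 2], dtype=int8) -- same codes, different values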

Comment (Contributor):

ok, let's do this next then. these if/then/else are really smelly.

Comment (Contributor):

It might be worth it to put this in a single method now (not sure if it's possible) to isolate this code, maybe:

def _get_internal_type(other):
    if is_period_dtype(other) or is_datetime64tz_dtype(other):
        return other._ndarray_values
    else:
        return other._values

if is_period_dtype(self) or is_datetime64tz_dtype(self):
lvals = self._ndarray_values
else:
lvals = self._values
if is_period_dtype(other) or is_datetime64tz_dtype(other):
rvals = other._ndarray_values
else:
rvals = other._values

if self.is_monotonic and other.is_monotonic:
try:
result = self._outer_indexer(self._values, other._values)[0]
result = self._outer_indexer(lvals, rvals)[0]
except TypeError:
# incomparable objects
result = list(self._values)
result = list(lvals)

# worth making this faster? a very unusual case
value_set = set(self._values)
result.extend([x for x in other._values if x not in value_set])
value_set = set(lvals)
result.extend([x for x in rvals if x not in value_set])
else:
indexer = self.get_indexer(other)
indexer, = (indexer == -1).nonzero()

if len(indexer) > 0:
other_diff = algos.take_nd(other._values, indexer,
other_diff = algos.take_nd(rvals, indexer,
allow_fill=False)
result = _concat._concat_compat((self._values, other_diff))
result = _concat._concat_compat((lvals, other_diff))

try:
self._values[0] < other_diff[0]
lvals[0] < other_diff[0]
except TypeError as e:
warnings.warn("%s, sort order is undefined for "
"incomparable objects" % e, RuntimeWarning,
@@ -2260,7 +2306,7 @@ def union(self, other):
result.sort()

else:
result = self._values
result = lvals

try:
result = np.sort(result)
@@ -2311,20 +2357,30 @@ def intersection(self, other):
other = other.astype('O')
return this.intersection(other)

# TODO: setops-refactor, clean all this up
if is_period_dtype(self):
lvals = self._ndarray_values
Comment (Contributor):

same

else:
lvals = self._values
if is_period_dtype(other):
rvals = other._ndarray_values
else:
rvals = other._values

if self.is_monotonic and other.is_monotonic:
try:
result = self._inner_indexer(self._values, other._values)[0]
result = self._inner_indexer(lvals, rvals)[0]
return self._wrap_union_result(other, result)
except TypeError:
pass

try:
indexer = Index(other._values).get_indexer(self._values)
indexer = Index(rvals).get_indexer(lvals)
indexer = indexer.take((indexer != -1).nonzero()[0])
except Exception:
# duplicates
indexer = algos.unique1d(
Index(other._values).get_indexer_non_unique(self._values)[0])
Index(rvals).get_indexer_non_unique(lvals)[0])
indexer = indexer[indexer != -1]

taken = other.take(indexer)
@@ -2700,7 +2756,7 @@ def get_indexer(self, target, method=None, limit=None, tolerance=None):
raise ValueError('limit argument only valid if doing pad, '
'backfill or nearest reindexing')

indexer = self._engine.get_indexer(target._values)
indexer = self._engine.get_indexer(target._ndarray_values)

return _ensure_platform_int(indexer)

@@ -2716,12 +2772,13 @@ def _get_fill_indexer(self, target, method, limit=None, tolerance=None):
if self.is_monotonic_increasing and target.is_monotonic_increasing:
method = (self._engine.get_pad_indexer if method == 'pad' else
self._engine.get_backfill_indexer)
indexer = method(target._values, limit)
indexer = method(target._ndarray_values, limit)
else:
indexer = self._get_fill_indexer_searchsorted(target, method,
limit)
if tolerance is not None:
indexer = self._filter_indexer_tolerance(target._values, indexer,
indexer = self._filter_indexer_tolerance(target._ndarray_values,
indexer,
tolerance)
return indexer

@@ -2812,7 +2869,7 @@ def get_indexer_non_unique(self, target):
self = Index(self.asi8)
tgt_values = target.asi8
else:
tgt_values = target._values
tgt_values = target._ndarray_values

indexer, missing = self._engine.get_indexer_non_unique(tgt_values)
return _ensure_platform_int(indexer), missing
@@ -3247,16 +3304,17 @@ def _join_multi(self, other, how, return_indexers=True):
def _join_non_unique(self, other, how='left', return_indexers=False):
from pandas.core.reshape.merge import _get_join_indexers

left_idx, right_idx = _get_join_indexers([self._values],
[other._values], how=how,
left_idx, right_idx = _get_join_indexers([self._ndarray_values],
[other._ndarray_values],
how=how,
sort=True)

left_idx = _ensure_platform_int(left_idx)
right_idx = _ensure_platform_int(right_idx)

join_index = np.asarray(self._values.take(left_idx))
join_index = np.asarray(self._ndarray_values.take(left_idx))
mask = left_idx == -1
np.putmask(join_index, mask, other._values.take(right_idx))
np.putmask(join_index, mask, other._ndarray_values.take(right_idx))

join_index = self._wrap_joined_index(join_index, other)

@@ -3403,8 +3461,8 @@ def _join_monotonic(self, other, how='left', return_indexers=False):
else:
return ret_index

sv = self._values
ov = other._values
sv = self._ndarray_values
ov = other._ndarray_values

if self.is_unique and other.is_unique:
# We can perform much better than the general case
@@ -3756,7 +3814,7 @@ def insert(self, loc, item):
item = self._na_value

_self = np.asarray(self)
item = self._coerce_scalar_to_index(item)._values
item = self._coerce_scalar_to_index(item)._ndarray_values
idx = np.concatenate((_self[:loc], item, _self[loc:]))
return self._shallow_copy_with_infer(idx)
