BUG: tuple-of-tuples indexing results in NumPy VisibleDeprecationWarning #35437

Closed

Conversation

simonjayhawkins
Member

@simonjayhawkins simonjayhawkins added Bug Indexing Related to indexing on series/frames, not to indexes themselves Nested Data Data where the values are collections (lists, sets, dicts, objects, etc.). Compat pandas objects compatability with Numpy or Python functions labels Jul 28, 2020
@simonjayhawkins simonjayhawkins marked this pull request as draft July 28, 2020 15:50
Comment on lines +142 to +143
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=np.VisibleDeprecationWarning)
Contributor

@TomAugspurger TomAugspurger Jul 28, 2020

Getting these can be somewhat expensive. Does this slow down DataFrame.__getitem__ or Series.__getitem__ at all?
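One rough way to put a number on the cost of the context manager itself is to time the same trivial operation with and without it. This is only a sketch: `plain` and `guarded` are hypothetical helpers, `list(values)` stands in for the real array conversion, and the built-in `DeprecationWarning` stands in for `np.VisibleDeprecationWarning` so that the snippet needs nothing beyond the standard library.

```python
import timeit
import warnings


def plain(values):
    # Baseline: the work without any warning handling.
    return list(values)


def guarded(values):
    # Same work, wrapped the way the diff wraps the conversion.
    # DeprecationWarning is a stand-in category (assumption).
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=DeprecationWarning)
        return list(values)


key = [((1, 2), (3, 4))]  # a list holding one tuple-of-tuples label
n = 20_000

t_plain = timeit.timeit(lambda: plain(key), number=n)
t_guarded = timeit.timeit(lambda: guarded(key), number=n)

print(f"plain:   {t_plain / n * 1e6:.2f} us per call")
print(f"guarded: {t_guarded / n * 1e6:.2f} us per call")
```

The difference between the two figures approximates the per-call price of entering and leaving `warnings.catch_warnings()`, which is what every `__getitem__` call would pay if the filter sat on the hot path.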

Member Author

@simonjayhawkins simonjayhawkins Jul 28, 2020

it's not free, but not significant for ser[[tup]]

         1187 function calls (1183 primitive calls) in 0.002 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      4/2    0.000    0.000    0.001    0.000 base.py:289(__new__)
      281    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}
       14    0.000    0.000    0.000    0.000 {built-in method numpy.array}
        4    0.000    0.000    0.000    0.000 {pandas._libs.lib.infer_dtype}
       94    0.000    0.000    0.000    0.000 generic.py:10(_check)
        1    0.000    0.000    0.002    0.002 {built-in method builtins.exec}
        5    0.000    0.000    0.000    0.000 common.py:221(asarray_tuplesafe)
        2    0.000    0.000    0.000    0.000 common.py:97(is_bool_indexer)
        2    0.000    0.000    0.000    0.000 {method 'reduce' of 'numpy.ufunc' objects}
        1    0.000    0.000    0.000    0.000 indexing.py:1257(_validate_read_indexer)
        7    0.000    0.000    0.000    0.000 warnings.py:458(__enter__)
        6    0.000    0.000    0.000    0.000 _dtype.py:321(_name_get)
        1    0.000    0.000    0.000    0.000 {pandas._libs.algos.take_1d_int64_int64}
       20    0.000    0.000    0.000    0.000 common.py:1460(is_extension_array_dtype)
        1    0.000    0.000    0.000    0.000 algorithms.py:1586(take_nd)
        1    0.000    0.000    0.000    0.000 {method 'get_indexer' of 'pandas._libs.index.IndexEngine' objects}
       23    0.000    0.000    0.000    0.000 base.py:256(is_dtype)
        1    0.000    0.000    0.002    0.002 series.py:910(_get_with)
        1    0.000    0.000    0.000    0.000 base.py:2951(get_indexer)
      144    0.000    0.000    0.000    0.000 {built-in method builtins.getattr}
       20    0.000    0.000    0.000    0.000 base.py:413(find)
        7    0.000    0.000    0.000    0.000 warnings.py:181(_add_filter)
        1    0.000    0.000    0.000    0.000 generic.py:4493(_reindex_with_indexers)
        2    0.000    0.000    0.000    0.000 cast.py:441(maybe_promote)
        1    0.000    0.000    0.002    0.002 series.py:868(__getitem__)
        1    0.000    0.000    0.002    0.002 indexing.py:1078(_getitem_axis)
       21    0.000    0.000    0.000    0.000 common.py:1600(_is_dtype_type)
        2    0.000    0.000    0.000    0.000 base.py:5718(_maybe_cast_data_without_dtype)
        7    0.000    0.000    0.000    0.000 dtypes.py:1113(is_dtype)
        1    0.000    0.000    0.000    0.000 managers.py:1267(_slice_take_blocks_ax0)
        2    0.000    0.000    0.000    0.000 cast.py:1559(construct_1d_object_array_from_listlike)
        3    0.000    0.000    0.000    0.000 {built-in method numpy.empty}
        3    0.000    0.000    0.000    0.000 generic.py:377(_get_axis)
        1    0.000    0.000    0.001    0.001 indexing.py:1208(_get_listlike_indexer)
        1    0.000    0.000    0.000    0.000 blocks.py:1233(take_nd)
       76    0.000    0.000    0.000    0.000 {built-in method builtins.issubclass}
        9    0.000    0.000    0.000    0.000 common.py:1565(_get_dtype)
        6    0.000    0.000    0.000    0.000 _dtype.py:24(_kind_name)
        9    0.000    0.000    0.000    0.000 common.py:530(is_categorical_dtype)
        7    0.000    0.000    0.000    0.000 dtypes.py:901(is_dtype)
        9    0.000    0.000    0.000    0.000 common.py:492(is_interval_dtype)
        1    0.000    0.000    0.000    0.000 algorithms.py:1457(_get_take_nd_function)
        7    0.000    0.000    0.000    0.000 warnings.py:165(simplefilter)

what's the best approach?
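A listing in this shape ("Ordered by: internal time") is what `cProfile` plus `pstats` print when the stats are sorted by `tottime`. The sketch below shows the mechanics only: `workload` is a hypothetical stand-in that looks up a tuple-of-tuples key in a dict, so it runs without pandas. Swapping in the real `ser[[tup]]` call is left as an assumption.

```python
import cProfile
import io
import pstats


def workload():
    # Stand-in for `ser[[tup]]`: look up a tuple-of-tuples label.
    key = [((1, 2), (3, 4))]
    data = {((1, 2), (3, 4)): 1}
    return [data[k] for k in key]


profiler = cProfile.Profile()
result = profiler.runcall(workload)  # profile a single call

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
# "tottime" sorts by time spent inside each function itself,
# which pstats labels "internal time" in the report header.
stats.sort_stats("tottime").print_stats(10)

report = buffer.getvalue()
print(report)
```

Sorting by `cumtime` instead would push wrappers such as `__getitem__` to the top, which is often more useful when the question is which entry point became slower.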

Contributor

@jreback jreback left a comment

-1 on catching the warning

this needs either an actual fix in index construction

@simonjayhawkins
Member Author

this needs either an actual fix in index construction

to prevent nested tuples in the index?

@simonjayhawkins
Member Author

I'll close this for now to clear the queue and revisit once I'm on top of the 1.1.0 regressions. There could be a lot to do here to avoid the 7 repeated calls to np.asarray and to maintain state for the indexer checks.
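For illustration only, "maintaining state" could mean converting the key once and letting every later check reuse the result. Everything in this sketch is hypothetical: `ConvertedKey`, the three check functions, and `counting_convert` (a stand-in for `np.asarray` that counts its own calls) are not pandas APIs.

```python
from typing import Any, Callable


class ConvertedKey:
    """Hypothetical helper: convert an indexer key at most once."""

    def __init__(self, key: Any, convert: Callable[[Any], Any]) -> None:
        self._key = key
        self._convert = convert
        self._cache: Any = None
        self._is_converted = False

    @property
    def value(self) -> Any:
        # Run the expensive conversion lazily, on first access only.
        if not self._is_converted:
            self._cache = self._convert(self._key)
            self._is_converted = True
        return self._cache


calls = 0


def counting_convert(key):
    """Stand-in for np.asarray (assumption); counts how often it runs."""
    global calls
    calls += 1
    return tuple(key)


def is_empty(key: ConvertedKey) -> bool:
    return len(key.value) == 0


def is_all_bool(key: ConvertedKey) -> bool:
    return all(isinstance(item, bool) for item in key.value)


def contains_nested_tuple(key: ConvertedKey) -> bool:
    return any(isinstance(item, tuple) for item in key.value)


key = ConvertedKey([((1, 2), (3, 4))], counting_convert)
checks = (is_empty(key), is_all_bool(key), contains_nested_tuple(key))
print(checks, calls)
```

Three checks run, but the conversion runs once. If the conversion were the step that emits the warning, it would also be emitted once, at a single place where it can be handled deliberately.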

Labels
Bug Compat pandas objects compatability with Numpy or Python functions Indexing Related to indexing on series/frames, not to indexes themselves Nested Data Data where the values are collections (lists, sets, dicts, objects, etc.).
Development

Successfully merging this pull request may close these issues.

BUG: Shift on a group column when column name is a tuple-of-tuples results in NumPy VisibleDeprecationWarning
3 participants