ValueError when using loc with TimedeltaIndex #14946

Closed
jdeschenes opened this issue Dec 21, 2016 · 1 comment
Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves Timedelta Timedelta data type
Milestone

Comments

@jdeschenes
Contributor

jdeschenes commented Dec 21, 2016

This snippet will work correctly:

import pandas as pd
df = pd.DataFrame({'x':range(10)})
cond = df['x'] > 3
df.loc[cond, 'x'] = 10 # Works properly

But when using a TimedeltaIndex:

import pandas as pd
df = pd.DataFrame({'x':range(10)})
df.index = pd.to_timedelta(range(10), unit='s')
cond = df['x'] > 3
df.loc[cond, 'x'] = 10 # ValueError is raised

pandas 0.18.1 has a different problem (no error, but wrong results), and there is a workaround. Simply use the underlying NumPy array:

df.loc[cond.values, 'x'] = 10 # Works as intended
df.loc[cond, 'x'] = 10 # No error, wrong results
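For reference, here is a minimal self-contained sketch of the workaround. It assumes that passing the mask as a plain NumPy boolean array (`cond.values`) selects rows by position, which avoids the label lookup on the TimedeltaIndex; `np.arange` is used here instead of `range` purely for illustration.

```python
import numpy as np
import pandas as pd

# Same frame as in the report, with a TimedeltaIndex
df = pd.DataFrame({"x": range(10)})
df.index = pd.to_timedelta(np.arange(10), unit="s")

cond = df["x"] > 3

# Pass the raw boolean array rather than the Series
df.loc[cond.values, "x"] = 10

result = df["x"].tolist()
print(result)
```

Rows 4 through 9 should now hold 10, while rows 0 through 3 keep their original values.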

Output


ValueError Traceback (most recent call last)
in ()
----> 1 df.loc[cond, 'x'] = 10

~/pandas/pandas/core/indexing.py in __setitem__(self, key, value)
138 else:
139 key = com._apply_if_callable(key, self.obj)
--> 140 indexer = self._get_setitem_indexer(key)
141 self._setitem_with_indexer(indexer, value)
142

~/pandas/pandas/core/indexing.py in _get_setitem_indexer(self, key)
120
121 if isinstance(key, tuple) and not self.ndim < len(key):
--> 122 return self._convert_tuple(key, is_setter=True)
123 if isinstance(key, range):
124 return self._convert_range(key, is_setter=True)

~/pandas/pandas/core/indexing.py in _convert_tuple(self, key, is_setter)
182 else:
183 for i, k in enumerate(key):
--> 184 idx = self._convert_to_indexer(k, axis=i, is_setter=is_setter)
185 keyidx.append(idx)
186 return tuple(keyidx)

~/pandas/pandas/core/indexing.py in _convert_to_indexer(self, obj, axis, is_setter)
1149 # if we are a label return me
1150 try:
-> 1151 return labels.get_loc(obj)
1152 except LookupError:
1153 if isinstance(obj, tuple) and isinstance(labels, MultiIndex):

~/pandas/pandas/tseries/tdi.py in get_loc(self, key, method, tolerance)
675 """
676
--> 677 if isnull(key):
678 key = tslib.NaT
679

~/pandas/pandas/core/generic.py in __nonzero__(self)
915 raise ValueError("The truth value of a {0} is ambiguous. "
916 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
--> 917 .format(self.__class__.__name__))
918
919 __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
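The last two frames of the traceback can be reproduced in isolation. The sketch below assumes only what the traceback shows: `isnull` applied to a Series returns another Series, and using that Series as an `if` condition raises the ValueError.

```python
import pandas as pd

cond = pd.Series([False, True, True])

message = None
try:
    # pd.isnull on a Series is element-wise, so the `if` has to
    # reduce a whole Series to a single bool, which pandas refuses to do
    if pd.isnull(cond):
        pass
except ValueError as exc:
    message = str(exc)

print(message)
```

So the boolean mask reaches `get_loc` as if it were a single label, and the error comes from the null check rather than from the lookup itself.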

Versions

Here is the output of pd.show_versions(). The issue was also reproduced on 64-bit Windows 7 with pandas 0.19.

INSTALLED VERSIONS

commit: f79bc7a
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-57-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: en_CA.UTF-8

pandas: 0.19.0+243.gf79bc7a
nose: None
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.11.2
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
pandas_datareader: None

@jreback
Contributor

jreback commented Dec 21, 2016

This is quite subtle. When a boolean array is passed, .get_loc() should raise a TypeError if it cannot handle the key (and so indexing hits a different path). However, it looks like .get_loc on a TimedeltaIndex gets this wrong.

This patch makes it work. If you could do a PR with some tests and a one-line note about why this is needed, that would be great. The basic issue is that .get_loc should raise a TypeError if it cannot handle the type of the key (rather than erroring with a ValueError, as it does in this case).

diff --git a/pandas/tseries/tdi.py b/pandas/tseries/tdi.py
index 1585aac..1e4986a 100644
--- a/pandas/tseries/tdi.py
+++ b/pandas/tseries/tdi.py
@@ -14,7 +14,8 @@ from pandas.types.common import (_TD_DTYPE,
                                  _ensure_int64)
 from pandas.types.missing import isnull
 from pandas.types.generic import ABCSeries
-from pandas.core.common import _maybe_box, _values_from_object
+from pandas.core.common import (_maybe_box, _values_from_object,
+                                is_bool_indexer)
 
 from pandas.core.index import Index, Int64Index
 import pandas.compat as compat
@@ -674,6 +675,9 @@ class TimedeltaIndex(DatetimeIndexOpsMixin, TimelikeOps, Int64Index):
         loc : int
         """
 
+        if is_bool_indexer(key):
+            raise TypeError
+
         if isnull(key):
             key = tslib.NaT
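To illustrate why the exception type matters, here is a toy sketch of the control flow described above. It is not the pandas implementation: `ToyIndex`, `is_bool_mask` and `convert_to_indexer` are hypothetical names, and the set of exceptions the caller catches is an assumption made for the example.

```python
import numpy as np


def is_bool_mask(key):
    """Hypothetical stand-in for is_bool_indexer: True for a boolean ndarray."""
    return isinstance(key, np.ndarray) and key.dtype == bool


class ToyIndex:
    """Toy label index; not a pandas class."""

    def __init__(self, labels):
        self._positions = {label: i for i, label in enumerate(labels)}

    def get_loc(self, key):
        # Reject keys that are not labels up front, with a TypeError
        if is_bool_mask(key):
            raise TypeError("a boolean mask is not a label")
        return self._positions[key]  # KeyError if the label is missing


def convert_to_indexer(index, key):
    """Try a label lookup first, then fall back to mask handling."""
    try:
        return index.get_loc(key)
    except (KeyError, TypeError):
        if is_bool_mask(key):
            # Boolean path: positions where the mask is True
            return np.flatnonzero(key)
        raise


idx = ToyIndex(["a", "b", "c"])
label_pos = convert_to_indexer(idx, "b")
mask_pos = convert_to_indexer(idx, np.array([True, False, True]))
print(label_pos, mask_pos.tolist())
```

If `get_loc` raised a ValueError for the mask instead, the `except` clause in this sketch would not catch it and the boolean path would never run, which mirrors the failure reported here.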

@jreback jreback added Bug Difficulty Novice Indexing Related to indexing on series/frames, not to indexes themselves Timedelta Timedelta data type labels Dec 21, 2016
@jreback jreback added this to the 0.20.0 milestone Dec 21, 2016
jdeschenes pushed a commit to jdeschenes/pandas that referenced this issue Dec 22, 2016
mroeschke added a commit to mroeschke/pandas that referenced this issue Jan 26, 2017
AnkurDedania pushed a commit to AnkurDedania/pandas that referenced this issue Mar 21, 2017
…dev#14946)

closes pandas-dev#14946

Author: Matt Roeschke <emailformattr@gmail.com>

Closes pandas-dev#15221 from mroeschke/fix_14946 and squashes the following commits:

b8ac04e [Matt Roeschke] BUG: TimedelaIndex raising ValueError when boolean indexing (pandas-dev#14946)