A DataFrame with a MultiIndex of several levels can be indexed even when one or more of those levels hold a compound type. However, it is not possible to index a DataFrame whose regular Index holds a compound type, nor a DataFrame whose MultiIndex has a single level holding a compound type.
tl;dr - I can't index a DataFrame with a namedtuple, even though I can create one.
In the first example, I try to index a DataFrame that has a regular Index using a namedtuple, which fails.
In the second example, I index a DataFrame that has a MultiIndex using a tuple of namedtuples, which succeeds.
In the third example, I try to index a DataFrame that has a single-level MultiIndex, first using a namedtuple and then a length-1 tuple of namedtuples, which fails.
from collections import namedtuple
import pandas
# First example
"""
>>> IndexType = namedtuple("IndexType", ["a", "b"])
>>> idx1 = IndexType("foo", "bar")
>>> idx2 = IndexType("baz", "bof")
>>> index = pandas.Index([idx1, idx2], name="composite_index")
>>> index
Index([IndexType(a='foo', b='bar'), IndexType(a='baz', b='bof')], dtype=object)
>>> df = pandas.DataFrame([(1, 2), (3, 4)], index=index, columns=["A", "B"])
>>> df
A B
composite_index..................
IndexType(a='foo', b='bar') 1 2
IndexType(a='baz', b='bof') 3 4
>>> df.ix[IndexType("foo", "bar")]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
return self._getitem_tuple(key)
File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
return self._getitem_lowerdim(tup)
File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
section = self._getitem_axis(key, axis=i)
File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
return self._get_label(idx, axis=0)
File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
return self.obj.xs(label, axis=axis, copy=True)
File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
loc = self.index.get_loc(key)
File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
return self._engine.get_loc(key)
File "engines.pyx", line 101, in pandas._engines.DictIndexEngine.get_loc (pandas/src/engines.c:2498)
File "engines.pyx", line 108, in pandas._engines.DictIndexEngine.get_loc (pandas/src/engines.c:2460)
KeyError: 'foo'
"""
# Second example
"""
>>> mult_index = pandas.MultiIndex.from_tuples([(idx1, idx2)], names=["comp_1", "comp_2"])
>>> mult_index
MultiIndex([(IndexType(a='foo', b='bar'), IndexType(a='baz', b='bof'))], dtype=object)
>>> df = pandas.DataFrame([(1, 2, 3, 4)], index=mult_index, columns=["A", "B", "C", "D"])
>>> df
A B C D
comp_1 comp_2.................................
IndexType(a='foo', b='bar') IndexType(a='baz', b='bof') 1 2 3 4
>>> df.ix[(IndexType("foo", "bar"), IndexType("baz", "bof"))]
A 1
B 2
C 3
D 4
Name: (IndexType(a='foo', b='bar'), IndexType(a='baz', b='bof'))
"""
# Third example
"""
>>> index = pandas.MultiIndex.from_tuples([(IndexType("foo", "bar"),), (IndexType("baz", "bof"),)], names=["ind#
>>> index
Index([IndexType(a='foo', b='bar'), IndexType(a='baz', b='bof')], dtype=object)
>>> df = pandas.DataFrame([(1, 2), (3, 4)], index=index, columns=["A", "B"])
>>> df
A B
index............................
IndexType(a='foo', b='bar') 1 2
IndexType(a='baz', b='bof') 3 4
>>> df.ix[IndexType("foo", "bar")]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
return self._getitem_tuple(key)
File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
return self._getitem_lowerdim(tup)
File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
section = self._getitem_axis(key, axis=i)
File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
return self._get_label(idx, axis=0)
File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
return self.obj.xs(label, axis=axis, copy=True)
File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
loc = self.index.get_loc(key)
File "/Network/Cluster/home/echlebek/.virtualenvs/pandas/lib/python2.6/site-packages/pandas-0.7.3.dev_3d4d5af#
return self._engine.get_loc(key)
File "engines.pyx", line 101, in pandas._engines.DictIndexEngine.get_loc (pandas/src/engines.c:2498)
File "engines.pyx", line 108, in pandas._engines.DictIndexEngine.get_loc (pandas/src/engines.c:2460)
KeyError: 'foo'
>>> df.ix[(IndexType("foo", "bar"),)]
A B
foo NaN NaN
bar NaN NaN
"""
First of all, we realized the above test is reporting a false positive because of #1069.
Secondly, an additional problem lies here. In particular, _is_list_like prevents using any iterable object as an Index key.
At this point, it's a question of where you want to go with the indexing interface. I think it might be reasonable to limit the types (aside from Index itself) used for supplying index sequences to, say, tuple, list and numpy.array. The upside is not having to think about adding more exceptions (currently there's basestring, plus, in our case, a tuple subclass); the downside is not supporting arbitrary iterables such as generators. I would personally be in favour of the former because it is the simpler of the two in the long run, both in internal logic and in behaviour.
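To make the trade-off concrete, here is a minimal sketch of the two approaches. It is not pandas code: `is_list_like_duck`, `is_list_like_strict` and `SEQUENCE_TYPES` are hypothetical names made up for illustration, and the exact-type comparison in the strict version is my assumption about how a whitelist would have to work so that tuple subclasses such as namedtuples count as single keys. A real whitelist would also include numpy.ndarray, which is left out here to keep the sketch dependency-free.

```python
from collections import namedtuple

# Hypothetical whitelist of types accepted as a *sequence of keys*.
# numpy.ndarray would be added here in a real setting.
SEQUENCE_TYPES = (tuple, list)


def is_list_like_duck(obj):
    """Duck-typed check: any iterable counts, minus explicit exceptions."""
    return hasattr(obj, "__iter__") and not isinstance(obj, (str, bytes))


def is_list_like_strict(obj):
    """Whitelist check: only exact whitelisted types count as sequences.

    Comparing with type() rather than isinstance() is an assumption; it is
    what lets a tuple subclass be treated as a single key.
    """
    return type(obj) in SEQUENCE_TYPES


IndexType = namedtuple("IndexType", ["a", "b"])
key = IndexType("foo", "bar")

# A namedtuple is a tuple subclass and is iterable, so the duck-typed
# check treats it as a sequence of keys ('foo', 'bar').
print(isinstance(key, tuple), is_list_like_duck(key))

# The strict check treats the namedtuple as one key, while a plain
# length-1 tuple wrapping it is still a sequence.
print(is_list_like_strict(key), is_list_like_strict((key,)))

# The cost: arbitrary iterables such as generators are no longer accepted.
print(is_list_like_strict(x for x in range(3)))
```

With the strict version there is no growing list of exceptions to maintain, at the price of callers having to wrap generators in list() themselves.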