Problem when setting values based on MultiIndex subset #1537

gerigk · 2012-06-27T10:40:43Z

from pandas import *
test = read_csv('/home/arthur/transform_issue.csv')
x =test.groupby(['A','B','C'])['revenues'].first().index
test.set_index(['A','B','C'], inplace=True)
test.ix[x]['revenues']= 999.99

print test.ix[x].shape
print test[test.revenues==999.99]

(127, 5)
Empty DataFrame
Columns: array([D, E, week, revenues, orders], dtype=object)
Index: array([], dtype=object)

I then tried setting via df[col][indexes], which simply crashed my session without raising any exception:

test = read_csv('/home/arthur/transform_issue.csv')
x =test.groupby(['A','B','C'])['revenues'].first().index
test.set_index(['A','B','C'], inplace=True)
test['revenues'][x]= 999.99

print test.ix[x].shape
print test[test.revenues==999.99]

I sent the csv via email to wes@lambdafoundry.com


gerigk · 2012-06-27T12:32:55Z

One more way that fails:

from pandas import *
test = read_csv('/home/arthur/transform_issue.csv')
x =test.groupby(['A','B','C'])['revenues'].first().index
test.set_index(['A','B','C'], inplace=True)
test.ix[x, 'revenues']= 999.99

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-5-62f629907fd1> in <module>()
      3 x =test.groupby(['A','B','C'])['revenues'].first().index
      4 test.set_index(['A','B','C'], inplace=True)
----> 5 test.ix[x, 'revenues']= 999.99

/usr/local/lib/python2.7/dist-packages/pandas-0.8.0rc2-py2.7-linux-x86_64.egg/pandas/core/indexing.pyc in __setitem__(self, key, value)
     62                 raise IndexingError('only tuples of length <= %d supported',
     63                                     self.ndim)
---> 64             indexer = self._convert_tuple(key)
     65         else:
     66             indexer = self._convert_to_indexer(key)

/usr/local/lib/python2.7/dist-packages/pandas-0.8.0rc2-py2.7-linux-x86_64.egg/pandas/core/indexing.pyc in _convert_tuple(self, key)
     71         keyidx = []
     72         for i, k in enumerate(key):
---> 73             idx = self._convert_to_indexer(k, axis=i)
     74             keyidx.append(idx)
     75         return tuple(keyidx)

/usr/local/lib/python2.7/dist-packages/pandas-0.8.0rc2-py2.7-linux-x86_64.egg/pandas/core/indexing.pyc in _convert_to_indexer(self, obj, axis)
    370                 # this is not the most robust, but...
    371                 if (isinstance(labels, MultiIndex) and
--> 372                     not isinstance(objarr[0], tuple)):
    373                     level = 0
    374                     _, indexer = labels.reindex(objarr, level=level)

IndexError: index 0 is out of bounds for axis 0 with size 0

gerigk · 2012-06-28T13:13:54Z

In case anybody wonders what that would be good for:

Say I have Gender, Date, Value as columns

I want to compute the pct_change of Value, and unfortunately grouped.agg(lambda x: x.pct_change()) is very slow.
I expect calling
df['delta'] = df['Value'].pct_change()
ind = df.groupby('Gender').first().index
df.ix[ind, 'delta'] = np.nan
to be faster.
Since groupby drops dates without observations of "Value", this is the smartest solution I found (otherwise I could set df.delta[df.Date == min(df.Date)] = np.nan).
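
For readers on current pandas, the idea described above (compute pct_change over the whole column, then blank out the first row of each group) can be sketched as follows. This is only an illustration under stated assumptions: the sample data is made up, the frame is assumed to be sorted by Gender and Date, and the first row of each group is found with groupby(...).cumcount() and written with .loc rather than the .ix indexer used in the thread (.ix has since been removed from pandas).

```python
import numpy as np
import pandas as pd

# Made-up sample data; the columns follow the Gender/Date/Value example above.
df = pd.DataFrame(
    {
        "Gender": ["F", "F", "F", "M", "M"],
        "Date": pd.to_datetime(
            ["2012-01-01", "2012-01-02", "2012-01-03", "2012-01-01", "2012-01-02"]
        ),
        "Value": [10.0, 11.0, 12.1, 20.0, 25.0],
    }
)

# Assumption: rows must be contiguous per group and ordered by date.
df = df.sort_values(["Gender", "Date"]).reset_index(drop=True)

# Percentage change over the whole column, ignoring group boundaries.
df["delta"] = df["Value"].pct_change()

# The first row of each group has no valid predecessor, so mask it.
first_rows = df.groupby("Gender").cumcount() == 0
df.loc[first_rows, "delta"] = np.nan
```

The boolean mask avoids relying on index labels at all, which sidesteps the MultiIndex lookup that this issue is about. Current pandas also has a pct_change method on groupby objects, which may make the workaround unnecessary.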

wesm · 2012-06-28T22:05:22Z

I'll take a look, thanks

wesm · 2012-06-29T00:01:12Z

Thanks for the report. I fixed these issues, which were caused by some recent internal changes in MultiIndex.

gerigk · 2012-06-29T08:04:13Z

from pandas import *
test = read_csv('/home/arthur/transform_issue.csv')
x =test.groupby(['A','B','C'])['revenues'].first().index
test.set_index(['A','B','C'], inplace=True)
test.ix[x]['revenues']= 999.99

print test.ix[x].shape
print test[test.revenues==999.99]

This still doesn't work.

The other two ways of assigning are working now. Is this way supposed to work, too? No exception or warning is raised.

wesm · 2012-06-29T14:40:56Z

That won't work because test.ix[x] produces a copy, so the assignment modifies that copy rather than test.
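
The difference between the two-step form and the single-step form can be illustrated with a small sketch. It is written against current pandas, where .ix no longer exists and .loc is the label-based indexer, so it is an analogue of the code in this thread rather than a reproduction of it. The data and the keys list are hypothetical stand-ins for the CSV, which is not available here.

```python
import warnings

import pandas as pd

# Hypothetical stand-in for the CSV from the report.
df = pd.DataFrame(
    {
        "A": [1, 1, 2, 2],
        "B": ["x", "y", "x", "y"],
        "C": [10, 20, 30, 40],
        "revenues": [1.0, 2.0, 3.0, 4.0],
    }
).set_index(["A", "B", "C"])

# A subset of the MultiIndex, given as a list of full key tuples.
keys = [(1, "x", 10), (2, "y", 40)]

# Two-step form: df.loc[keys] builds a new object, so the assignment
# lands on that object and df itself is left untouched.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # some pandas versions warn here
    subset = df.loc[keys]
    subset["revenues"] = 999.99
changed_by_two_step = bool((df["revenues"] == 999.99).any())

# Single-step form: rows and column go into one indexing call, so the
# assignment is applied to df directly.
df.loc[keys, "revenues"] = 999.99
changed_by_single_step = int((df["revenues"] == 999.99).sum())
```

Here changed_by_two_step stays False, while the single-step assignment updates exactly the two selected rows. This matches the behaviour reported above, where test.ix[x, 'revenues'] = 999.99 works after the fix and test.ix[x]['revenues'] = 999.99 silently does nothing.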


wesm added a commit that referenced this issue Jun 29, 2012

BUG: fix MultiIndex indexing issues in #1537, python 2.5 api fix

53ae1d5

wesm closed this as completed Jun 29, 2012
