inconsistent results building a DataFrame from a dict of Series with MultiIndexes #1727

Closed
grsr opened this issue Aug 3, 2012 · 1 comment

grsr commented Aug 3, 2012

I am trying to build a DataFrame from a dict of Series objects whose MultiIndex indices do not necessarily match exactly. Sometimes I get no results for a particular data point, so I create an empty Series and add it to the dict, because I need the column to be present even without any data (later I convert all NAs to 0). This works sometimes, but other times I get an error message implying that all the Series need hierarchical indices. Which outcome I get seems to depend on the order in which the Series are added to the dict. See the example session below: the first time I create the DataFrame it behaves just as I'd like, giving me a df index that is the union of all the indices in the populated Series and supplying NaNs wherever there is no data, but the second time it blows up. Perhaps I shouldn't be relying on this behaviour, but the results should at least be consistent.

Any tips on how to solve this in a cleaner way also very welcome. Thanks.

In [292]: pandas.__version__
Out[292]: '0.8.1'

In [293]: s1 = Series([1,2,3,4], index=MultiIndex.from_tuples([(1,2),(1,3),(2,2),(2,4)]))

In [294]: s2 = Series([1,2,3,4], index=MultiIndex.from_tuples([(1,2),(1,3),(3,2),(3,4)]))

In [295]: s3 = Series()

In [296]: df = DataFrame.from_dict({'foo':s1, 'bar':s2, 'baz':s3})

In [297]: df
Out[297]: 
     bar  baz  foo
1 2    1  NaN    1
  3    2  NaN    2
2 2  NaN  NaN    3
  4  NaN  NaN    4
3 2    3  NaN  NaN
  4    4  NaN  NaN

In [298]: df = DataFrame.from_dict({'foo':s1, 'baz':s3, 'bar':s2})

... stacktrace

TypeError: can only call with other hierarchical index objects
@wesm wesm closed this as completed in 29846da Aug 12, 2012

wesm commented Aug 12, 2012

Fixed so this works. Thanks.
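
On the "cleaner way" asked about in the report: one option that does not depend on dict order is to build the frame from the populated Series only, then add the empty columns afterwards with `reindex`. This is a minimal sketch, assuming a recent pandas imported as `pd`. The names `series_by_name` and `populated` are illustrative and do not come from the report, and this is a workaround, not the change made in 29846da.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3, 4],
               index=pd.MultiIndex.from_tuples([(1, 2), (1, 3), (2, 2), (2, 4)]))
s2 = pd.Series([1, 2, 3, 4],
               index=pd.MultiIndex.from_tuples([(1, 2), (1, 3), (3, 2), (3, 4)]))
s3 = pd.Series(dtype=float)  # placeholder for a data point with no results

# Hypothetical container; the empty Series is deliberately placed in the middle.
series_by_name = {"foo": s1, "baz": s3, "bar": s2}

# Build the frame from the non-empty Series only, so every input
# carries a MultiIndex and the row index is the union of those indices.
populated = {name: s for name, s in series_by_name.items() if not s.empty}
df = pd.DataFrame(populated)

# Add the missing columns back (all NaN), then convert NaNs to 0.
df = df.reindex(columns=sorted(series_by_name)).fillna(0)
```

Because the empty Series never reaches the constructor, the outcome no longer depends on where it sits in the dict.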
