0.12.0 concat() error #4583

Closed
bluefir opened this issue Aug 16, 2013 · 8 comments

Comments

bluefir commented Aug 16, 2013

I run the following code on a list of DataFrames:

columns = None
for portfolio_data in portfolio_data_list:
    print(portfolio_data.shape)
    if columns is None:
        columns = portfolio_data.columns
    else:
        print(all(portfolio_data.columns == columns))
portfolio_data = pd.concat(portfolio_data_list)

and get the following:

(15494, 12)
(15534, 12)
True
(15584, 12)
True
(15557, 12)
True
(15555, 12)
True
(15555, 12)
True
(15600, 12)
True
Traceback (most recent call last):
  File "M:/Projects/PortfolioAnalytics/Code/Python/attribution/TopContributors.py", line 1028, in <module>
    portfolio_data = pd.concat(portfolio_data_list)
  File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 878, in concat
    verify_integrity=verify_integrity)
  File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 929, in __init__
    obj.consolidate(inplace=True)
  File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 770, in consolidate
    self._consolidate_inplace()
  File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 752, in _consolidate_inplace
    self._data = self._protect_consolidate(f)
  File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 790, in _protect_consolidate
    result = f()
  File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 751, in <lambda>
    f = lambda: self._data.consolidate()
  File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 1638, in consolidate
    return BlockManager(new_blocks, self.axes)
  File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 1001, in __init__
    self._verify_integrity()
  File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 1245, in _verify_integrity
    tot_items))
AssertionError: Number of manager items must equal union of block items
# manager items: 12, # tot_items: 11

Process finished with exit code 1

What's going wrong?

Contributor

jreback commented Aug 16, 2013

Try with ignore_index=True. Also, post the output of df.info() for one of these frames.
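
For readers following along, here is a minimal sketch of what these two suggestions look like in practice. The two small frames are made-up stand-ins for the reporter's portfolio_data_list, and the column names are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-ins for the frames in portfolio_data_list.
frames = [
    pd.DataFrame({"weight": [0.6, 0.4]}),
    pd.DataFrame({"weight": [0.7, 0.3]}),
]

# ignore_index=True discards each frame's own row labels and
# gives the result a fresh 0..n-1 index.
combined = pd.concat(frames, ignore_index=True)

# info() prints the per-column dtypes and non-null counts,
# which is the diagnostic output requested here.
frames[0].info()
```

Comparing the info() output across frames is a quick way to spot a column whose dtype differs from one frame to the next.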

Author

bluefir commented Aug 16, 2013

The trouble was caused by using DataFrame.convert_objects(copy=False). Somehow it screws up the DataFrame. Using convert_objects(), i.e. creating a copy, works just fine. So the problem is with convert_objects(copy=False), not with concat().

bluefir closed this as completed Aug 16, 2013

Contributor

jreback commented Aug 16, 2013

Yes, don't do that! That's really an internal flag, and in practice there is no need to use it. Maybe I'll take it out.

Author

bluefir commented Aug 16, 2013

Please! It's confusing. Any chance of getting an inplace= flag in convert_objects()? For big frames, that can save quite a bit of memory.

Contributor

jreback commented Aug 16, 2013

That's a bad idea for a lot of reasons.
You should normally never even need this method.
What are you doing?

Author

bluefir commented Aug 16, 2013

Haha. Well, I have bool columns that happen to contain NaNs and, as such, become objects. First of all, trying to save them into an hdf table would fail. Second, for my purposes NaN can be treated as False. So, I fill NaNs with False. However, dtype for the column would still remain object unless I do convert_objects() and get my bool column type.
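
The situation described above can be sketched in a few lines. One caveat: convert_objects() was later deprecated and removed from pandas, so this sketch swaps in fillna() followed by astype(bool) rather than the call used in this thread. The sample values are invented for illustration.

```python
import numpy as np
import pandas as pd

# A boolean column with a missing value cannot be stored as bool,
# so pandas falls back to the generic object dtype.
raw = pd.Series([True, False, np.nan])
print(raw.dtype)

# Treat NaN as False, then cast explicitly to get a real bool column.
# (The dtype right after fillna() varies between pandas versions,
# which is why the explicit astype(bool) is kept.)
cleaned = raw.fillna(False).astype(bool)
print(cleaned.dtype)
```

The explicit cast plays the role that convert_objects() plays in the comment above: it turns the filled object column back into a proper bool column.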

Contributor

jreback commented Aug 16, 2013

OK, in that case just call convert_objects on the column itself and assign the result back to the frame.
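
A rough sketch of this column-by-column approach, under some assumptions: the frame, its column names, and the helper is_bool_with_gaps are all hypothetical, and because convert_objects() no longer exists in current pandas the conversion uses fillna(False).astype(bool) instead.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: "is_top" holds booleans with a gap, so it is object dtype.
df = pd.DataFrame({
    "ticker": ["AAA", "BBB", "CCC"],
    "is_top": [True, np.nan, False],
    "weight": [0.5, 0.3, 0.2],
})

def is_bool_with_gaps(col):
    """Hypothetical helper: True for an object column whose non-null values are all bools."""
    non_null = col.dropna()
    return (
        col.dtype == object
        and len(non_null) > 0
        and all(isinstance(v, (bool, np.bool_)) for v in non_null)
    )

# Convert only the affected columns and assign each result back to the frame;
# the other columns are left untouched.
for name in df.columns:
    if is_bool_with_gaps(df[name]):
        df[name] = df[name].fillna(False).astype(bool)

print(df.dtypes)
```

Only the columns that need fixing are rewritten, so the rest of the frame is never copied or reinterpreted.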

Author

bluefir commented Aug 16, 2013

Ok. Thanks!
