Can't save empty Series or DataFrame to hdf5 with HDFStore #1707

Closed
eriknw opened this issue Jul 30, 2012 · 1 comment
Labels
Bug IO Data IO issues that don't fit into a more specific label

eriknw commented Jul 30, 2012

With pandas 0.8.1 (and PyTables 2.3.1), trying to save an empty Series or DataFrame with HDFStore raises an exception after some (but not all) of the data has been written to the HDF5 file.

from pandas import DataFrame, Series, HDFStore

# These are all empty
s0 = Series()
s1 = Series(name='myseries')
df0 = DataFrame()
df1 = DataFrame(index=['a', 'b', 'c'])
df2 = DataFrame(columns=['d', 'e', 'f'])
store = HDFStore('myfile.h5')

# These all fail
try:
    store['s0'] = s0
except ValueError:
    print 'Failed to write s0'

try:
    store['s1'] = s1
except ValueError:
    print 'Failed to write s1'

try:
    store['df0'] = df0
except ValueError:
    print 'Failed to write df0'

try:
    store['df1'] = df1
except ValueError:
    print 'Failed to write df1'

try:
    store['df2'] = df2
except ValueError:
    print 'Failed to write df2'

Here is the traceback:

ValueError                                Traceback (most recent call last)
/usr/lib/python2.7/dist-packages/IPython/utils/py3compat.pyc in execfile(fname, *where)
    176             else:
    177                 filename = fname
--> 178             __builtin__.execfile(filename, *where)

/home/erikw/pandas_hdf5_fail.py in <module>()
     30 
     31 try:
---> 32     store['df2'] = df2
     33 except ValueError:
     34     print 'Failed to write df2'

/usr/lib/pymodules/python2.7/pandas/io/pytables.pyc in __setitem__(self, key, value)
    184 
    185     def __setitem__(self, key, value):
--> 186         self.put(key, value)
    187 
    188     def __contains__(self, key):

/usr/lib/pymodules/python2.7/pandas/io/pytables.pyc in put(self, key, value, table, append, compression)
    341         self._write_to_group(key, value, table=table, append=append,
--> 342                              comp=compression)
    343 
    344     def _get_handler(self, op, kind):

/usr/lib/pymodules/python2.7/pandas/io/pytables.pyc in _write_to_group(self, key, value, table, append, comp)
    408             wrapper = lambda value: handler(group, value)
    409 
--> 410         wrapper(value)
    411         group._v_attrs.pandas_type = kind
    412 

/usr/lib/pymodules/python2.7/pandas/io/pytables.pyc in <lambda>(value)
    406 
    407             handler = self._get_handler(op='write', kind=kind)
--> 408             wrapper = lambda value: handler(group, value)
    409 
    410         wrapper(value)

/usr/lib/pymodules/python2.7/pandas/io/pytables.pyc in _write_frame(self, group, df)
    489 
    490     def _write_frame(self, group, df):
--> 491         self._write_block_manager(group, df._data)
    492 
    493     def _read_frame(self, group, where=None):

/usr/lib/pymodules/python2.7/pandas/io/pytables.pyc in _write_block_manager(self, group, data)
    500         group._v_attrs.ndim = data.ndim
    501         for i, ax in enumerate(data.axes):
--> 502             self._write_index(group, 'axis%d' % i, ax)
    503 
    504         # Supporting mixed-type DataFrame objects...nontrivial

/usr/lib/pymodules/python2.7/pandas/io/pytables.pyc in _write_index(self, group, key, index)
    571         else:
    572             if len(index) == 0:
--> 573                 raise ValueError('Can not write empty structure, '
    574                                  'axis length was 0')
    575 

ValueError: Can not write empty structure, axis length was 0
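
Reading the traceback, the check appears to run once per axis: `_write_block_manager` loops over `data.axes`, and `_write_index` raises as soon as it meets an axis of length 0. Here is a minimal sketch of that logic, using plain lists as stand-ins for the axes. The function name `first_empty_axis` and the axis layouts in `examples` are illustrative assumptions, not pandas internals.

```python
def first_empty_axis(axes):
    """Return the position of the first zero-length axis, or None."""
    for i, ax in enumerate(axes):
        if len(ax) == 0:
            return i
    return None


# Stand-ins for the axes of the objects in the report. Assumed layout:
# a Series has one axis; a DataFrame has a column axis, then a row axis.
examples = [
    ("s0", [[]]),
    ("df0", [[], []]),
    ("df1", [[], ["a", "b", "c"]]),
    ("df2", [["d", "e", "f"], []]),
    ("non-empty", [["x"], [0, 1]]),
]

for name, axes in examples:
    pos = first_empty_axis(axes)
    if pos is None:
        print("%s: would be written" % name)
    else:
        print("%s: rejected, axis%d has length 0" % (name, pos))
```

Under these assumptions every object in the report has at least one zero-length axis, which would explain why all five writes fail, even `df1` and `df2`, which each carry labels on one side.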

The partial write also leaves the store in an inconsistent state: the keys are listed, but the objects cannot be read back:

In [6]: store.keys()
Out[6]: ['s1', 's0', 'df1', 'df0', 'df2']

In [7]: store['df0']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-7-b5d10da56de7> in <module>()
----> 1 store['df0']

/usr/lib/pymodules/python2.7/pandas/io/pytables.pyc in __getitem__(self, key)
    181 
    182     def __getitem__(self, key):
--> 183         return self.get(key)
    184 
    185     def __setitem__(self, key, value):

/usr/lib/pymodules/python2.7/pandas/io/pytables.pyc in get(self, key)
    283             return self._read_group(group)
    284         except (exc_type, AttributeError):
--> 285             raise KeyError('No object named %s in the file' % key)
    286 
    287     def select(self, key, where=None):

KeyError: 'No object named df0 in the file'
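
One possible workaround for the leftover keys is to roll back a key when its write fails. The sketch below assumes only that the store supports `store[key] = value`, `key in store`, and `remove(key)`. The helper `put_or_rollback` is hypothetical, and `HalfWritingStore` is an in-memory stand-in that imitates the behavior shown above, not the real HDFStore. Whether `remove` can clean up a half-written group in pandas 0.8.1 is untested here.

```python
class HalfWritingStore(object):
    """In-memory stand-in: registers the key before validating the value."""

    def __init__(self):
        self._data = {}

    def __setitem__(self, key, value):
        self._data[key] = None  # the key becomes visible first
        if len(value) == 0:
            raise ValueError("Can not write empty structure, "
                             "axis length was 0")
        self._data[key] = value

    def __contains__(self, key):
        return key in self._data

    def remove(self, key):
        del self._data[key]

    def keys(self):
        return list(self._data)


def put_or_rollback(store, key, value):
    """Write value under key; on ValueError drop the new key, return False."""
    existed = key in store
    try:
        store[key] = value
    except ValueError:
        # Only remove keys that this call created.
        if not existed and key in store:
            store.remove(key)
        return False
    return True


store = HalfWritingStore()
print(put_or_rollback(store, "ok", [1, 2, 3]))  # True
print(put_or_rollback(store, "empty", []))      # False
print(store.keys())                             # ['ok']
```

With this helper a failed write leaves no dangling key behind, so `store.keys()` only lists objects that can actually be read.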

Oh, and just for the record, all tests in "io/tests/test_pytables.py" succeed for me.

wesm closed this as completed in 603e5ae Aug 12, 2012

wesm commented Aug 12, 2012

Fixed this, though it required a bit of a hack job (PyTables doesn't like zero-length objects).
