ENH: Make ExcelWriter & ExcelFile contextmanagers #4933

Merged
26 changes: 12 additions & 14 deletions doc/source/io.rst
@@ -1672,14 +1672,13 @@ The Panel class also has a ``to_excel`` instance method,
 which writes each DataFrame in the Panel to a separate sheet.

 In order to write separate DataFrames to separate sheets in a single Excel file,
-one can use the ExcelWriter class, as in the following example:
+one can pass an :class:`~pandas.io.excel.ExcelWriter`.

 .. code-block:: python

-   writer = ExcelWriter('path_to_file.xlsx')
-   df1.to_excel(writer, sheet_name='Sheet1')
-   df2.to_excel(writer, sheet_name='Sheet2')
-   writer.save()
+   with ExcelWriter('path_to_file.xlsx') as writer:
+       df1.to_excel(writer, sheet_name='Sheet1')
+       df2.to_excel(writer, sheet_name='Sheet2')

 .. _io.excel.writers:

@@ -1693,14 +1692,13 @@ Excel writer engines
 1. the ``engine`` keyword argument
 2. the filename extension (via the default specified in config options)

-By default ``pandas`` only supports
-`openpyxl <http://packages.python.org/openpyxl/>`__ as a writer for ``.xlsx``
-and ``.xlsm`` files and `xlwt <http://www.python-excel.org/>`__ as a writer for
-``.xls`` files. If you have multiple engines installed, you can change the
-default engine via the ``io.excel.xlsx.writer`` and ``io.excel.xls.writer``
-options.
+By default, ``pandas`` uses `openpyxl <http://packages.python.org/openpyxl/>`__
+for ``.xlsx`` and ``.xlsm`` files and `xlwt <http://www.python-excel.org/>`__
+for ``.xls`` files. If you have multiple engines installed, you can set the
+default engine through :ref:`setting the config options <basics.working_with_options>`
+``io.excel.xlsx.writer`` and ``io.excel.xls.writer``.

-For example if the optional `XlsxWriter <http://xlsxwriter.readthedocs.org>`__
+For example if the `XlsxWriter <http://xlsxwriter.readthedocs.org>`__
 module is installed you can use it as a xlsx writer engine as follows:

 .. code-block:: python
@@ -1712,8 +1710,8 @@ module is installed you can use it as a xlsx writer engine as follows:
    writer = ExcelWriter('path_to_file.xlsx', engine='xlsxwriter')

    # Or via pandas configuration.
-   from pandas import set_option
-   set_option('io.excel.xlsx.writer', 'xlsxwriter')
+   from pandas import options
+   options.io.excel.xlsx.writer = 'xlsxwriter'

    df.to_excel('path_to_file.xlsx', sheet_name='Sheet1')

2 changes: 2 additions & 0 deletions doc/source/release.rst
@@ -126,6 +126,8 @@ Improvements to existing features
   - ``read_json`` now raises a (more informative) ``ValueError`` when the dict
     contains a bad key and ``orient='split'`` (:issue:`4730`, :issue:`4838`)
   - ``read_stata`` now accepts Stata 13 format (:issue:`4291`)
+  - ``ExcelWriter`` and ``ExcelFile`` can be used as contextmanagers.
+    (:issue:`3441`, :issue:`4933`)

 API Changes
 ~~~~~~~~~~~
2 changes: 1 addition & 1 deletion pandas/core/frame.py
@@ -3453,7 +3453,7 @@ def unstack(self, level=-1):
         See also
         --------
         DataFrame.pivot : Pivot a table based on column values.
-        DataFrame.stack : Pivot a level of the column labels (inverse operation
+        DataFrame.stack : Pivot a level of the column labels (inverse operation
             from `unstack`).

         Examples
24 changes: 23 additions & 1 deletion pandas/io/excel.py
@@ -14,7 +14,7 @@
 from pandas import json
 from pandas.compat import map, zip, reduce, range, lrange, u, add_metaclass
 from pandas.core import config
-from pandas.core.common import pprint_thing, PandasError
+from pandas.core.common import pprint_thing
 import pandas.compat as compat
 from warnings import warn

@@ -260,6 +260,17 @@ def _parse_excel(self, sheetname, header=0, skiprows=None, skip_footer=0,
     def sheet_names(self):
         return self.book.sheet_names()

+    def close(self):
+        """close path_or_buf if necessary"""
+        if hasattr(self.path_or_buf, 'close'):
+            self.path_or_buf.close()
+
+    def __enter__(self):
+        return self
+
+    def __exit__(self, exc_type, exc_value, traceback):
+        self.close()
+

 def _trim_excel_header(row):
     # trim header row so auto-index inference works
@@ -408,6 +419,17 @@ def check_extension(cls, ext):
         else:
             return True

+    # Allow use as a contextmanager
+    def __enter__(self):
+        return self
+
+    def __exit__(self, exc_type, exc_value, traceback):
+        self.close()
+
+    def close(self):
+        """synonym for save, to make it more file-like"""
+        return self.save()
+

 class _OpenpyxlWriter(ExcelWriter):
     engine = 'openpyxl'
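
The pattern added to `ExcelWriter` in this hunk can be illustrated without pandas. What follows is a minimal sketch, not the pandas implementation: `SketchWriter`, `write_sheet` and the `saved` counter are hypothetical names, and `save()` only records that it was called. It shows `__exit__` delegating to `close()`, which is a synonym for `save()`, so leaving the `with` block saves once, including when the block raises.

```python
class SketchWriter(object):
    """Hypothetical stand-in for a writer that has a save() method."""

    def __init__(self, path):
        self.path = path
        self.sheets = []   # names of sheets "written" so far
        self.saved = 0     # number of times save() has run

    def write_sheet(self, name):
        self.sheets.append(name)

    def save(self):
        # A real writer would flush the workbook to self.path here.
        self.saved += 1

    def close(self):
        """Synonym for save, mirroring the method added in the diff."""
        return self.save()

    # Context manager protocol, same shape as in the diff.
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Returns None, so an exception raised in the block still propagates.
        self.close()


with SketchWriter('path_to_file.xlsx') as writer:
    writer.write_sheet('Sheet1')
    writer.write_sheet('Sheet2')

# save() ran exactly once, on exit from the with block.
saved_after_block = writer.saved

# close() also runs when the block raises; the error is not swallowed.
failing = SketchWriter('other.xlsx')
try:
    with failing:
        raise ValueError('boom')
except ValueError:
    propagated = True
else:
    propagated = False
```

One consequence of this shape is that a writer whose block fails partway still attempts a save on exit; whether that is desirable for a real Excel writer is a design question the sketch does not settle.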
27 changes: 27 additions & 0 deletions pandas/io/tests/test_excel.py
@@ -275,6 +275,18 @@ def test_xlsx_table(self):
         tm.assert_frame_equal(df4, df.ix[:-1])
         tm.assert_frame_equal(df4, df5)

+    def test_reader_closes_file(self):
+        _skip_if_no_xlrd()
+        _skip_if_no_openpyxl()
+
+        pth = os.path.join(self.dirpath, 'test.xlsx')
+        f = open(pth, 'rb')
+        with ExcelFile(f) as xlsx:
+            # parses okay
+            df = xlsx.parse('Sheet1', index_col=0)
+
+        self.assertTrue(f.closed)
+

 class ExcelWriterBase(SharedItems):
     # Base class for test cases to run with different Excel writers.
@@ -310,6 +322,21 @@ def test_excel_sheet_by_name_raise(self):

         self.assertRaises(xlrd.XLRDError, xl.parse, '0')

+    def test_excelwriter_contextmanager(self):
+        ext = self.ext
+        pth = os.path.join(self.dirpath, 'testit.{0}'.format(ext))
+
+        with ensure_clean(pth) as pth:
+            with ExcelWriter(pth) as writer:
+                self.frame.to_excel(writer, 'Data1')
+                self.frame2.to_excel(writer, 'Data2')
+
+            with ExcelFile(pth) as reader:
+                found_df = reader.parse('Data1')
+                found_df2 = reader.parse('Data2')
+                tm.assert_frame_equal(found_df, self.frame)
+                tm.assert_frame_equal(found_df2, self.frame2)
+
     def test_roundtrip(self):
         _skip_if_no_xlrd()
         ext = self.ext
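
The behaviour that `test_reader_closes_file` checks can be sketched with the standard library alone. This is an illustration under stated assumptions rather than pandas code: `SketchReader` is a hypothetical class that only stores its input in `path_or_buf`, the attribute that `ExcelFile.close` in the diff inspects, and it parses nothing. An in-memory `io.BytesIO` stands in for the opened `test.xlsx` handle.

```python
import io


class SketchReader(object):
    """Hypothetical reader that only remembers its input."""

    def __init__(self, path_or_buf):
        self.path_or_buf = path_or_buf

    def close(self):
        # Only file-like inputs have close(); plain path strings are skipped.
        if hasattr(self.path_or_buf, 'close'):
            self.path_or_buf.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()


buf = io.BytesIO(b'not a real workbook')
with SketchReader(buf) as reader:
    still_open = not buf.closed   # the handle is usable inside the block

closed_after = buf.closed         # and closed once the block exits

# A path string has no close(), so exiting the block is a no-op.
with SketchReader('test.xlsx') as reader:
    pass
```

The `hasattr` check is what lets the same `close()` serve both call styles: a caller who passes an open handle gets it closed on exit, while a caller who passes a path is unaffected.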