Fix unbound local with bad engine #16511

Merged
@jreback merged 3 commits into pandas-dev:master from the fix-unbound-local-with-bad-engine branch
May 31, 2017

Contributor

@jtratner jtratner commented May 26, 2017

This was so small I figured simpler to put up a PR rather than issue then PR. :)

Previously, passing a bad engine to read_csv gave a less-than-informative UnboundLocalError:

Traceback (most recent call last):
  File "example_test.py", line 9, in <module>
    pd.read_csv(tfp.name, engine='pyt')
  File "/Users/jtratner/pandas/pandas/io/parsers.py", line 655, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/Users/jtratner/pandas/pandas/io/parsers.py", line 405, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/Users/jtratner/pandas/pandas/io/parsers.py", line 762, in __init__
    self._make_engine(self.engine)
  File "/Users/jtratner/pandas/pandas/io/parsers.py", line 972, in _make_engine
    self._engine = klass(self.f, **self.options)
UnboundLocalError: local variable 'klass' referenced before assignment
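
The error above is what Python raises when an if/elif chain assigns a local name only on recognized branches and the code then reads that name unconditionally. A minimal, self-contained sketch of that failure mode follows. The three classes and `make_engine_unchecked` are hypothetical stand-ins for illustration, not the pandas source.

```python
class CParser(object):
    """Hypothetical stand-in for the C-engine parser class."""


class PythonParser(object):
    """Hypothetical stand-in for the Python-engine parser class."""


class FixedWidthFieldParser(object):
    """Hypothetical stand-in for the fixed-width parser class."""


def make_engine_unchecked(engine='c'):
    # No else branch: an unrecognized name leaves `klass` unassigned.
    if engine == 'c':
        klass = CParser
    elif engine == 'python':
        klass = PythonParser
    elif engine == 'python-fwf':
        klass = FixedWidthFieldParser
    # Reading `klass` here fails when no branch above matched.
    return klass()


caught = None
try:
    make_engine_unchecked('pyt')
except UnboundLocalError as exc:
    caught = exc
    print(type(exc).__name__)
```

The exact wording of the message differs between Python versions, so the sketch only reports the exception type.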

Now it gives a much nicer ValueError:

Traceback (most recent call last):
  File "example_test.py", line 9, in <module>
    pd.read_csv(fp, engine='pyt')
  File "/Users/jtratner/pandas/pandas/io/parsers.py", line 655, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/Users/jtratner/pandas/pandas/io/parsers.py", line 405, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/Users/jtratner/pandas/pandas/io/parsers.py", line 762, in __init__
    self._make_engine(self.engine)
  File "/Users/jtratner/pandas/pandas/io/parsers.py", line 974, in _make_engine
    ' or "python-fwf")' % engine)
ValueError: Unknown engine: 'pyt' (valid are "c", "python", or "python-fwf")
  • tests added / passed - added test that correct ValueError is generated
  • passes git diff upstream/master --name-only -- '*.py' | flake8 --diff
  • whatsnew entry

I was not sure where to stick the test or the whatsnew entry (or if a whatsnew entry is really necessary), so please tell me if I should move it elsewhere.

Cheers!

@jtratner
Contributor Author

(or if you want to do this a totally different way / different error I can make changes or close)

@jtratner jtratner force-pushed the fix-unbound-local-with-bad-engine branch from 18c3012 to 6ba36e7 Compare May 26, 2017 05:51
@@ -83,6 +83,8 @@ Performance Improvements
Bug Fixes
~~~~~~~~~

- Passing an invalid engine to `read_csv` now raises an informative ValueError rather than UnboundLocalError. (:issue:`16511`)
Contributor

:func:`read_csv` is better, and use double backticks on ValueError.

Contributor

You can put this in 0.20.2.

@@ -99,3 +102,14 @@ def read_table(self, *args, **kwds):
        kwds = kwds.copy()
        kwds['engine'] = self.engine
        return read_table(*args, **kwds)


class TestParameterValidation(object):
Contributor

use tm.ensure_clean() as path

Member

@gfyoung gfyoung May 26, 2017

Let's move this test into common.py (in same directory). This base test class should not be touched for organizational purposes.

Member

@gfyoung gfyoung May 26, 2017

Also, why are we writing a round trip test? This can be much simpler:

data = "a\n1"
msg = "Unknown engine"
with tm.assert_raises_regex(ValueError, msg):
    # don't use self.read_csv because that will override the engine parameter
    read_csv(StringIO(data), engine='pyt')

Oh and yes, use tm.assert_raises_regex instead of the pytest.raises(...) (pandas regex error message matching is a little more compact).
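
The suggestion above replaces a write-then-read round trip with an in-memory buffer and a regex match on the error message. A rough, standard-library-only sketch of the same shape is below. It uses `unittest`'s `assertRaisesRegex` in place of `tm.assert_raises_regex`, and `read_csv_stub` is a hypothetical stand-in that only validates the engine name, so nothing here should be read as the pandas test itself.

```python
import unittest
from io import StringIO


def read_csv_stub(buf, engine='c'):
    """Hypothetical stand-in for read_csv: validates `engine`, then reads lines."""
    if engine not in ('c', 'python', 'python-fwf'):
        raise ValueError('Unknown engine: {!r}'.format(engine))
    return buf.read().splitlines()


class TestUnknownEngine(unittest.TestCase):
    def test_unknown_engine(self):
        # An in-memory buffer is enough; no temporary file is needed.
        data = "a\n1"
        msg = "Unknown engine"
        with self.assertRaisesRegex(ValueError, msg):
            read_csv_stub(StringIO(data), engine='pyt')


# Run the single test case programmatically and keep the result object.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestUnknownEngine)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Matching only on the "Unknown engine" prefix keeps the test stable if the list of valid options in the message is reworded later.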

Contributor Author

docstring for tm.assert_raises_regex says to use pytest.raises

Contributor Author

so okay that this will get run once for every engine, even though it's the same test?

Member

Yikes! You're right. We changed our minds about that. Mind fixing the documentation on that in a separate commit / PR?

Member

> so okay that this will get run once for every engine, even though it's the same test?

Actually, better idea: move it to test_common.py (the directory above)

Contributor

@jreback jreback left a comment

lgtm. minor comments.

    def test_unknown_engine(self):
        with tempfile.NamedTemporaryFile() as fp:
            df = tm.makeDataFrame()
            df.to_csv(fp.name)
Contributor

@gfyoung good location for this type of test?

Member

No. There shouldn't be any tests in this file. I made a comment here about it.

@jreback jreback added the labels Error Reporting (Incorrect or improved errors from pandas) and IO CSV (read_csv, to_csv) May 26, 2017
@jtratner jtratner force-pushed the fix-unbound-local-with-bad-engine branch from 6ba36e7 to 26a3300 Compare May 26, 2017 15:00
@jtratner
Contributor Author

made all the requested changes - thanks for the review @jreback !

@codecov

codecov bot commented May 26, 2017

Codecov Report

Merging #16511 into master will decrease coverage by 0.36%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #16511      +/-   ##
==========================================
- Coverage   90.79%   90.43%   -0.37%     
==========================================
  Files         161      161              
  Lines       51063    51046      -17     
==========================================
- Hits        46363    46162     -201     
- Misses       4700     4884     +184
Flag Coverage Δ
#multiple 88.27% <100%> (-0.37%) ⬇️
#single 40.16% <0%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/io/parsers.py 95.33% <100%> (-0.33%) ⬇️
pandas/io/formats/excel.py 74.24% <0%> (-22.41%) ⬇️
pandas/io/excel.py 62.31% <0%> (-18.33%) ⬇️
pandas/conftest.py 95.83% <0%> (-0.6%) ⬇️
pandas/util/testing.py 80.79% <0%> (-0.2%) ⬇️
pandas/core/series.py 94.71% <0%> (-0.19%) ⬇️
pandas/core/generic.py 92.16% <0%> (-0.1%) ⬇️
pandas/core/resample.py 96.08% <0%> (-0.02%) ⬇️
pandas/core/reshape/pivot.py 95.08% <0%> (ø) ⬆️
... and 4 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b0d9ee0...26a3300.

@codecov

codecov bot commented May 26, 2017

Codecov Report

Merging #16511 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #16511      +/-   ##
==========================================
+ Coverage   90.79%   90.79%   +<.01%     
==========================================
  Files         161      161              
  Lines       51063    51064       +1     
==========================================
+ Hits        46363    46366       +3     
+ Misses       4700     4698       -2
Flag Coverage Δ
#multiple 88.63% <100%> (ø) ⬆️
#single 40.15% <0%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/io/parsers.py 95.66% <100%> (ø) ⬆️
pandas/core/indexes/datetimes.py 95.33% <0%> (+0.09%) ⬆️
pandas/compat/__init__.py 62.22% <0%> (+0.44%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b0d9ee0...7c5f2c4.

Previously had an UnboundLocalError - no fun!
@jtratner jtratner force-pushed the fix-unbound-local-with-bad-engine branch from 26a3300 to 9a4f1a7 Compare May 26, 2017 15:38
@@ -969,6 +969,9 @@ def _make_engine(self, engine='c'):
            klass = PythonParser
        elif engine == 'python-fwf':
            klass = FixedWidthFieldParser
        else:
            raise ValueError('Unknown engine: %r (valid are "c", "python",'
Member

How about valid options are... instead of valid are...

@gfyoung
Member

gfyoung commented May 26, 2017

@jtratner : Thanks for this! Seems like we need some more fuzzy-testing for read_csv...

One minor comment about the actual error message, and two bigger ones regarding the actual test.

@@ -39,6 +39,9 @@ Bug Fixes

- Bug in using ``pathlib.Path`` or ``py.path.local`` objects with io functions (:issue:`16291`)
- Bug in ``DataFrame.update()`` with ``overwrite=False`` and ``NaN values`` (:issue:`15593`)
- Passing an invalid engine to :func:`read_csv` now raises an informative
ValueError rather than UnboundLocalError. (:issue:`16511`)
Contributor

double backticks on ValueError and UnboundLocalError

@@ -969,6 +969,9 @@ def _make_engine(self, engine='c'):
            klass = PythonParser
        elif engine == 'python-fwf':
            klass = FixedWidthFieldParser
        else:
            raise ValueError('Unknown engine: %r (valid are "c", "python",'
                             ' or "python-fwf")' % engine)
Contributor

can you use .format(...)
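
Taken together, the review comments on this hunk ask for an explicit else branch, the wording "valid options are", and `str.format` instead of `%`. A sketch of what that could look like is below. The classes and `make_engine` are hypothetical stand-ins, and the exact message that was finally merged is not shown in this thread, so the string here is an assumption.

```python
class CParser(object):
    """Hypothetical stand-in for the C-engine parser class."""


class PythonParser(object):
    """Hypothetical stand-in for the Python-engine parser class."""


class FixedWidthFieldParser(object):
    """Hypothetical stand-in for the fixed-width parser class."""


def make_engine(engine='c'):
    if engine == 'c':
        klass = CParser
    elif engine == 'python':
        klass = PythonParser
    elif engine == 'python-fwf':
        klass = FixedWidthFieldParser
    else:
        # `!r` keeps the quoting that `%r` produced in the original message.
        raise ValueError(
            'Unknown engine: {engine!r} (valid options are "c", "python", '
            'or "python-fwf")'.format(engine=engine))
    return klass()


message = None
try:
    make_engine('pyt')
except ValueError as exc:
    message = str(exc)
    print(message)
```

Raising in the else branch means every path either assigns `klass` or exits, so the unassigned-name case can no longer be reached.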

@jtratner
Contributor Author

okay, covered everybody's comments and moved tests again

@jtratner
Contributor Author

@gfyoung @jreback - if either of you have a moment to look - all tests are green and I've made your changes.

@gfyoung
Member

gfyoung commented May 31, 2017

@jtratner : LGTM!

@jreback jreback added this to the 0.20.2 milestone May 31, 2017
@jreback jreback merged commit 9b0ea41 into pandas-dev:master May 31, 2017
@jreback
Contributor

jreback commented May 31, 2017

thanks!

@jtratner jtratner deleted the fix-unbound-local-with-bad-engine branch May 31, 2017 13:56
TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this pull request Jun 1, 2017
TomAugspurger pushed a commit that referenced this pull request Jun 4, 2017
Kiv pushed a commit to Kiv/pandas that referenced this pull request Jun 11, 2017
stangirala pushed a commit to stangirala/pandas that referenced this pull request Jun 11, 2017
4 participants