-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Allow IOErrors when attempting to retrieve default client encoding. #21531
BUG: Allow IOErrors when attempting to retrieve default client encoding. #21531
Conversation
…ing. instead of a more specific AttributeError to include the thrown IOError.
|
|
OK thanks - if you could open an issue and reference it then that would be preferable (typically PRs are only allowed without an issue if they are DOC updates). You'll also want a whatsnew note to describe the bug and reference the issue. Next minor release is 0.23.2 at this point |
Codecov Report
@@ Coverage Diff @@
## master #21531 +/- ##
=========================================
Coverage ? 92.18%
=========================================
Files ? 169
Lines ? 50739
Branches ? 0
=========================================
Hits ? 46774
Misses ? 3965
Partials ? 0
Continue to review full report at Codecov.
|
pandas/io/formats/console.py
Outdated
@@ -21,7 +21,7 @@ def detect_console_encoding(): | |||
encoding = None | |||
try: | |||
encoding = sys.stdout.encoding or sys.stdin.encoding | |||
except AttributeError: | |||
except Exception: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't we just except
both on AttributeError
and IOError
? I generally try to avoid using the broad Exception
unless we have good reason to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks ok, getting to pass travis might be a bit tricky.
assert detect_console_encoding() == 'foo' | ||
|
||
|
||
def test_detect_console_encoding_stdin(monkeypatch): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add some comments on what is being tested here.
doc/source/whatsnew/v0.23.2.txt
Outdated
@@ -65,7 +65,7 @@ Bug Fixes | |||
**I/O** | |||
|
|||
- Bug in :func:`read_csv` that caused it to incorrectly raise an error when ``nrows=0``, ``low_memory=True``, and ``index_col`` was not ``None`` (:issue:`21141`) | |||
- | |||
- Bug in :meth:`detect_client_encoding` where potential IOError goes unhandled when importing in a mod_wsgi process due to restricted access to stdout. (:issue:`21552`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
double backticks on IOError
…JayOfferdahl/pandas into get-client-encoding-on-mod-wsgi
Hello @JayOfferdahl! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on August 23, 2018 at 03:18 Hours UTC |
Thanks for the feedback, made some changes to tests to cover more cases and provide some insight as to what they're doing. Is there anything special I need to do to get this working with travis? |
looks like it might be tricky to monkeypatch how you are doing it: https://travis-ci.org/pandas-dev/pandas/jobs/394566147 does this test run locally? |
Interesting...it runs locally just fine...I'll probably need to patch stdout/stdin like the other tests and provide the property on that patched object. |
I've updated the test to mock out stdout/stdin instead of trying to change their encoding attribute. I do the same in other tests so I'm assuming this will work fine with travis. Apologies for the slow turn around on this one! |
super(MockEncoding, self).__init__() | ||
self.encoding = encoding | ||
|
||
monkeypatch.setattr('sys.{}'.format(empty), MockEncoding('')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any chance you saw the context manager for monkeypatch? Seems to be preferred for patching standard library items (may require a version bump for pytest dependency).
@WillAyd I've taken a look at the context manager and have updated these tests. Looks like I missed the 0.23.2 release...my own fault, of course. Should I put my note in the 0.24 whatsnew document? Not sure how you all handle bugs in non-minor release versions. Side note: I did have to upgrade pytest to 3.6.3, I'm almost certain that this will break travis since it won't have that version (maybe I'm wrong) -- where does this get specified? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JayOfferdahl no worries on timing - we tentatively have a 0.23.3 bug fix release that you can move the whatsnew entry to. Looks like you also have some LINTing errors to address
@jreback do you have any opposition to increasing the min version of pytest to 3.6? It's a relatively new release but the context manager for monkey patching I think is a useful and clean feature
['stdout', 'stdin'] | ||
]) | ||
def test_detect_console_encoding_from_stdout_stdin(monkeypatch, empty, filled): | ||
# Ensures that when sys.stdout.encoding or sys.stdin.encoding is used when |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can add a reference to the issue for this and the other new tests, i.e. # GH 21552
lambda: 'ascii' | ||
]) | ||
def test_detect_console_encoding_fallback_to_locale(monkeypatch, stdEncoding): | ||
monkeypatch.setattr('locale.getpreferredencoding', lambda: 'foo') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can put this in the context manager, no?
no problem bumping pytest version (separate PR) |
Inline with the class name gave the same error...ended up putting this in the docstring. This should still be just as easy to discover, I assume. |
|
||
class MockEncoding(object): | ||
""" | ||
TODO(py27): replace with mock |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't see the LINT errors before but maybe you got it mixed up with the one I've commented on below? I still think inline with the class
block opening would be preferable. Certainly a nit, but I haven't seen TODOs embedded in docstrings like this before
@property | ||
def encoding(self): | ||
return raiseOrReturn(self.val) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current LINT error is because you are missing a blank line here
def encoding(self): | ||
return raiseOrReturn(self.val) | ||
|
||
def raiseOrReturn(val): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer PEP8 naming, so let's call this raise_or_return
assert detect_console_encoding() == filled | ||
|
||
|
||
@pytest.mark.parametrize('stdEncoding', [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PEP8 so std_encoding
or simply encoding
* It might be apparent what I styleguide I use!!
Changes made. I was using the regular pylint package and not flake8, which explains the difference. Apologies, and thanks for spending so much time on this, I really appreciate it. |
return raise_or_return(self.val) | ||
|
||
|
||
def raise_or_return(val): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be moved as an instance method of MockEncoding
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also using it down below to test when defaulting to system default -- it covers the same goal of throwing an exception like above here. I could make it a static method of MockEncoding, but I thought using that object for both cases wouldn't make sense since the encoding property is only being consumed from accesses by stdout or stdin. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WillAyd did you see this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make this a class method just so all of the relevant constructs stay together
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 minor comments
return raise_or_return(self.val) | ||
|
||
|
||
def raise_or_return(val): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make this a class method just so all of the relevant constructs stay together
|
||
@pytest.mark.parametrize('std,locale', [ | ||
[MockEncoding('ascii'), lambda: 'ascii'], | ||
[MockEncoding('ascii'), lambda: raise_or_return(Exception)], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of mixing usage of raise_or_return
here for each parameter just do it in the body of the function; should work for all parameters since in case of a string it returns the string anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm so long as it goes green. @jreback
@JayOfferdahl can you rebase? |
Of course, is the general work flow to just squash all of my changes down into one commit and rebase on top of master or should I merge master into this? |
Merge master in |
doc/source/whatsnew/v0.24.0.txt
Outdated
@@ -673,7 +673,7 @@ I/O | |||
|
|||
- :func:`read_html()` no longer ignores all-whitespace ``<tr>`` within ``<thead>`` when considering the ``skiprows`` and ``header`` arguments. Previously, users had to decrease their ``header`` and ``skiprows`` values on such tables to work around the issue. (:issue:`21641`) | |||
- :func:`read_excel()` will correctly show the deprecation warning for previously deprecated ``sheetname`` (:issue:`17994`) | |||
- | |||
- Bug in :meth:`detect_client_encoding` where potential ``IOError`` goes unhandled when importing in a mod_wsgi process due to restricted access to stdout. (:issue:`21552`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe move to 0.23.5? @jreback
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a regression? If not, I would probably stay with 0.24.0
unless there is a compelling reason to backport-patch this bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm @jreback
Any update here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry that this fell through the cracks! This looks good to merge from me as well.
cc @jreback
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we have other console tests that can go in here?
…JayOfferdahl/pandas into get-client-encoding-on-mod-wsgi
Thanks @JayOfferdahl ! |
Thank you, and thanks for the help throughout the process! |
When using mod_wsgi, access to sys.stdout is restricted by default. To handle this case, catch a base exception instead of a more specific AttributeError during this process to include the thrown IOError by mod_wsgi.