Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

io/parsers: ensure decimal is str on PythonParser #29743

Merged
merged 5 commits into from
Nov 22, 2019
Merged

io/parsers: ensure decimal is str on PythonParser #29743

merged 5 commits into from
Nov 22, 2019

Conversation

fsouza
Copy link
Contributor

@fsouza fsouza commented Nov 20, 2019

@fsouza fsouza marked this pull request as ready for review November 20, 2019 15:59
Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Looks pretty good some minor edits

CI failure is unrelated if you merge master should be resolved

@@ -2344,6 +2344,9 @@ def __init__(self, f, **kwds):
if len(self.decimal) != 1:
raise ValueError("Only length-1 decimal markers supported")

if isinstance(self.decimal, bytes):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you decode before assigning to decimal instead?

@pytest.mark.parametrize("add_footer", [True, False])
def test_skipfooter_with_decimal(python_parser_only, add_footer):
@pytest.mark.parametrize(
"add_footer, decimal", [(True, "#"), (False, "#"), (True, b"#"), (False, b"#")],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"add_footer, decimal", [(True, "#"), (False, "#"), (True, b"#"), (False, b"#")],
@pytest.mark.parametrize("add_footer", [True, False])
@pytest.mark.parametrize("decimal", ["#", b"#"])

Can just stack the decorates instead of manually creating tuples

@WillAyd
Copy link
Member

WillAyd commented Nov 20, 2019

Also can you add a whatsnew note for v1.0.0?

@WillAyd WillAyd added the IO CSV read_csv, to_csv label Nov 20, 2019
@fsouza
Copy link
Contributor Author

fsouza commented Nov 20, 2019

@WillAyd thanks for the feedback! 🎉

Just pushed the changes you suggested and rebased master, please take another look.

@@ -2276,7 +2276,11 @@ def __init__(self, f, **kwds):

self.dtype = kwds["dtype"]
self.thousands = kwds["thousands"]
self.decimal = kwds["decimal"]

if isinstance(kwds["decimal"], bytes):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't accept bytes as parameters, they must be strings, this is just a mistake.

Copy link
Contributor Author

@fsouza fsouza Nov 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh in that case the fix is just the one in line 491? x)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, decimal is almost always bytes:

decimal=b".",

I agree it's a mistake though, that's why I opened #29653 (and we internally have a workaround where we patch pandas so we can proceed with our Python 3 migration).

Let me know if I should just go ahead and drop support for bytes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea just go ahead and remove this block

@@ -2276,7 +2276,11 @@ def __init__(self, f, **kwds):

self.dtype = kwds["dtype"]
self.thousands = kwds["thousands"]
self.decimal = kwds["decimal"]

if isinstance(kwds["decimal"], bytes):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea just go ahead and remove this block

@@ -568,7 +568,7 @@ def parser_f(
# Quoting, Compression, and File Format
compression="infer",
thousands=None,
decimal=b".",
decimal=".",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you annotate this while modifying? Should just be

decimal: str = "."

@WillAyd WillAyd added this to the 1.0 milestone Nov 21, 2019
@fsouza
Copy link
Contributor Author

fsouza commented Nov 21, 2019

ptal @WillAyd @jreback

Thanks for the feedback!

@jreback jreback merged commit f93e4df into pandas-dev:master Nov 22, 2019
@jreback
Copy link
Contributor

jreback commented Nov 22, 2019

thanks @fsouza

@fsouza
Copy link
Contributor Author

fsouza commented Nov 22, 2019

🎉

keechongtan added a commit to keechongtan/pandas that referenced this pull request Nov 25, 2019
…ndexing-1row-df

* upstream/master: (185 commits)
  ENH: add BooleanArray extension array (pandas-dev#29555)
  DOC: Add link to dev calendar and meeting notes (pandas-dev#29737)
  ENH: Add built-in function for Styler to format the text displayed for missing values (pandas-dev#29118)
  DEPR: remove statsmodels/seaborn compat shims (pandas-dev#29822)
  DEPR: remove Index.summary (pandas-dev#29807)
  DEPR: passing an int to read_excel use_cols (pandas-dev#29795)
  STY: fstrings in io.pytables (pandas-dev#29758)
  BUG: Fix melt with mixed int/str columns (pandas-dev#29792)
  TST: add test for ffill/bfill for non unique multilevel (pandas-dev#29763)
  Changed description of parse_dates in read_excel(). (pandas-dev#29796)
  BUG: pivot_table not returning correct type when margin=True and aggfunc='mean'  (pandas-dev#28248)
  REF: Create _lib/window directory (pandas-dev#29817)
  Fixed small mistake (pandas-dev#29815)
  minor cleanups (pandas-dev#29798)
  DEPR: enforce deprecations in core.internals (pandas-dev#29723)
  add test for unused level raises KeyError (pandas-dev#29760)
  Add documentation linking to sqlalchemy (pandas-dev#29373)
  io/parsers: ensure decimal is str on PythonParser (pandas-dev#29743)
  Reenabled no-unused-function (pandas-dev#29767)
  CLN:F-string in pandas/_libs/tslibs/*.pyx (pandas-dev#29775)
  ...

# Conflicts:
#	pandas/tests/frame/indexing/test_indexing.py
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO CSV read_csv, to_csv
Projects
None yet
Development

Successfully merging this pull request may close these issues.

read_fwf raises BytesWarning when running with -bb
3 participants