pl.read_csv with string content (not bytes) outputs whole content in exception message #6853

Closed
2 tasks done
2-5 opened this issue Feb 13, 2023 · 0 comments · Fixed by #6917
Labels
bug (Something isn't working), python (Related to Python Polars)

Comments


2-5 commented Feb 13, 2023

Polars version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of Polars.

Issue description

If you pass a str to pl.read_csv, it is treated as the path to a CSV file. If you pass bytes, it is treated as the CSV content itself.

A common user mistake is to pass a str containing the CSV content instead of bytes. Polars then assumes the str is a path and echoes it back in full in the exception message. If the str is large (megabytes), the actual problem becomes very hard to diagnose, because the beginning of the error message is no longer visible.
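Until the message is trimmed, a caller-side guard can avoid the mistake. The following is only a minimal sketch: the helper name `as_csv_source` and its newline heuristic are assumptions made for illustration and are not part of Polars. It relies solely on the behavior described above, that bytes are treated as content and a str as a path.

```python
from pathlib import Path


def as_csv_source(data):
    """Return a value that is safe to hand to a reader that treats str as a path.

    Hypothetical helper (not a Polars API):
    - bytes and Path objects pass through unchanged,
    - a str containing a line break is assumed to be CSV content and is encoded,
    - any other str is assumed to be a file path and is returned as-is.
    """
    if isinstance(data, (bytes, Path)):
        return data
    if isinstance(data, str):
        if "\n" in data or "\r" in data:
            # A path practically never contains a line break, so treat as content.
            return data.encode("utf-8")
        return data
    raise TypeError(f"unsupported source type: {type(data).__name__}")


content = "\n".join(str(i) for i in range(1000))
source = as_csv_source(content)
print(type(source).__name__)
```

The result could then be passed to pl.read_csv in place of the raw str. The heuristic misclassifies single-line content with no line break, so it is a convenience, not a substitute for a shorter error message.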

Reproducible example

import polars as pl

content = "\n".join(
    str(i) for i in range(1000)
)
# content = content.encode("ascii")  # uncomment to pass bytes, which are treated as content

df = pl.read_csv(content)
print(df)

Outputs:

Traceback (most recent call last):
  File "test14.py", line 8, in <module>
    df = pl.read_csv(content)
  File ".venv\lib\site-packages\polars\io.py", line 377, in read_csv
    df = DataFrame._read_csv(
  File ".venv\lib\site-packages\polars\internals\dataframe\frame.py", line 762, in _read_csv
    self._df = PyDataFrame.read_csv(
FileNotFoundError: No such file or directory: 0
1
2
3
4
5
6
7
8
9
...
999
1000

Expected behavior

Polars should trim the assumed path in the pl.read_csv error message to a sane length (e.g. 1000 chars, followed by ...).
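The requested trimming could look roughly like the sketch below. The function name `truncate_for_message` and the default limit are assumptions taken from the suggestion above; this is not how Polars implements its error messages, only an illustration of the idea.

```python
def truncate_for_message(text, limit=1000):
    """Shorten text for use in an error message.

    Hypothetical helper: keeps at most `limit` characters and appends "..."
    only when something was actually cut off.
    """
    text = str(text)
    if len(text) <= limit:
        return text
    return text[:limit] + "..."


# Same content as in the reproducible example.
content = "\n".join(str(i) for i in range(1000))

# A short limit is used here only to keep the printed output small.
message = "No such file or directory: " + truncate_for_message(content, limit=20)
print(message)
```

With a cap like this, the exception type and the start of the message stay visible even when the str passed in is megabytes long.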

Installed versions

---Version info---
Polars: 0.16.4
Index type: UInt32
Platform: Windows-10-10.0.19045-SP0
Python: 3.10.8 (tags/v3.10.8:aaaf517, Oct 11 2022, 16:50:30) [MSC v.1933 64 bit (AMD64)]
---Optional dependencies---
pyarrow: 11.0.0
pandas: 1.5.3
numpy: 1.23.5
fsspec: 2022.11.0
connectorx: <not installed>
xlsx2csv: <not installed>
deltalake: <not installed>
matplotlib: 3.6.3

2-5 added the bug and python labels Feb 13, 2023
2-5 changed the title from "pl.scan_csv with string content (not bytes) outputs whole content in exception message" to "pl.read_csv with string content (not bytes) outputs whole content in exception message" Feb 13, 2023