read_parquet from multiple files #2794

Closed
Gabriel-ROBIN opened this issue Feb 28, 2022 · 5 comments

Comments

@Gabriel-ROBIN

What language are you using?

Python 3.8

What version of polars are you using?

0.13.4

What operating system are you using polars on?

Redhat 7.5

Describe your bug.

From the docs, it looks like it is possible to read a list of Parquet files. I could not make it work.

What are the steps to reproduce the behavior?

Example

import polars as pl

df1 = pl.DataFrame({"bar": [1, 2, 3], "foo": [3, 2, 1]})
df2 = pl.DataFrame({"bar": [4, 5, 6], "foo": [3, 2, 1]})
df1.to_parquet('./abc.parquet')
df2.to_parquet('./def.parquet')

pl.read_parquet([
    './abc.parquet',
    './def.parquet'
])

What is the actual behavior?

Fails with
TypeError: Object does not have a .read() method.

What is the expected behavior?

abc.parquet and def.parquet are read and concatenated into a single DataFrame.

@ritchie46
Member

You can only read multiple files at once using glob patterns. E.g.

pl.read_parquet(
    './*.parquet',
)

If you want to read without glob patterns, you need to call read_parquet separately for each file and concat the DataFrames.

@Gabriel-ROBIN
Author

OK, that's what I understood. But in that case there is a mistake in the docstring, right? It says that source can be a List[str]:

pl.read_parquet(
    source: Union[str, List[str], pathlib.Path, BinaryIO, _io.BytesIO, bytes],
    columns: Union[List[int], List[str], NoneType] = None,
    ...
) -> polars.internals.frame.DataFrame

@ritchie46
Member

Ah, check. Thanks for pointing that out.

@ritchie46
Member

Fixed

@alf239

alf239 commented Jan 4, 2024

Just for the next person who bumps into this ticket: reading from a list of files has now been added in #13044.
