export_to_parquet.py: custom input support #26

Open
benoistlaurent opened this issue Aug 18, 2023 · 0 comments

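Input files and directories are hard-coded in export_to_parquet.py.
This is not handy when the input files were written to tmp: the script only finds them under data/, relative to the current working directory.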

❯❯  python scripts/export_to_parquet.py
2023-08-18 16:14:50 INFO /Users/laurent/Downloads/mdws/scripts/export_to_parquet.py
2023-08-18 16:14:50 INFO Convert datasets to parquet format.
Traceback (most recent call last):
  File "/Users/laurent/Downloads/mdws/scripts/export_to_parquet.py", line 87, in <module>
    tmp_df = pd.read_csv(name, sep="\t", dtype={"dataset_id": str})
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/laurent/.virtualenvs/mdws/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 912, in read_csv
    return _read(filepath_or_buffer, kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/laurent/.virtualenvs/mdws/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 577, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/laurent/.virtualenvs/mdws/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 1407, in __init__
    self._engine = self._make_engine(f, self.engine)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/laurent/.virtualenvs/mdws/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 1661, in _make_engine
    self.handles = get_handle(
                   ^^^^^^^^^^^
  File "/Users/laurent/.virtualenvs/mdws/lib/python3.11/site-packages/pandas/io/common.py", line 859, in get_handle
    handle = open(
             ^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'data/zenodo_datasets.tsv'

Source code export_to_parquet.py#L85:

for repository in ["zenodo", "figshare", "osf"]:
    name = f"data/{repository}_datasets.tsv"
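
One possible way to lift the hard-coded path, sketched here as a suggestion rather than as the script's actual code: read the input directory from the command line with argparse and build the per-repository file names from it. The `--input-dir` flag and the `parse_args` / `dataset_paths` helpers are hypothetical names introduced for illustration; only the repository list and the `{repository}_datasets.tsv` naming come from the source above.

```python
import argparse
from pathlib import Path

# Repository names taken from the loop quoted above.
REPOSITORIES = ["zenodo", "figshare", "osf"]


def parse_args(argv=None):
    """Parse command-line options. `--input-dir` is a hypothetical flag."""
    parser = argparse.ArgumentParser(description="Convert datasets to parquet format.")
    parser.add_argument(
        "--input-dir",
        type=Path,
        default=Path("data"),  # default keeps the current hard-coded behaviour
        help="Directory containing the <repository>_datasets.tsv files.",
    )
    return parser.parse_args(argv)


def dataset_paths(input_dir):
    """Return one TSV path per repository, failing early if any file is missing."""
    paths = [input_dir / f"{repository}_datasets.tsv" for repository in REPOSITORIES]
    missing = [str(path) for path in paths if not path.is_file()]
    if missing:
        raise FileNotFoundError(f"Missing input files: {', '.join(missing)}")
    return paths
```

With something like this in place, the script could be called as `python scripts/export_to_parquet.py --input-dir tmp`, and the existing loop would iterate over `dataset_paths(args.input_dir)` and pass each path to `pd.read_csv(..., sep="\t", dtype={"dataset_id": str})` as it does today. Checking for missing files up front reports every absent file at once instead of stopping at the first one deep inside pandas.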