Store tables per default as parquet files #452

Merged
merged 1 commit into from
Jul 12, 2024


hagenw (Member) commented Jul 12, 2024

Since audb>=1.8.0 supports storing tables as parquet files, and the first databases using them have been published, I propose switching to parquet as the default table storage format in the next audformat release.

This pull request changes the default values of the storage_format argument in audformat.Database.save() and audformat.Table.save() to "parquet".


@hagenw hagenw marked this pull request as draft July 12, 2024 13:43
@hagenw hagenw marked this pull request as ready for review July 12, 2024 13:49
hagenw (Member Author) commented Jul 12, 2024

It seems our tests were already well prepared for this step, as I didn't have to change anything.

ChristianGeng (Member) left a comment

This basically changes the default table storage format to the newer parquet in two locations:

  • at the level of the database
  • at the level of the table

I do not know where Table.save() is called other than by Database.save(), so I cannot judge whether this change is functionally necessary. Grepping the source code of this repo and of the audb repo suggests that it is never called anywhere else.

So my understanding is that one could get away without defaulting to parquet in Table - still it is better at the level of API consistency.

I will approve this right away.

hagenw (Member Author) commented Jul 12, 2024

> So my understanding is that one could get away without defaulting to parquet in Table - still it is better at the level of API consistency.

Exactly, we don't have to change it in audformat.Table.save(), but I also thought that it would be more consistent this way.

@hagenw hagenw merged commit 0f0b069 into main Jul 12, 2024
10 checks passed
@hagenw hagenw deleted the parquet-default branch July 12, 2024 16:22