Embed image/audio data in dl_and_prepare parquet #4987

Merged
lhoestq merged 1 commit into main from embed-image-and-audio-in-dl_and_pp-parquet on Sep 16, 2022

Conversation

lhoestq
Member

@lhoestq lhoestq commented Sep 16, 2022

Embed the bytes of image and audio files directly in the Parquet files, instead of storing a "path" that points to a local file.

Parquet files are often shared or read by workers that may not have access to those local files, in which case a path alone cannot be resolved.
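
To illustrate the idea, here is a minimal standard-library sketch. It is not the code from this PR: the helper name `embed_file_bytes` is hypothetical, and the `{"bytes", "path"}` record layout is an assumption about how an image or audio value is stored. The point is only that a record referencing a local file is turned into one that carries the file content itself.

```python
import os
import tempfile


def embed_file_bytes(example: dict) -> dict:
    """Return a copy of a {"bytes", "path"} record with the file content embedded.

    Hypothetical helper for illustration only; the record layout is an assumption.
    """
    if example.get("bytes") is None and example.get("path") is not None:
        with open(example["path"], "rb") as f:
            data = f.read()
        # Keep only the file name: the absolute local path is meaningless
        # to a reader on another machine.
        return {"bytes": data, "path": os.path.basename(example["path"])}
    # Bytes are already present (or there is nothing to read): leave as-is.
    return dict(example)


# Usage: create a throwaway file, then embed its content into the record.
with tempfile.TemporaryDirectory() as tmp_dir:
    local_path = os.path.join(tmp_dir, "sample.png")
    with open(local_path, "wb") as f:
        f.write(b"\x89PNG fake image payload")

    embedded = embed_file_bytes({"bytes": None, "path": local_path})

# The temporary file is gone here, but the record is still usable.
print(embedded["path"], len(embedded["bytes"]))
```

Once the bytes are embedded, the record no longer depends on the original file existing, which is what a worker without access to the local files needs.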

Collaborator

@mariosasko mariosasko left a comment

Thanks! LGTM!

@lhoestq lhoestq merged commit 81d5900 into main Sep 16, 2022
@lhoestq lhoestq deleted the embed-image-and-audio-in-dl_and_pp-parquet branch September 16, 2022 16:22