[FEA][Spark 3.5.0] Add support for lz4raw compression in Parquet files #9156

Open
andygrove opened this issue Aug 31, 2023 · 2 comments
Labels: audit_3.5.0, cudf_dependency (an issue or PR with this label depends on a new feature in cudf), feature request

Comments

@andygrove
Contributor

Is your feature request related to a problem? Please describe.
Spark 3.5.0 adds support for lz4raw compression in Parquet files. cuDF does not support this codec, so we cannot support lz4raw-compressed Parquet files on the GPU.

Describe the solution you'd like
Add support for the lz4raw compression codec.

Describe alternatives you've considered

Additional context
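For reference, Spark picks the Parquet codec from the session setting `spark.sql.parquet.compression.codec` (default `snappy`) or from the per-write `compression` option, and the write option wins when both are set. Below is a minimal sketch of the kind of check that could flag this limitation. The helper `parquet_codec_needs_cpu` and the constant `GPU_UNSUPPORTED_PARQUET_CODECS` are hypothetical names, not part of Spark or the plugin, and the set contains only `lz4raw` because that is the only codec this issue covers. The sketch also ignores the Hadoop-level `parquet.compression` setting for simplicity.

```python
from typing import Mapping, Optional

# Real Spark SQL setting that selects the Parquet compression codec.
PARQUET_CODEC_CONF = "spark.sql.parquet.compression.codec"

# Assumption for illustration: per this issue, cuDF cannot handle lz4raw.
GPU_UNSUPPORTED_PARQUET_CODECS = frozenset({"lz4raw"})


def parquet_codec_needs_cpu(
    conf: Mapping[str, str],
    write_options: Optional[Mapping[str, str]] = None,
) -> bool:
    """Hypothetical helper: True if the effective codec is unsupported on GPU."""
    # Option keys are matched case-insensitively in this sketch.
    options = {k.lower(): v for k, v in (write_options or {}).items()}
    # The per-write "compression" option takes precedence over the session conf.
    codec = options.get("compression", conf.get(PARQUET_CODEC_CONF, "snappy"))
    return codec.lower() in GPU_UNSUPPORTED_PARQUET_CODECS
```

With a session configured as `{"spark.sql.parquet.compression.codec": "lz4raw"}` the helper reports that the operation needs the CPU, while a write that overrides the codec with `compression=zstd` does not.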

@andygrove added the feature request, ? - Needs Triage, and audit_3.5.0 labels Aug 31, 2023
@mattahrens added the cudf_dependency label Sep 6, 2023
@mattahrens
Collaborator

We need to document the limitation for lz4raw compression and file issues with downstream teams to request support.

Ref: https://issues.apache.org/jira/browse/PARQUET-2032.

@razajafri
Collaborator

Added an issue in the cuda-hpc-libraries project on gitlab-master.
