Enable optimized Parquet writer by default in Delta Lake #12757
Looking into the CI failure.
1a0659e to 52a9a4d
The code changes look OK.
Do the Databricks product tests run automatically?
Could you also add some rough benchmark results confirming the improvement with the new writer?
The new writer is an improvement regardless of perf numbers, because it avoids JNI for compression and thus avoids OOMs due to GCLocker.
They still don't exist; @findinpath and @alexjo2144 are working on this.
Is there a reason to keep the option to use the old writer?
Just in case. We can remove it later.
52a9a4d to 4d31bb6
Thank you!
Description
Enable optimized Parquet writer by default in Delta Lake
Documentation
(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.
Release notes
( ) No release notes entries required.
(x) Release notes entries required with the following suggested text: