Add Parquet decryption functionality to Presto #17881

Merged

@shangxinli (Collaborator) commented Jun 14, 2022

Co-authored-by: ggershinsky <ggershinsky@users.noreply.github.com>

Summary: This PR ports the Parquet decryption functionality from parquet-mr. The main parquet-mr commit for encryption/decryption is apache/parquet-java@65b95fb, followed by several other fixes. This change ports decryption only.

Test plan

This feature was tested in the Uber environment and has been running in production there for 2+ years.

== RELEASE NOTES ==

General Changes
* Add Parquet decryption functionality to Presto. Presto can now read Parquet files that are encrypted following [Parquet Modular Encryption](https://github.com/apache/parquet-format/blob/master/Encryption.md).

Hive Changes
* No flag is introduced. Presto-Hive now loads the DecryptionPropertiesFactory (implemented in parquet-mr), uses it to get the file decryptor, and passes that decryptor to presto-parquet.
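
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of the "load a factory named in the configuration, then ask it for per-file decryption material" flow described in the Hive change. It is not the code in this PR and does not use the parquet-mr or Presto APIs: `SketchDecryptionFactory`, `StaticKeyFactory`, `DecryptionFactorySketch`, and the `sketch.*` property names are all hypothetical stand-ins, and a plain `Map` replaces the Hadoop configuration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a decryption-properties factory; NOT the parquet-mr interface.
interface SketchDecryptionFactory {
    // Returns a key id for the given file, or null if no key applies to it.
    String keyFor(Map<String, String> conf, String filePath);
}

// Hypothetical implementation that reads a single key id from the configuration.
class StaticKeyFactory implements SketchDecryptionFactory {
    @Override
    public String keyFor(Map<String, String> conf, String filePath) {
        return conf.get("sketch.key.id");
    }
}

class DecryptionFactorySketch {
    // Property name assumed for this sketch only.
    static final String FACTORY_CLASS_PROPERTY = "sketch.crypto.factory.class";

    // Instantiates the factory class named in the configuration via reflection.
    // Returns null when no factory is configured, so unencrypted reads need no flag.
    static SketchDecryptionFactory loadFactory(Map<String, String> conf) {
        String className = conf.get(FACTORY_CLASS_PROPERTY);
        if (className == null) {
            return null;
        }
        try {
            Class<?> clazz = Class.forName(className);
            return (SketchDecryptionFactory) clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Cannot instantiate factory " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(FACTORY_CLASS_PROPERTY, StaticKeyFactory.class.getName());
        conf.put("sketch.key.id", "k1");

        // The caller (here standing in for the Hive connector) loads the factory
        // and hands the result to the reader layer.
        SketchDecryptionFactory factory = loadFactory(conf);
        System.out.println(factory.keyFor(conf, "/tmp/example.parquet"));
    }
}
```

In this sketch, returning null when the property is absent mirrors the "no flag is introduced" note: configuring a factory class is the only switch.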

@shangxinli shangxinli requested a review from a team as a code owner June 14, 2022 14:41
@shangxinli shangxinli requested a review from presto-oss June 14, 2022 14:41
Co-authored-by: ggershinsky <ggershinsky@users.noreply.github.com>

Summary: This is to port parquet-mr decryption apache/parquet-java@65b95fb
@shangxinli shangxinli force-pushed the column_indexes_dev_new_4_rebase.new.new branch from 057eee3 to 7bf6a2b Compare June 14, 2022 19:35
@shangxinli (Collaborator, Author) commented Jun 14, 2022

This PR replaces the old PR #17791. The conflict with HudiParquetPageSource was resolved manually; all other changes were applied with 'git cherry-pick' without any conflict.

The path to this PR is: #17479 -> #17728 -> #17791 -> #17881 (this PR). To see all the review comments, please read the comments on each of those PRs.

@zhenxiao (Collaborator) left a comment

nice work, @shangxinli

@zhenxiao (Collaborator) commented

@kewang1024 @beinan this PR looks good.
Following your approvals in #17791, I plan to merge this PR soon. Let me know if you have any new comments about this PR.

@zhenxiao (Collaborator) commented

based on approvals in #17791, I am merging this PR

@zhenxiao zhenxiao merged commit 71dc62a into prestodb:master Jun 16, 2022
@shangxinli shangxinli deleted the column_indexes_dev_new_4_rebase.new.new branch June 17, 2022 17:58
@highker highker mentioned this pull request Jul 6, 2022