Support rescaled decimals in optimized parquet reader #15713

Merged
8 commits merged on Jan 26, 2023

Conversation

raunaqmorarka
Member

@raunaqmorarka raunaqmorarka commented Jan 13, 2023

Description

Support rescaled decimals in optimized parquet reader

Additional context and related issues

This completes support for flat column types in the optimized Parquet reader.
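For readers unfamiliar with the term, here is a minimal sketch of what rescaling a decimal involves, assuming "rescaled" means that the scale stored in the Parquet file differs from the scale of the column type being read. The class `DecimalRescaleSketch`, its method name, and the choice of `RoundingMode.HALF_UP` are assumptions made for illustration and are not taken from this PR.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;

// Hypothetical helper for illustration only; not the reader's actual code.
final class DecimalRescaleSketch
{
    private DecimalRescaleSketch() {}

    // Converts an unscaled value stored with fileScale into the unscaled
    // value for targetScale. Rounding half up is an assumption of this sketch.
    static BigInteger rescale(BigInteger unscaledValue, int fileScale, int targetScale, int targetPrecision)
    {
        BigDecimal rescaled = new BigDecimal(unscaledValue, fileScale)
                .setScale(targetScale, RoundingMode.HALF_UP);
        // Reject values that no longer fit in the target precision
        if (rescaled.precision() > targetPrecision) {
            throw new ArithmeticException(
                    "Value " + rescaled + " does not fit in decimal(" + targetPrecision + ", " + targetScale + ")");
        }
        return rescaled.unscaledValue();
    }
}
```

For example, under these assumptions the value 123.45 stored as decimal(5, 2) has unscaled value 12345, and reading it as decimal(10, 4) yields the unscaled value 1234500.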

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Hive, Hudi, Delta, Iceberg
* Improve performance of reading decimal columns from Parquet files. ({issue}`15713`)

raunaqmorarka and others added 2 commits January 20, 2023 12:53
Allow using the optimized plain-encoding decoder for BINARY.
Add tests for BINARY-backed decimals in AbstractTestParquetReader.
Co-authored-by: Raunaq Morarka <raunaqmorarka@gmail.com>
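As background for the BINARY-backed decimals mentioned in this commit: the Parquet format stores a DECIMAL annotated on BINARY as the unscaled value in big-endian two's complement form. The following is a minimal sketch of decoding such a value. The class `BinaryDecimalSketch` and its methods are hypothetical names for illustration and do not reflect the decoder added in this PR.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper for illustration only; not the decoder from this PR.
final class BinaryDecimalSketch
{
    private BinaryDecimalSketch() {}

    // Decodes up to 8 big-endian two's complement bytes into a long
    // holding the unscaled value.
    static long decodeShort(byte[] bytes)
    {
        if (bytes.length == 0 || bytes.length > Long.BYTES) {
            throw new IllegalArgumentException("Expected 1 to 8 bytes, got " + bytes.length);
        }
        // Widening the first byte to long sign-extends it
        long value = bytes[0];
        for (int i = 1; i < bytes.length; i++) {
            value = (value << 8) | (bytes[i] & 0xFF);
        }
        return value;
    }

    // Decodes a value of any length; BigInteger(byte[]) reads
    // big-endian two's complement bytes.
    static BigDecimal decode(byte[] bytes, int scale)
    {
        return new BigDecimal(new BigInteger(bytes), scale);
    }
}
```

With this sketch, the bytes `0x30 0x39` decode to the unscaled value 12345, which at scale 2 represents 123.45.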
raunaqmorarka and others added 6 commits January 20, 2023 19:17
Co-authored-by: Krzysztof Skrzypczynski <krzysztof.skrzypczynski@starburstdata.com>
Co-authored-by: Raunaq Morarka <raunaqmorarka@gmail.com>
Co-authored-by: Raunaq Morarka <raunaqmorarka@gmail.com>
Co-authored-by: Krzysztof Skrzypczynski <krzysztof.skrzypczynski@starburstdata.com>
RLE and BIT_PACKED are used only for definition/repetition levels,
dictionary IDs, and boolean values.
Since all the necessary types are expected to be supported now,
we fall back to the old column readers for flat column types only
when the optimized reader is explicitly disabled.
This ensures that any cases not supported by the optimized reader
are not hidden, while still keeping the old reader available as a
fallback behind the optimized Parquet reader flag.
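The fallback rule described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: the class, enum, and parameter names are hypothetical, the use of the old reader for non-flat columns is an assumption, and throwing an exception is one assumed way of making an unsupported case visible.

```java
// Hypothetical sketch of the reader selection rule; names are illustrative only.
final class ReaderSelectionSketch
{
    enum ReaderKind { OPTIMIZED, LEGACY }

    private ReaderSelectionSketch() {}

    static ReaderKind select(boolean optimizedReaderEnabled, boolean flatColumn, boolean supportedByOptimizedReader)
    {
        // Assumption: non-flat columns keep using the old reader
        if (!flatColumn) {
            return ReaderKind.LEGACY;
        }
        // The old reader is used for flat columns only when the flag is turned off
        if (!optimizedReaderEnabled) {
            return ReaderKind.LEGACY;
        }
        // Assumption: an unsupported case fails loudly instead of silently falling back
        if (!supportedByOptimizedReader) {
            throw new UnsupportedOperationException("Column not supported by the optimized reader");
        }
        return ReaderKind.OPTIMIZED;
    }
}
```

The point of this design is that a silent fallback would mask gaps in the optimized reader, whereas an explicit flag keeps the old code path reachable without hiding them.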
@raunaqmorarka raunaqmorarka merged commit 47c7e3b into trinodb:master Jan 26, 2023
@raunaqmorarka raunaqmorarka deleted the pqr-decimal-types branch January 26, 2023 09:43
@github-actions github-actions bot added this to the 407 milestone Jan 26, 2023

3 participants