Account for memory usage of parquet column readers #15554

Merged
merged 2 commits into from
Jan 4, 2023

Conversation

raunaqmorarka
Member

@raunaqmorarka raunaqmorarka commented Dec 29, 2022

Description

Added memory usage accounting to the batched column readers.
The accounting covers the dictionary retained for decoding
dictionary-encoded data pages and the additional memory held by
decompressed values from data pages.
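
A minimal sketch of the idea described above, not the code from this PR: a reader reports its retained bytes (dictionary plus the current decompressed page) to a memory context whenever either one changes. `SimpleMemoryContext` and `TrackedColumnReader` are hypothetical names invented for illustration and are not Trino classes, and array header overhead is ignored for simplicity.

```java
/** Hypothetical stand-in for a memory context; not Trino's actual API. */
final class SimpleMemoryContext {
    private long bytes;

    void setBytes(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes is negative");
        }
        this.bytes = bytes;
    }

    long getBytes() {
        return bytes;
    }
}

/** Hypothetical column reader that reports the memory it retains. */
final class TrackedColumnReader {
    private final SimpleMemoryContext memoryContext;
    // Retained for as long as dictionary-encoded data pages are being decoded
    private long[] dictionary = new long[0];
    // Decompressed values of the data page currently being read
    private byte[] decompressedPage = new byte[0];

    TrackedColumnReader(SimpleMemoryContext memoryContext) {
        this.memoryContext = memoryContext;
    }

    void setDictionary(long[] dictionary) {
        this.dictionary = dictionary;
        updateMemoryUsage();
    }

    void readPage(byte[] decompressedValues) {
        // The previous page buffer is dropped, so only the new one is counted
        this.decompressedPage = decompressedValues;
        updateMemoryUsage();
    }

    void close() {
        dictionary = new long[0];
        decompressedPage = new byte[0];
        updateMemoryUsage();
    }

    private void updateMemoryUsage() {
        // Payload sizes only; object and array headers are deliberately ignored
        long retained = (long) dictionary.length * Long.BYTES + decompressedPage.length;
        memoryContext.setBytes(retained);
    }
}

final class ColumnReaderMemorySketch {
    public static void main(String[] args) {
        SimpleMemoryContext context = new SimpleMemoryContext();
        TrackedColumnReader reader = new TrackedColumnReader(context);

        reader.setDictionary(new long[1000]);
        System.out.println(context.getBytes()); // 8000

        reader.readPage(new byte[4096]);
        System.out.println(context.getBytes()); // 12096

        reader.readPage(new byte[1024]);
        System.out.println(context.getBytes()); // 9024

        reader.close();
        System.out.println(context.getBytes()); // 0
    }
}
```

Setting an absolute byte count on every change, rather than adding and subtracting deltas, keeps the reported figure from drifting if an update is missed.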

Additional context and related issues

Fixes #10061

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Hive, Hudi, Iceberg, Delta
* Improve accounting of memory usage by Parquet reader. ({issue}`15554`)

@cla-bot cla-bot bot added the cla-signed label Dec 29, 2022
@raunaqmorarka raunaqmorarka requested review from skrzypo987, lukasz-stec and sopel39 and removed request for skrzypo987 December 29, 2022 15:35
Member

@lukasz-stec lukasz-stec left a comment

lgtm

Member

@sopel39 sopel39 left a comment

lgtm % comments

@raunaqmorarka raunaqmorarka merged commit 6662bc3 into trinodb:master Jan 4, 2023
@raunaqmorarka raunaqmorarka deleted the pqr-mem branch January 4, 2023 13:08
@github-actions github-actions bot added this to the 406 milestone Jan 4, 2023
Development

Successfully merging this pull request may close these issues.

Missing memory tracking in parquet PageReader can cause OOM
3 participants