Improve memory usage accounting of parquet writer #18756

Merged
merged 5 commits into from
Aug 22, 2023

raunaqmorarka (Member) commented Aug 21, 2023

Description

Improve memory usage accounting of parquet writer

Additional context and related issues

Fixes #18557

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Hive, Delta, Iceberg
* Improve memory usage accounting of the Parquet writer. ({issue}`18756`)

Code related to writing dictionaries in parquet-mr is moved to
Trino to allow memory-accounting fixes in subsequent changes.
Existing functionality is preserved in this commit.

The initial writer holds on to memory for the dictionary.

When fallback to PLAIN encoding takes place on the first page,
there is no need to retain the DictionaryValuesWriter contents.
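
For illustration, the idea in the last two commit messages can be sketched roughly as follows. This is a simplified model, not the code from this PR: the class names (`SketchDictionaryWriter`, `SketchPlainWriter`, `SketchFallbackWriter`), the size threshold, and the byte estimates (string length, 4 bytes per dictionary id) are all assumptions made for the example. It shows a writer that starts with dictionary encoding, counts the initial writer's memory while it is in use, and drops the dictionary contents once it falls back to plain encoding.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these classes are Trino's or parquet-mr's actual API.
final class SketchDictionaryWriter {
    private final Map<String, Integer> dictionary = new LinkedHashMap<>();
    private final List<Integer> ids = new ArrayList<>();
    private final long maxDictionaryBytes;
    private long dictionaryBytes;

    SketchDictionaryWriter(long maxDictionaryBytes) {
        this.maxDictionaryBytes = maxDictionaryBytes;
    }

    void write(String value) {
        Integer id = dictionary.get(value);
        if (id == null) {
            id = dictionary.size();
            dictionary.put(value, id);
            dictionaryBytes += value.length(); // rough size estimate (assumption)
        }
        ids.add(id);
    }

    boolean shouldFallBack() {
        return dictionaryBytes > maxDictionaryBytes;
    }

    // Decodes the buffered ids back to values, in write order, for replay.
    List<String> bufferedValues() {
        List<String> keys = new ArrayList<>(dictionary.keySet());
        List<String> values = new ArrayList<>();
        for (int id : ids) {
            values.add(keys.get(id));
        }
        return values;
    }

    long retainedBytes() {
        return dictionaryBytes + 4L * ids.size(); // 4 bytes per id (assumption)
    }
}

final class SketchPlainWriter {
    private final List<String> values = new ArrayList<>();
    private long bytes;

    void write(String value) {
        values.add(value);
        bytes += value.length();
    }

    long retainedBytes() {
        return bytes;
    }
}

final class SketchFallbackWriter {
    private SketchDictionaryWriter initialWriter; // null once released
    private final SketchPlainWriter fallbackWriter = new SketchPlainWriter();

    SketchFallbackWriter(long maxDictionaryBytes) {
        this.initialWriter = new SketchDictionaryWriter(maxDictionaryBytes);
    }

    void write(String value) {
        if (initialWriter == null) {
            fallbackWriter.write(value);
            return;
        }
        initialWriter.write(value);
        if (initialWriter.shouldFallBack()) {
            // Replay buffered values as plain, then release the dictionary contents.
            for (String buffered : initialWriter.bufferedValues()) {
                fallbackWriter.write(buffered);
            }
            initialWriter = null;
        }
    }

    boolean usesDictionary() {
        return initialWriter != null;
    }

    // Reports the initial writer's memory for as long as it is retained.
    long retainedBytes() {
        long initialBytes = initialWriter == null ? 0 : initialWriter.retainedBytes();
        return initialBytes + fallbackWriter.retainedBytes();
    }
}
```

In this sketch, `retainedBytes()` includes the dictionary writer while it is still active, and the reported figure covers only the plain writer after fallback because the reference to the dictionary writer is cleared.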
raunaqmorarka merged commit 066417f into trinodb:master Aug 22, 2023
raunaqmorarka deleted the pqw-dict-mem branch August 22, 2023 06:19
github-actions bot added this to the 425 milestone Aug 22, 2023
Development

Successfully merging this pull request may close these issues.

Memory tracking issue: worker OOM in PrimitiveColumnWriter
3 participants