
[BREAKING] Fix for OOM error when reading from S3 #5613

Merged: 11 commits merged into deephaven:main on Jun 21, 2024

Conversation

@malhotrashivam (Contributor) commented on Jun 13, 2024:

Context:
In the current code, each S3Context object (created per thread and storing the data that thread uses for its operations) has a cache holding hard references to the last few fragments that thread has fetched from S3. By default, the cache holds 256 fragments. It was added to prevent recently fetched fragments from being freed by the garbage collector. All remaining fragments are referenced via a soft-reference-based cache shared across all threads.
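To make the memory behavior concrete, below is a minimal, hypothetical sketch of such a bounded per-thread cache that pins fragments with strong references. It is illustrative only, not Deephaven's actual implementation, and all names are made up.

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: a bounded per-thread cache holding *strong* references,
// in the spirit of the per-S3Context cache described above. Names are hypothetical.
final class HardReferenceFragmentCache {
    private static final int MAX_ENTRIES = 256; // the default size mentioned in this PR

    // Access-ordered LinkedHashMap gives simple LRU eviction once the cap is reached.
    private final Map<Long, ByteBuffer> cache =
            new LinkedHashMap<Long, ByteBuffer>(MAX_ENTRIES, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(final Map.Entry<Long, ByteBuffer> eldest) {
                    return size() > MAX_ENTRIES;
                }
            };

    ByteBuffer get(final long fragmentIndex) {
        return cache.get(fragmentIndex);
    }

    void put(final long fragmentIndex, final ByteBuffer fragment) {
        // The fragment stays strongly reachable until it is evicted in LRU order,
        // so the garbage collector can never reclaim it early.
        cache.put(fragmentIndex, fragment);
    }
}
```

Because every cached buffer stays strongly reachable until LRU eviction, the cache's full capacity is effectively pinned memory for each thread.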

Problem:
If a user configures a very large fragment size, such as 20 MB, the per-thread cache grows accordingly: 256 hard-referenced fragments of 20 MB each amount to roughly 5 GB per thread. Because these are hard references, the garbage collector cannot free them, which can lead to an Out of Memory error.

PR description:
This PR removes the thread-local cache of hard references; all caching is now done via a soft-reference-based cache shared across all threads.
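As an illustration of the replacement approach, here is a minimal, hypothetical sketch of a soft-reference-based cache shared across threads (again, not the actual implementation; names are made up). Soft references let the JVM clear cached fragments under memory pressure, which is what avoids the OOM:

```java
import java.lang.ref.SoftReference;
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative sketch only: a cache shared across all threads that holds fragments
// through SoftReferences, which the JVM may clear under memory pressure.
// Names are hypothetical; this is not the actual Deephaven implementation.
final class SharedSoftFragmentCache {
    private final ConcurrentMap<Long, SoftReference<ByteBuffer>> cache = new ConcurrentHashMap<>();

    ByteBuffer get(final long fragmentIndex) {
        final SoftReference<ByteBuffer> ref = cache.get(fragmentIndex);
        if (ref == null) {
            return null; // never cached (or already pruned)
        }
        final ByteBuffer fragment = ref.get();
        if (fragment == null) {
            // Cleared by the garbage collector; drop the stale entry so the map does not grow without bound.
            cache.remove(fragmentIndex, ref);
        }
        return fragment;
    }

    void put(final long fragmentIndex, final ByteBuffer fragment) {
        cache.put(fragmentIndex, new SoftReference<>(fragment));
    }
}
```

A miss (either never cached or already cleared by the GC) simply means the caller re-fetches the fragment from S3.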

Breaking:
The cache-related, S3-specific instructions have been removed from the API, namely S3Instructions#maxCacheSize() from the Java API and s3.S3Instructions.max_cache_size from the Python wrapper.
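For callers, migration amounts to dropping the removed option. A hedged Java sketch follows; maxCacheSize() is the method this PR removes, while the surrounding builder calls (regionName, fragmentSize) are assumptions shown only for context and should be checked against the current S3Instructions API:

```java
import io.deephaven.extensions.s3.S3Instructions;

public class S3InstructionsMigration {
    public static void main(String[] args) {
        // Before this PR (no longer compiles), the builder exposed a cache-size knob:
        //
        //   S3Instructions.builder()
        //           .regionName("us-east-1")
        //           .fragmentSize(20 << 20)   // 20 MiB fragments
        //           .maxCacheSize(256)        // removed by this PR
        //           .build();

        // After this PR: drop the maxCacheSize(...) call. Fragment caching is handled
        // internally by the shared soft-reference cache and is no longer configurable.
        // (regionName/fragmentSize are assumed builder methods, shown only for context.)
        final S3Instructions instructions = S3Instructions.builder()
                .regionName("us-east-1")
                .fragmentSize(20 << 20)
                .build();

        System.out.println(instructions);
    }
}
```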

@malhotrashivam added the bug (Something isn't working), parquet (Related to the Parquet integration), NoDocumentationNeeded, ReleaseNotesNeeded (Release notes are needed), and s3 labels on Jun 13, 2024
@malhotrashivam added this to the June 2024 milestone on Jun 13, 2024
@malhotrashivam self-assigned this on Jun 13, 2024
@malhotrashivam changed the title from "Removed local cache from S3 context objects" to "Fix for OOM error by removing caches holding hard references to S3 fragments" on Jun 13, 2024
chipkent (Member) previously approved these changes on Jun 17, 2024 and left a comment:

Python LGTM

@malhotrashivam changed the title from "Fix for OOM error by removing caches holding hard references to S3 fragments" to "Fix for OOM error when reading from S3" on Jun 20, 2024
devinrsmith previously approved these changes on Jun 20, 2024.

@rcaudy (Member) left a comment:

We discussed some minor changes, but I'm ready to approve this.

@malhotrashivam changed the title from "Fix for OOM error when reading from S3" to "[BREAKING] Fix for OOM error when reading from S3" on Jun 21, 2024
@malhotrashivam merged commit c735716 into deephaven:main on Jun 21, 2024 (15 checks passed)
@github-actions bot locked and limited the conversation to collaborators on Jun 21, 2024