-
Notifications
You must be signed in to change notification settings - Fork 169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Fix memory bloat caused by holding too many unclosed ArrowReaderIterator
s
#929
Conversation
} | ||
|
||
batch = nextBatch() | ||
if (batch.isEmpty) { | ||
close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because I am paranoid and want to close the iterator early. I can revert changes to this file if you don't like it :)
var currentReadIterator: ArrowReaderIterator = null | ||
|
||
// Closes last read iterator after the task is finished. | ||
// We need to close read iterator during iterating input streams, | ||
// instead of one callback per read iterator. Otherwise if there are too many | ||
// read iterators, it may blow up the call stack and cause OOM. | ||
context.addTaskCompletionListener[Unit] { _ => | ||
if (currentReadIterator != null) { | ||
currentReadIterator.close() | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the issue description, this is the major fix, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
This looks good to me. @Kontinuation Could you rebase this? I would like to make sure this change can pass CI after latest CI change in main branch. Thank you. |
1674eb9
to
86aee79
Compare
Rebased. One of the tests failed because of failing to download dependencies from maven central, it should be transient. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #929 +/- ##
=============================================
+ Coverage 34.03% 54.87% +20.83%
+ Complexity 883 853 -30
=============================================
Files 113 109 -4
Lines 43170 10707 -32463
Branches 9516 2053 -7463
=============================================
- Hits 14693 5875 -8818
+ Misses 25471 3798 -21673
+ Partials 3006 1034 -1972 ☔ View full report in Codecov by Sentry. |
Thanks. I re-triggered the test. Usually it will pass after that. |
Thanks @Kontinuation |
Which issue does this PR close?
Closes #927.
Rationale for this change
Please refer to the comments of #927 for details.
What changes are included in this PR?
ArrowReaderIterator
inCometBlockStoreShuffleReader
.ArrowReaderIterator
when reading the end of the stream.How are these changes tested?
It is pretty hard to add tests for this fix, so we manually tested this and relying on existing tests to make sure that it does not break anything.