Avoid over-retaining memory for strings in parquet writer
Use Binary.fromReusedByteBuffer so that DictionaryValuesWriter.PlainBinaryDictionaryValuesWriter
copies only the bytes of each position instead of retaining the entire underlying byte array of the Block
raunaqmorarka authored and wendigo committed May 2, 2024
1 parent f060427 commit 06505f2
Showing 1 changed file with 3 additions and 1 deletion.
@@ -41,7 +41,9 @@ public void write(Block block)
         for (int i = 0; i < block.getPositionCount(); i++) {
             if (!block.isNull(i)) {
                 Slice slice = type.getSlice(block, i);
-                Binary binary = Binary.fromConstantByteBuffer(slice.toByteBuffer());
+                // fromReusedByteBuffer must be used instead of fromConstantByteBuffer to avoid retaining entire
+                // base byte array of the Slice in DictionaryValuesWriter.PlainBinaryDictionaryValuesWriter
+                Binary binary = Binary.fromReusedByteBuffer(slice.toByteBuffer());
                 valuesWriter.writeBytes(binary);
                 getStatistics().updateStats(binary);
             }
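
The retention problem described in the commit message can be illustrated with a minimal sketch that uses only java.nio and none of the Parquet or Trino classes. The class name RetentionSketch and its helper methods are hypothetical, chosen for illustration. It assumes the writer's situation is analogous to a small ByteBuffer view over a large shared array: holding on to the view keeps the whole array reachable, while copying the viewed bytes keeps only those bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of Trino or Parquet.
final class RetentionSketch {
    // Wraps a region of `base` without copying. The returned buffer
    // references all of `base`, so storing it keeps the full array alive.
    static ByteBuffer view(byte[] base, int offset, int length) {
        return ByteBuffer.wrap(base, offset, length);
    }

    // Copies only the readable bytes of `buffer` into a new array.
    // duplicate() is used so the caller's position is left unchanged.
    static byte[] copyOf(ByteBuffer buffer) {
        byte[] copy = new byte[buffer.remaining()];
        buffer.duplicate().get(copy);
        return copy;
    }

    public static void main(String[] args) {
        // Stand-in for the large byte array backing a Block of strings.
        byte[] base = new byte[1024 * 1024];
        byte[] value = "hello".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(value, 0, base, 100, value.length);

        ByteBuffer small = view(base, 100, value.length);
        byte[] copied = copyOf(small);

        // The view still points at the full 1 MiB array.
        System.out.println(small.array().length); // prints 1048576
        // The copy holds only the five bytes of the value.
        System.out.println(copied.length); // prints 5
    }
}
```

If a long-lived structure such as a dictionary stored the view, the 1 MiB array would stay reachable for as long as the entry lived; storing the copy retains five bytes plus array overhead. This is the trade-off the change makes: a small per-value copy in exchange for not pinning the Block's backing array.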