Account for memory allocated by trino s3 staging output stream #14407
Conversation
Force-pushed da5abd6 to d9ca5e1
Force-pushed c19095e to 6f3e3cd
Force-pushed 6f3e3cd to 4e4368d
lib/trino-filesystem/src/main/java/io/trino/filesystem/hdfs/HdfsOutputFile.java (outdated; resolved)
            implements MemoryAware
    {
        private final FSDataOutputStream delegate;
        private final Supplier<Long> getOutputStreamRetainedSizeInBytesSupplier;
getOutputStreamRetainedSizeInBytesSupplier -> retainedSizeInBytesSupplier
    }

    @Override
    public String location()
    {
        return path;
    }

    private static class MemoryAwareOutputStreamWrapper
you can just extend io.trino.hdfs.TrinoFileSystemCache.OutputStreamWrapper
> you can just extend io.trino.hdfs.TrinoFileSystemCache.OutputStreamWrapper
It is private and I didn't want to change that, but sure, I can do it.
> It is private and I didn't want to change that, but sure, I can do it.
Oh, but I mean you can make OutputStreamWrapper do what MemoryAwareOutputStreamWrapper does (provide the MemoryAware interface for the wrapper stream).
oh, right
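For context, a rough sketch of where this discussion was heading. It assumes MemoryAware is a single-method interface as introduced by this PR; the real wrapper would extend the Hadoop FSDataOutputStream-based OutputStreamWrapper in TrinoFileSystemCache, while this self-contained version uses FilterOutputStream to keep the sketch short:

import java.io.FilterOutputStream;
import java.io.OutputStream;
import java.util.function.Supplier;

import static java.util.Objects.requireNonNull;

// Assumed shape of the MemoryAware interface discussed in this PR
interface MemoryAware
{
    long getRetainedSizeInBytes();
}

// Illustrative wrapper: delegates writes to the underlying stream and
// exposes the retained size reported by a supplier
class MemoryAwareOutputStreamWrapper
        extends FilterOutputStream
        implements MemoryAware
{
    private final Supplier<Long> retainedSizeInBytesSupplier;

    MemoryAwareOutputStreamWrapper(OutputStream delegate, Supplier<Long> retainedSizeInBytesSupplier)
    {
        super(delegate);
        this.retainedSizeInBytesSupplier = requireNonNull(retainedSizeInBytesSupplier, "retainedSizeInBytesSupplier is null");
    }

    @Override
    public long getRetainedSizeInBytes()
    {
        return retainedSizeInBytesSupplier.get();
    }
}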
@@ -40,6 +42,8 @@
    private final String outputPath;
    private final TrinoFileSystem fileSystem;

+   private Supplier<Long> getOutputStreamRetainedSizeInBytesSupplier;
getOutputStreamRetainedSizeInBytesSupplier -> outputStreamRetainedSizeInBytesSupplier
    @Override
    public long getRetainedSizeInBytes()
    {
        // TODO find a better way to estimate this
If transferManager does not hold on to memory between TrinoS3StagingOutputStream calls, then it should be 0.
I was not sure about this, but you are probably right.
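If that assumption holds, the accounting in the staging stream would reduce to something like this (a sketch at this point in the review, when the stream still exposed getRetainedSizeInBytes; it assumes buffering happens only in the local temp file rather than on the heap):

@Override
public long getRetainedSizeInBytes()
{
    // the staging output stream writes to a local temp file,
    // so no significant heap memory is retained between calls
    return 0;
}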
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergParquetFileWriter.java (outdated; resolved)
Force-pushed 4e4368d to 32e2c21
@@ -401,7 +401,7 @@ public RemoteIterator<LocatedFileStatus> listFiles(Path path, boolean recursive)
        }
    }

-   private static class OutputStreamWrapper
+   public static class OutputStreamWrapper
keep it private
public class OutputStreamOrcDataSink
        implements OrcDataSink
{
    private static final int INSTANCE_SIZE = toIntExact(ClassLayout.parseClass(OutputStreamOrcDataSink.class).instanceSize());

    private final OutputStreamSliceOutput output;
    private final Supplier<Long> getOutputStreamRetainedSizeInBytesSupplier;
getOutputStreamRetainedSizeInBytesSupplier -> outputStreamRetainedSizeInBytesSupplier
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergParquetFileWriter.java (outdated; resolved)
I think a clearer approach is to use
then in . This way there are fewer intermediate layers and
I can't because
Force-pushed e3c9d0a to 247bcee
You can add
That is not the only problem. Those
and simply creating instances of
Force-pushed 571ff0d to 81ae13f
@sopel39 please take another look.
in
lib/trino-filesystem/src/main/java/io/trino/filesystem/fileio/ForwardingOutputFile.java (resolved)
        return environment.doAs(context.getIdentity(), () -> fileSystem.create(file, overwrite));
    }

    public static FileSystem getRawFileSystem(FileSystem fileSystem)
private
lib/trino-filesystem/src/main/java/io/trino/filesystem/MemoryAwareFileSystem.java (resolved)
import java.io.IOException;
import java.io.OutputStream;

public interface TrinoOutputFile
{
-   OutputStream create()
+   OutputStream create(AggregatedMemoryContext aggregatedMemoryContext)
AggregatedMemoryContext aggregatedMemoryContext -> AggregatedMemoryContext memoryContext
            throws IOException;

-   OutputStream createOrOverwrite()
+   OutputStream createOrOverwrite(AggregatedMemoryContext aggregatedMemoryContext)
AggregatedMemoryContext aggregatedMemoryContext -> AggregatedMemoryContext memoryContext
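Putting the two diff excerpts and the renames together, the suggested interface shape would look roughly like this (a sketch limited to the methods visible above; other members of TrinoOutputFile are omitted):

import java.io.IOException;
import java.io.OutputStream;

import io.trino.memory.context.AggregatedMemoryContext;

public interface TrinoOutputFile
{
    OutputStream create(AggregatedMemoryContext memoryContext)
            throws IOException;

    OutputStream createOrOverwrite(AggregatedMemoryContext memoryContext)
            throws IOException;
}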
@@ -40,27 +43,39 @@ public HdfsOutputFile(String path, HdfsEnvironment environment, HdfsContext cont
    }

    @Override
-   public OutputStream create()
+   public OutputStream create(AggregatedMemoryContext aggregatedMemoryContext)
AggregatedMemoryContext aggregatedMemoryContext -> AggregatedMemoryContext memoryContext
public class OutputStreamOrcDataSink
        implements OrcDataSink
{
    private static final int INSTANCE_SIZE = toIntExact(ClassLayout.parseClass(OutputStreamOrcDataSink.class).instanceSize());

    private final OutputStreamSliceOutput output;
    private final AggregatedMemoryContext aggregatedMemoryContext;
aggregatedMemoryContext -> outputMemoryContext
-   public OutputStreamOrcDataSink(OutputStream outputStream)
+   public OutputStreamOrcDataSink(OutputStream outputStream, AggregatedMemoryContext aggregatedMemoryContext)
Make this constructor private.
Add factory methods:

public static OutputStreamOrcDataSink create(TrinoOutputFile outputFile)
        throws IOException
{
    AggregatedMemoryContext context = newSimpleAggregatedMemoryContext();
    return new OutputStreamOrcDataSink(outputFile.create(context), context);
}

/**
 * Use only when output stream memory consumption tracking is not possible.
 */
public static OutputStreamOrcDataSink create(OutputStream stream)
{
    return new OutputStreamOrcDataSink(stream, newSimpleAggregatedMemoryContext());
}
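A call site would then look roughly like this (hypothetical surrounding code; fileSystem and outputPath follow the Iceberg writer diff further down):

OrcDataSink dataSink = OutputStreamOrcDataSink.create(fileSystem.newOutputFile(outputPath));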
        this.memoryContext = requireNonNull(aggregatedMemoryContext, "aggregatedMemoryContext is null")
                .newLocalMemoryContext(TrinoS3StreamingOutputStream.class.getSimpleName());
        this.buffer = new byte[0];
        this.memoryContext.setBytes(0);
not needed
plugin/trino-hive/src/main/java/io/trino/plugin/hive/s3/TrinoS3FileSystem.java (resolved)
        }
        else {
            this.buffer = new byte[0];
            memoryContext.setBytes(buffer.length);
nit: buffer.length -> 0
@@ -170,7 +174,7 @@ private IcebergFileWriter createParquetWriter(
                .collect(toImmutableList());

        try {
            OutputStream outputStream = fileSystem.newOutputFile(outputPath).create();
make it TrinoOutputFile outputFile = fileSystem.newOutputFile(outputPath)
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergFileWriterFactory.java (outdated; resolved)
Force-pushed 6ec2b56 to aef4280
lib/trino-filesystem/src/main/java/io/trino/filesystem/MemoryAwareFileSystem.java (resolved)
lib/trino-orc/src/main/java/io/trino/orc/OutputStreamOrcDataSink.java (outdated; resolved)
        this.memoryContext = requireNonNull(memoryContext, "memoryContext is null");
    }

    public static OutputStreamOrcDataSink create(TrinoOutputFile outputFile, AggregatedMemoryContext memoryContext)
Put the static constructor above the OutputStreamOrcDataSink constructor.
@@ -1491,7 +1503,8 @@ public TrinoS3StagingOutputStream(
            File tempFile,
            Consumer<PutObjectRequest> requestCustomizer,
            long multiPartUploadMinFileSize,
-           long multiPartUploadMinPartSize)
+           long multiPartUploadMinPartSize,
+           AggregatedMemoryContext memoryContext)
Remove? You don't account for memory in TrinoS3StagingOutputStream.
@@ -1634,6 +1652,9 @@ public TrinoS3StreamingOutputStream(
        this.uploadExecutor = requireNonNull(uploadExecutor, "uploadExecutor is null");
        this.buffer = new byte[0];
        this.initialBufferSize = 64;
+       this.memoryContext = requireNonNull(memoryContext, "memoryContext is null")
+               .newLocalMemoryContext(TrinoS3StreamingOutputStream.class.getSimpleName());
+       this.buffer = new byte[0];
buffer is already created above
    public void testThatTrinoS3FileSystemReportsConsumedMemory()
            throws IOException
    {
        AggregatedMemoryContext memoryContext = mock(AggregatedMemoryContext.class);
You don't really have to mock these. Use real objects (e.g. io.trino.memory.context.AggregatedMemoryContext#newRootAggregatedMemoryContext) with a reservation handler to track max allocation.
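A rough sketch of that suggestion, assuming the MemoryReservationHandler contract in io.trino.memory.context (reserveMemory / tryReserveMemory) and the newRootAggregatedMemoryContext(handler, guaranteedMemory) factory; the recording handler below stands in for the mock and tracks current and peak reservation:

import com.google.common.util.concurrent.Futures;
import com.google.common.util.concurrent.ListenableFuture;
import io.trino.memory.context.AggregatedMemoryContext;
import io.trino.memory.context.MemoryReservationHandler;

class RecordingMemoryReservationHandler
        implements MemoryReservationHandler
{
    private long reserved;
    private long maxReserved;

    @Override
    public ListenableFuture<Void> reserveMemory(String allocationTag, long delta)
    {
        updateReservation(delta);
        return Futures.immediateVoidFuture();
    }

    @Override
    public boolean tryReserveMemory(String allocationTag, long delta)
    {
        updateReservation(delta);
        return true;
    }

    private void updateReservation(long delta)
    {
        // track both the current reservation and the high-water mark
        reserved += delta;
        maxReserved = Math.max(maxReserved, reserved);
    }

    public long getReserved()
    {
        return reserved;
    }

    public long getMaxReserved()
    {
        return maxReserved;
    }
}

// in the test:
// AggregatedMemoryContext memoryContext =
//         AggregatedMemoryContext.newRootAggregatedMemoryContext(new RecordingMemoryReservationHandler(), 0);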
Force-pushed 7d0718e to 5dc2be3
@@ -54,7 +56,8 @@ public IcebergParquetFileWriter(
            CompressionCodecName compressionCodecName,
            String trinoVersion,
            String outputPath,
-           TrinoFileSystem fileSystem)
+           TrinoFileSystem fileSystem,
+           AggregatedMemoryContext memoryContext)
Similar as for OutputStreamOrcDataSink: create the AggregatedMemoryContext locally in ParquetFileWriter. You can then simplify calls to createParquetFileWriter.
@@ -75,10 +76,11 @@ public ParquetFileWriter(
            String trinoVersion,
            boolean useBatchColumnReadersForVerification,
            Optional<DateTimeZone> parquetTimeZone,
-           Optional<Supplier<ParquetDataSource>> validationInputFactory)
+           Optional<Supplier<ParquetDataSource>> validationInputFactory,
+           AggregatedMemoryContext memoryContext)
Create the memory context locally and override the getMemoryUsage method.
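A rough sketch of that shape inside ParquetFileWriter (illustrative only: the existing writer accounting is represented by a hypothetical writerRetainedBytes() helper, and getMemoryUsage is assumed to be the method named in the comment above):

import static io.trino.memory.context.AggregatedMemoryContext.newSimpleAggregatedMemoryContext;

// memory context created locally and handed to the output stream
private final AggregatedMemoryContext outputStreamMemoryContext = newSimpleAggregatedMemoryContext();

@Override
public long getMemoryUsage()
{
    // whatever the writer already reports, plus the bytes the output stream has reserved
    return writerRetainedBytes() + outputStreamMemoryContext.getBytes();
}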
plugin/trino-hive/src/test/java/io/trino/plugin/hive/s3/TestTrinoS3FileSystem.java (resolved)
            OutputStream outputStream = fs.create(new Path("s3n://test-bucket/test1"), memoryContext);
            outputStream.write(new byte[] {1, 2, 3, 4, 5, 6}, 0, 6);
        }
        assertThat(memoryReservationHandler.getReserved()).isGreaterThanOrEqualTo(64);
Why is reserved not 0? Memory leak?
Not a memory leak. In the test I simply don't close outputStream, so the memory context is not closed and thus not set to 0.
I did it on purpose to be able to catch the value, but I can do it better.
            fs.initialize(rootPath.toUri(), newEmptyConfiguration());
            fs.setS3Client(s3);
            OutputStream outputStream = fs.create(new Path("s3n://test-bucket/test1"), memoryContext);
            outputStream.write(new byte[] {1, 2, 3, 4, 5, 6}, 0, 6);
Close outputStream too.
Heh, I didn't notice this before answering the comment above ;) It is done now.
Force-pushed 5dc2be3 to 650cc11
lgtm % comments
plugin/trino-hive/pom.xml (outdated)
@@ -448,6 +448,13 @@
            <scope>test</scope>
        </dependency>

+       <dependency>
remove this dependency
@@ -1691,6 +1706,8 @@ public void close()

        try {
            flushBuffer(true);
+           memoryContext.setBytes(0);
+           memoryContext.close();
Just close is sufficient (no need for setBytes).
            outputStream.close();
        }
        assertThat(memoryReservationHandler.getReserved()).isEqualTo(0);
        assertThat(memoryReservationHandler.getSetValues()).contains(64L);
I would just store the max and make sure that assertThat(memoryReservationHandler.getMaxReserved()).isGreaterThan(0).
Checking that 64 is part of the reserved values is fragile (it depends on the implementation).
Force-pushed 650cc11 to f884858
Description
Account for memory allocated by TrinoS3FileSystems
Fixes: #14023
Non-technical explanation
Some queries that previously killed the cluster should now be terminated gracefully.
Release notes
( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text: