Use write api #20800
/test-with-secrets sha=456142157c288d25d917965279ad2627008d7c43
Force-pushed from 4561421 to d8f54d8
The CI workflow run with tests that require additional secrets finished as failure: https://github.com/trinodb/trino/actions/runs/7999002596
/test-with-secrets sha=d8f54d89f31deee461f75758672ec579d0d9333c
The CI workflow run with tests that require additional secrets finished as failure: https://github.com/trinodb/trino/actions/runs/7999104709
/test-with-secrets sha=db78942fd7aa85ae23b51c3521f3eb5d82f66e7e
Can you update the file system tests to verify that the file doesn’t exist until the output stream is closed?
Force-pushed from db78942 to bfb3b1f
Force-pushed from bfb3b1f to bf9f124
lib/trino-filesystem-gcs/src/test/java/io/trino/filesystem/gcs/TestGcsFileSystemConfig.java
lib/trino-filesystem-gcs/src/main/java/io/trino/filesystem/gcs/GcsStorageFactory.java
lib/trino-filesystem-gcs/src/test/java/io/trino/filesystem/gcs/TestGcsFileSystemGcs.java
/test-with-secrets sha=bf9f124989286f046fece43bad73613d3d38f788
Force-pushed from bf9f124 to 4f02b29
The CI workflow run with tests that require additional secrets finished as failure: https://github.com/trinodb/trino/actions/runs/8001732829
Force-pushed from 4f02b29 to 7e7d44c
/test-with-secrets sha=7e7d44c1a71902875866fab97ae1bc27c87bac69
The CI workflow run with tests that require additional secrets finished as failure: https://github.com/trinodb/trino/actions/runs/8019094596
}

@Test
@Disabled("This test takes 10 minutes to run locally")
writing 32M takes 10 minutes?
Yep (!): the test writes the file and reads it back every 8 KB to verify that it's not changed until the output stream is closed.
would it suffice to verify it once just before writing last 8k,
or after writing last 8k and before doing close (= committing the write)?
Good idea! Trying that out and will push
and now it takes 18 seconds :) removing @Disabled. Thanks for that!
14 seconds for the other test, enabled that one as well.
Now down to about 9.7 - 10 seconds
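For illustration, the "check once, just before close" idea from this thread can be sketched in isolation. This is only a sketch: `CommitOnCloseStore` and `OverwriteCheck` are hypothetical in-memory stand-ins invented here, not the Trino file system API or the code in this PR. The store publishes a blob only when its output stream is closed, and the check reads the existing content a single time after the last write.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory store; it only mimics "commit on close" semantics.
final class CommitOnCloseStore
{
    private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();

    void put(String name, byte[] content)
    {
        blobs.put(name, content.clone());
    }

    byte[] read(String name)
    {
        return blobs.get(name);
    }

    // Buffers all writes in memory and replaces the stored blob only in close().
    OutputStream createOrOverwrite(String name)
    {
        return new ByteArrayOutputStream()
        {
            @Override
            public void close()
            {
                blobs.put(name, toByteArray());
            }
        };
    }
}

final class OverwriteCheck
{
    private OverwriteCheck() {}

    // Overwrites `name` with totalBytes of zeros in 8 KiB chunks and returns the
    // content a reader sees after the last write but before close().
    static String contentSeenBeforeClose(CommitOnCloseStore store, String name, int totalBytes)
    {
        byte[] chunk = new byte[8192];
        try (OutputStream out = store.createOrOverwrite(name)) {
            for (int written = 0; written < totalBytes; written += chunk.length) {
                out.write(chunk);
            }
            // Single check; the return value is computed before the stream is closed.
            return new String(store.read(name), StandardCharsets.UTF_8);
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Reading the old content once at the end replaces one read per 8 KB chunk, which is the kind of change that brought the runtime down in this thread.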
lib/trino-filesystem/src/test/java/io/trino/filesystem/AbstractTestTrinoFileSystem.java
Force-pushed from efdaffa to c4e3a3a
lib/trino-filesystem-gcs/src/main/java/io/trino/filesystem/gcs/GcsOutputStream.java
for (int i = 0; i < bytes.length / 4; i++) {
    //noinspection NumericCastThatLosesPrecision
    bytes[i] = (byte) i;
}
I don't follow what is happening here.
Why are we filling only a quarter of the array?
This snippet is used in another method below as well - extract in a helper method.
Maybe the hint from #20800 (comment) was not correctly applied?
Seems like the hint was correct: the contents don't matter as long as they differ from the original contents. +1 on extracting, will do. Let me know if you have any concerns with any of this.
I don't follow what is happening here. Why are we filling only a quarter of the array?
This snippet is used in another method below as well - extract in a helper method.
Update: misread this. This was a bug, thanks for catching! Also extracted to a helper method.
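A minimal sketch of what such an extracted helper could look like, assuming the intent is to fill the whole array rather than a quarter of it. The class and method names (`PatternBytes`, `patterned`) are made up for this example and are not taken from the PR.

```java
// Hypothetical helper; the names are illustrative only.
final class PatternBytes
{
    private PatternBytes() {}

    // Returns an array of the given length where element i holds the low 8 bits of i.
    static byte[] patterned(int length)
    {
        byte[] bytes = new byte[length];
        // Loop over the full array (bytes.length, not bytes.length / 4).
        for (int i = 0; i < bytes.length; i++) {
            //noinspection NumericCastThatLosesPrecision
            bytes[i] = (byte) i;
        }
        return bytes;
    }
}
```

Both tests could then call the helper, for example `byte[] bytes = PatternBytes.patterned(8192);`.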
}
try (OutputStream outputStream = getFileSystem().newOutputFile(location).createOrOverwrite()) {
    // write a 32 mb file
    for (int i = 0; i < divide(32 * 1024 * 1024, bytes.length, UNNECESSARY); i++) {
Is 32 MB really necessary? We do control the size of the array being written (8K). Is there local caching that you're trying to avoid?
There may be some issues on the CI when testing from a GitHub runner which is far from the region of the GCS bucket.
These were originally written to test the GCS file system: the write buffer under the hood is 16 MB, so I wanted to choose a size that would ensure we exercise the case where the buffer flushes to storage. I can do 20 MB. Currently the 2 tests take 18 seconds and 14 seconds respectively; let me know if you'd like the size lowered.
We could make it 17MB or 18MB
}

@Test
public void testLargeFileDoesNotOverwriteUntilClosed()
testLargeFileDoesNotOverwriteUntilClosed
-> testLargeFileDoesIsNotOverwrittenUntilClosed
what about combining the 2? :) testLargeFileIsNotOverwrittenUntilClosed?
getFileSystem().deleteFile(location);
byte[] bytes = new byte[8192];
for (int i = 0; i < bytes.length / 4; i++) {
    //noinspection NumericCastThatLosesPrecision
Make this @SuppressWarnings
void testFileDoesNotExistUntilClosed()
        throws Exception
{
    Location location = getRootLocation().appendPath("testFile2");
Let's give these better names
public void testLargeFileDoesNotExistUntilClosed()
        throws IOException
{
    Location location = getRootLocation().appendPath("testFile3");
Same here
    bytes[i] = (byte) i;
}
try (OutputStream outputStream = getFileSystem().newOutputFile(location).create()) {
    // write a 32 mb file
Nit: use uppercase MB to match other usages
What buffer size do we care about here? If this is for a 16MB buffer, we should use 17MB, or 33MB for a 32MB buffer, to ensure we have written more than the buffer flush size.
16 MB, so I will try 17 MB. But that only applies to GCS.
Location location = getRootLocation().appendPath("testFile3");
getFileSystem().deleteFile(location);
byte[] bytes = new byte[8192];
for (int i = 0; i < bytes.length / 4; i++) {
This shouldn't be / 4
thanks! didn't catch that when I changed the function.
}
try (OutputStream outputStream = getFileSystem().newOutputFile(location).create()) {
    // write a 32 mb file
    for (int i = 0; i < divide(32 * 1024 * 1024, bytes.length, UNNECESSARY); i++) {
I don't understand what this divide() call does here
It was from a previous review suggestion above - to ensure that the size is divisible by the bytes.length.
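To make the intent concrete: `divide(x, y, UNNECESSARY)` here is presumably Guava's `IntMath.divide` with `RoundingMode.UNNECESSARY`, which throws `ArithmeticException` when the division is not exact. A dependency-free sketch of the same check follows; `ExactDivision` and `exactChunkCount` are hypothetical names used only for this example.

```java
// Hypothetical helper showing "divide, but fail if there is a remainder".
final class ExactDivision
{
    static final int MEGABYTE = 1024 * 1024;

    private ExactDivision() {}

    // Returns totalBytes / chunkBytes, or throws if the chunks do not fit exactly.
    static int exactChunkCount(int totalBytes, int chunkBytes)
    {
        if (chunkBytes <= 0) {
            throw new IllegalArgumentException("chunkBytes must be positive");
        }
        if (totalBytes % chunkBytes != 0) {
            throw new ArithmeticException(totalBytes + " is not a multiple of " + chunkBytes);
        }
        return totalBytes / chunkBytes;
    }
}
```

With an 8192-byte array, a 32 MB target works out to `ExactDivision.exactChunkCount(32 * ExactDivision.MEGABYTE, 8192)` loop iterations, and a target that is not a multiple of the array length fails fast instead of silently writing fewer bytes.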
}
try (OutputStream outputStream = getFileSystem().newOutputFile(location).create()) {
    // write a 32 mb file
    for (int i = 0; i < divide(32 * 1024 * 1024, bytes.length, UNNECESSARY); i++) {
How about we do
assertThat(MEGABYTE % bytes.length).isEqualTo(0);
// write a 33 MB file
int target = 33 * MEGABYTE;
int count = 0;
while (count < target) {
    outputStream.write(bytes);
    count += bytes.length;
    if (count % MEGABYTE == 0) {
        assertFalse(fileExistsInListing(location));
        assertFalse(fileExists(location));
    }
}
Nice! They both take about 10-11 seconds now.
supercool. do we need to check every megabyte (33 times)?
would we lose anything if we asserted only before final flush/close?
ie do we believe it is possible for a file to first show up, then disappear, and then show up again while we're writing?
I don't think so - updating now, will save this version just in case.
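As a rough sketch of the simplified shape discussed here (one existence check after the last write, before the stream is closed), assuming a store that only publishes a file on close. `StagingStore` and `ExistenceCheck` are hypothetical in-memory stand-ins, not the Trino `TrinoFileSystem` API; `exists` and `list` loosely mirror the `fileExists` and `fileExistsInListing` helpers from the suggestion above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical store: a file becomes visible only when its stream is closed.
final class StagingStore
{
    private final Map<String, byte[]> committed = new ConcurrentHashMap<>();

    boolean exists(String name)
    {
        return committed.containsKey(name);
    }

    Set<String> list()
    {
        return Set.copyOf(committed.keySet());
    }

    OutputStream create(String name)
    {
        return new ByteArrayOutputStream()
        {
            @Override
            public void close()
            {
                committed.put(name, toByteArray());
            }
        };
    }
}

final class ExistenceCheck
{
    static final int MEGABYTE = 1024 * 1024;

    private ExistenceCheck() {}

    // Writes targetBytes in 8 KiB chunks and reports whether the file was visible
    // (directly or in a listing) after the final write but before close().
    static boolean visibleBeforeClose(StagingStore store, String name, int targetBytes)
    {
        byte[] chunk = new byte[8192];
        try (OutputStream out = store.create(name)) {
            int count = 0;
            while (count < targetBytes) {
                out.write(chunk);
                count += chunk.length;
            }
            // One check instead of one per megabyte.
            return store.exists(name) || store.list().contains(name);
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The design choice rests on the assumption voiced in the thread: a file is not expected to appear, disappear, and reappear during a write, so a single check at the end should catch the same failures as 33 checks.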
public void testLargeFileDoesNotOverwriteUntilClosed()
        throws IOException
{
    Location location = getRootLocation().appendPath("testFile4");
Same
private boolean fileExistsInListing(Location location)
        throws IOException
{
    for (FileIterator fileIterator = getFileSystem().listFiles(getRootLocation()); fileIterator.hasNext(); ) {
Nit: this reads better as
FileIterator fileIterator = getFileSystem().listFiles(getRootLocation());
while (fileIterator.hasNext()) {
Force-pushed from 5759144 to 4520de2
        throws IOException
{
    if (!supportsIncompleteWriteNoClobber()) {
        return;
abort("...") to mark it skipped
        throws IOException
{
    if (!supportsIncompleteWriteNoClobber()) {
        return;
abort("...") to mark it skipped
}

@Test
@SuppressWarnings("NumericCastThatLosesPrecision")
Do not @SuppressWarnings on whole method (unless it's a very short method).
Use either @SuppressWarnings on "a variable declaration with assignment" or use comment-based suppression
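One way to read this suggestion, shown as a sketch: move the narrowing cast into a local variable declaration and annotate only that declaration. The class and method names are hypothetical. `NumericCastThatLosesPrecision` is the inspection name already used in the PR's `//noinspection` comment; compilers that do not recognize a `@SuppressWarnings` name are expected to ignore it.

```java
// Hypothetical example of narrowly scoped suppression; names are illustrative only.
final class LocalSuppression
{
    private LocalSuppression() {}

    static byte[] fill(int length)
    {
        byte[] bytes = new byte[length];
        for (int i = 0; i < bytes.length; i++) {
            // The annotation covers only this declaration, not the whole method.
            @SuppressWarnings("NumericCastThatLosesPrecision")
            byte value = (byte) i;
            bytes[i] = value;
        }
        return bytes;
    }
}
```

The comment-based alternative keeps the assignment as a single statement, with `//noinspection NumericCastThatLosesPrecision` on the line directly above it.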
// write a 17 MB file
int target = 17 * MEGABYTE;
Code-comment why 17 and not some other number. (The idea was to write more than assumed buffer size)
(same above)
if (count % MEGABYTE == 0) {
    assertEquals(fileSize, readFile(inputFile, fileBytes));
    assertEquals("test", new String(fileBytes, UTF_8));
it would be enough to do this once, before exiting from the try (OutputStream outputStream = getFileSystem().newOutputFile(location).createOrOverwrite()) { block
if (count % MEGABYTE == 0) {
    assertFalse(fileExistsInListing(location));
    assertFalse(fileExists(location));
}
it would be enough to do this once, before exiting from the try (OutputStream outputStream = getFileSystem().newOutputFile(location).createOrOverwrite()) { block
Force-pushed from 4520de2 to 340b057
there are (related) failures (eg in |
Force-pushed from 340b057 to c638adc
Force-pushed from c638adc to 78d370f
Description
Additional context and related issues
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text: