Handle empty Iceberg tables while executing procedures #13582
@@ -1194,6 +1194,10 @@ public void executeRemoveOrphanFiles(ConnectorSession session, IcebergTableExecu
IcebergConfig.REMOVE_ORPHAN_FILES_MIN_RETENTION,
IcebergSessionProperties.REMOVE_ORPHAN_FILES_MIN_RETENTION);

if (table.currentSnapshot() == null) {
remove_orphan_files procedure still fails on file metastore due to lack of /data directory. Could you leave a TODO comment?
Why would it be specific to the TESTING_FILE_METASTORE?
Do you have a stack trace?
The query and stacktrace:
trino:tpch> CREATE TABLE test (c1 int);
trino:tpch> ALTER TABLE test EXECUTE remove_orphan_files(retention_threshold => '7d');
Query 20220810_221426_00026_hjz7i failed: Failed accessing data for table: tpch.test
io.trino.spi.TrinoException: Failed accessing data for table: tpch.test
at io.trino.plugin.iceberg.IcebergMetadata.scanAndDeleteInvalidFiles(IcebergMetadata.java:1255)
at io.trino.plugin.iceberg.IcebergMetadata.removeOrphanFiles(IcebergMetadata.java:1215)
at io.trino.plugin.iceberg.IcebergMetadata.executeRemoveOrphanFiles(IcebergMetadata.java:1198)
at io.trino.plugin.iceberg.IcebergMetadata.executeTableExecute(IcebergMetadata.java:1116)
at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorMetadata.executeTableExecute(ClassLoaderSafeConnectorMetadata.java:216)
at io.trino.metadata.MetadataManager.executeTableExecute(MetadataManager.java:345)
at io.trino.operator.SimpleTableExecuteOperator.getOutput(SimpleTableExecuteOperator.java:128)
at io.trino.operator.Driver.processInternal(Driver.java:410)
at io.trino.operator.Driver.lambda$process$10(Driver.java:313)
at io.trino.operator.Driver.tryWithLock(Driver.java:698)
at io.trino.operator.Driver.process(Driver.java:305)
at io.trino.operator.Driver.processForDuration(Driver.java:276)
at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:740)
at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:164)
at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:490)
at io.trino.$gen.Trino_testversion____20220810_221353_71.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.io.FileNotFoundException: File /var/folders/8s/dkvf18z55lj_9yxhy1n54sph0000gn/T/TrinoTest1711210759979949419/iceberg_data/tpch/test-b3dc0ba83a6542229b672271f21d09eb/data does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:489)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1910)
at org.apache.hadoop.fs.FileSystem$4.<init>(FileSystem.java:2072)
at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:2071)
at org.apache.hadoop.fs.ChecksumFileSystem.listLocatedStatus(ChecksumFileSystem.java:700)
at org.apache.hadoop.fs.FileSystem$5.<init>(FileSystem.java:2183)
at org.apache.hadoop.fs.FileSystem.listFiles(FileSystem.java:2180)
at io.trino.plugin.hive.fs.TrinoFileSystemCache$FileSystemWrapper.listFiles(TrinoFileSystemCache.java:386)
at io.trino.plugin.iceberg.IcebergMetadata.scanAndDeleteInvalidFiles(IcebergMetadata.java:1239)
... 18 more
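The trace above fails while listing the table's data directory, which does not exist for a table that has never been written to. The following is a minimal sketch of that failure mode and of one way to tolerate it. It is not the change made in this PR: it assumes java.nio.file as a stand-in for the Hadoop FileSystem calls in the trace, lists only one directory level, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class MissingDataDirSketch {
    /**
     * Hypothetical helper: lists the entries directly under <tableLocation>/data.
     * Returns an empty list when the directory is missing instead of failing.
     */
    public static List<Path> listDataFiles(Path tableLocation) {
        Path dataDir = tableLocation.resolve("data");
        if (!Files.isDirectory(dataDir)) {
            // A table without any written data may have no data directory at all.
            return List.of();
        }
        // Files.list is not recursive; a real scan would likely need to walk subdirectories.
        try (Stream<Path> entries = Files.list(dataDir)) {
            return entries.collect(Collectors.toList());
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A location that does not exist, similar to the freshly created table in the trace.
        Path missing = Path.of(System.getProperty("java.io.tmpdir"), "no-such-table-" + System.nanoTime());
        System.out.println(listDataFiles(missing));
    }
}
```

Without the isDirectory check, Files.list would throw an IOException for the missing directory, which mirrors the FileNotFoundException raised by RawLocalFileSystem.listStatus in the trace.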
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
...roduct-tests/src/main/java/io/trino/tests/product/iceberg/TestIcebergSparkCompatibility.java
If a table was just created it may not contain any snapshots. Procedures run on tables that do not contain any snapshots can safely do nothing.
7594677 to 9dee7a0
All set, thanks for the reviews
@alexjo2144 can you please suggest RN wording?
CI hit #13556
@findepi added a suggestion to the PR description
Description
If a table was just created it may not contain any snapshots. Procedures run on tables that do not contain any snapshots can safely do nothing.
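The idea described above can be sketched as an early return when the table has no current snapshot. This is an illustrative sketch only, not the code from IcebergMetadata: TableView and runProcedure are hypothetical names, and TableView is a simplified stand-in for a table handle whose current snapshot is null until the first write.

```java
import java.util.ArrayList;
import java.util.List;

public class EmptyTableGuardSketch {
    /** Hypothetical stand-in for a table handle; not the Iceberg Table interface. */
    public interface TableView {
        /** Returns the current snapshot id, or null if the table has never been written to. */
        Long currentSnapshotId();
    }

    /**
     * Hypothetical procedure body. Returns the actions it took so that callers
     * can see whether any work happened.
     */
    public static List<String> runProcedure(TableView table) {
        List<String> actions = new ArrayList<>();
        if (table.currentSnapshotId() == null) {
            // A freshly created table has no snapshots, so there is nothing to clean up.
            return actions;
        }
        actions.add("scan files for snapshot " + table.currentSnapshotId());
        return actions;
    }

    public static void main(String[] args) {
        System.out.println(runProcedure(() -> null)); // table with no snapshots
        System.out.println(runProcedure(() -> 42L));  // table with one snapshot
    }
}
```

Returning early keeps the procedure a no-op for empty tables, so it never reaches the file listing that failed in the stack trace discussed in the review.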
Fix
Iceberg connector
Prevent query failure in the edge case where a table is empty and has no history.
Related issues, pull requests, and links
Related to: #13576
Documentation
(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.
Release notes
( ) No release notes entries required.
(x) Release notes entries required with the following suggested text: