-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Iceberg statistics loading after snapshot expiration #17356
Fix Iceberg statistics loading after snapshot expiration #17356
Conversation
Previously getting stats would fail with NPE: ``` Caused by: java.lang.NullPointerException: Cannot invoke "org.apache.iceberg.Snapshot.parentId()" because the return value of "org.apache.iceberg.Table.snapshot(long)" is null at io.trino.plugin.iceberg.TableStatisticsReader$2.computeNext(TableStatisticsReader.java:328) at io.trino.plugin.iceberg.TableStatisticsReader$2.computeNext(TableStatisticsReader.java:322) at com.google.common.collect.AbstractSequentialIterator.next(AbstractSequentialIterator.java:74) at io.trino.plugin.iceberg.TableStatisticsReader$1.computeNext(TableStatisticsReader.java:305) at io.trino.plugin.iceberg.TableStatisticsReader$1.computeNext(TableStatisticsReader.java:284) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:146) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:141) at io.trino.plugin.iceberg.TableStatisticsReader.readNdvs(TableStatisticsReader.java:232) at io.trino.plugin.iceberg.TableStatisticsReader.makeTableStatistics(TableStatisticsReader.java:160) at io.trino.plugin.iceberg.TableStatisticsReader.getTableStatistics(TableStatisticsReader.java:93) at io.trino.plugin.iceberg.IcebergMetadata.lambda$getTableStatistics$77(IcebergMetadata.java:2501) at java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1708) at io.trino.plugin.iceberg.IcebergMetadata.getTableStatistics(IcebergMetadata.java:2481) ```
@Test | ||
public void testShowStatsAsOf() | ||
{ | ||
Session writeSession = withStatsOnWrite(getSession(), false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: extract "show_stats_as_of"
to a variable.
('key', null, 4, 0, null, '1', '5'), -- NDV present, stats "inherited" from previous snapshot | ||
(null, null, null, null, 5, null, null)"""); | ||
|
||
// Re-analyzing after snapshot expired |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which "snapshot expired" are you referring to here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some snapshot expired
// Snapshot referenced by `previous` had no parent. | ||
return null; | ||
} | ||
return verifyNotNull(snapshot.parentId(), "snapshot.parentId()"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
verifyNotNull
seem redundant here - given that we check above
if (snapshot.parentId() == null)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, obviously.
it's here for readability & future proofing that we do not return end-of-data on this line accidentally.
otherwise i could just have
return snapshot.parentId();
without preceding if
on that value
Does this need release notes? @findepi |
yes, thanks! |
Previously getting stats would fail with NPE: