scan: fix error when reading an empty table #608
Just some notes for reviewers: I'm not 100% sure that this is the best approach to fix this issue. I've tried to follow the same approach used in the Java and Python implementations, but I don't know if there is a better way to implement it in Rust. Another point: I'm a bit confused about where I should write a test case for this issue.
39b8da9 to abcd7f8
Thanks for the contribution! Do we need to address this inside scan though? Why let someone build a […] This can be handled instead in the code that invokes […]
let scan_builder = table.scan();
// (customize builder here if reqd)...
let Ok(scan) = scan_builder.build() else {
    // No scan could be built, so return an empty stream.
    return Ok(stream::empty().boxed());
};
scan.plan_files()
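The caller-side pattern suggested in this comment can be tried in isolation with a minimal sketch. Everything here is a hypothetical stand-in, not the iceberg-rust API: `Table`, `Snapshot`, `TableScanBuilder`, `TableScan`, `FileScanTask` and `planned_tasks` are simplified types invented for illustration, `build()` is assumed to fail when there is no current snapshot, and a `Vec` replaces the boxed stream so the example needs only the standard library.

```rust
// Hypothetical stand-ins for illustration only; not the iceberg-rust types.
#[derive(Debug, Clone, PartialEq)]
pub struct FileScanTask {
    pub data_file_path: String,
}

pub struct Snapshot {
    pub data_files: Vec<String>,
}

pub struct Table {
    pub current_snapshot: Option<Snapshot>,
}

pub struct TableScanBuilder<'a> {
    table: &'a Table,
}

pub struct TableScan<'a> {
    snapshot: &'a Snapshot,
}

impl Table {
    pub fn scan(&self) -> TableScanBuilder<'_> {
        TableScanBuilder { table: self }
    }
}

impl<'a> TableScanBuilder<'a> {
    // Assumption: building a scan fails when the table has no snapshot.
    pub fn build(self) -> Result<TableScan<'a>, String> {
        self.table
            .current_snapshot
            .as_ref()
            .map(|snapshot| TableScan { snapshot })
            .ok_or_else(|| "table has no current snapshot".to_string())
    }
}

impl<'a> TableScan<'a> {
    // One task per data file; a Vec stands in for the async stream.
    pub fn plan_files(&self) -> Vec<FileScanTask> {
        self.snapshot
            .data_files
            .iter()
            .map(|path| FileScanTask { data_file_path: path.clone() })
            .collect()
    }
}

// Caller-side handling: a failed build is turned into an empty plan.
pub fn planned_tasks(table: &Table) -> Result<Vec<FileScanTask>, String> {
    let scan_builder = table.scan();
    let Ok(scan) = scan_builder.build() else {
        return Ok(Vec::new());
    };
    Ok(scan.plan_files())
}
```

One trade-off visible in this sketch: `let Ok(scan) = ... else` treats every build error as "no files", so a caller that needs to distinguish a missing snapshot from other failures would have to match on the error instead.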
Hi @sdd, thanks for your review. I agree that it would be better to fix this edge case with a smaller change, but I'm not sure if I understand your suggestion correctly. The idea would be to make the callers of […] Just adding another idea: would it make sense to return an error like […]
Just to clarify, not having any snapshots is not necessarily the same as not having any data. If there is no current snapshot then there can't be any data, but someone could delete all data from a table, resulting in there being a snapshot, but no data. The existing code would handle this second case just fine - we only need to handle the issue of no snapshots.
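The distinction drawn in this comment can be written down as a small sketch. `Snapshot`, `TableState` and `classify` are hypothetical names introduced only for illustration, and representing a snapshot as a list of data file paths is a simplifying assumption.

```rust
// Hypothetical model: a snapshot is reduced to its list of data file paths.
pub struct Snapshot {
    pub data_files: Vec<String>,
}

#[derive(Debug, PartialEq)]
pub enum TableState {
    /// The table has no current snapshot, so there cannot be any data.
    NoSnapshot,
    /// A snapshot exists, but all data has been deleted.
    SnapshotWithoutData,
    /// A snapshot exists and references at least one data file.
    SnapshotWithData,
}

pub fn classify(current_snapshot: Option<&Snapshot>) -> TableState {
    match current_snapshot {
        None => TableState::NoSnapshot,
        Some(snapshot) if snapshot.data_files.is_empty() => TableState::SnapshotWithoutData,
        Some(_) => TableState::SnapshotWithData,
    }
}
```

Only the `NoSnapshot` arm corresponds to the case this pull request needs to handle; the comment states that the existing code already copes with a snapshot that has no data.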
abcd7f8 to a72aae6
@sdd I've changed the code to return a […]
We've been very selective when it comes to adding new values to […]
Previously, the TableScan struct required a Snapshot to plan files, and for empty tables without a snapshot an error was returned instead of an empty result.
Following the same approach as the Java [0] and Python [1] implementations, this commit changes the snapshot property to accept None values, and the plan_files method now returns an empty stream if the snapshot is not present on the PlanContext.
[0] https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/SnapshotScan.java#L119
[1] https://github.com/apache/iceberg-python/blob/main/pyiceberg/table/__init__.py#L1979
Fixes: #580
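A minimal sketch of the behaviour described in this commit message, under stated assumptions: the struct layouts below are hypothetical simplifications (only the names TableScan, PlanContext, Snapshot and plan_files come from the description), `FileScanTask` is reduced to a single path field, and a boxed iterator stands in for the stream so the example needs only the standard library.

```rust
// Hypothetical simplifications; the real structs carry more state.
#[derive(Debug, Clone, PartialEq)]
pub struct FileScanTask {
    pub data_file_path: String,
}

pub struct Snapshot {
    pub data_files: Vec<String>,
}

pub struct PlanContext {
    // The snapshot is optional: a table with no snapshot has nothing to plan.
    pub snapshot: Option<Snapshot>,
}

pub struct TableScan {
    pub plan_context: PlanContext,
}

impl TableScan {
    // Returns an empty sequence instead of an error when no snapshot exists.
    pub fn plan_files(&self) -> Box<dyn Iterator<Item = FileScanTask> + '_> {
        let Some(snapshot) = self.plan_context.snapshot.as_ref() else {
            return Box::new(std::iter::empty());
        };
        Box::new(
            snapshot
                .data_files
                .iter()
                .map(|path| FileScanTask { data_file_path: path.clone() }),
        )
    }
}
```

With this shape, callers iterate over the result in the same way whether or not the table has a snapshot, which is the empty-result behaviour the description attributes to the Java and Python implementations.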