Spark: Fix changelog table bug for start time older than current snapshot #11564
Thank you, @Acehaidrey, for reporting this issue! It seems that the method
However, I believe there's an opportunity to revisit the
Hi @bryanck, do we still have time to include this bug fix in 1.7.1?
The 1.7.1 ship has sailed, but if this is a regression or high priority, we can try to get it in.
It is a regression, better to get it in. @Acehaidrey should be able to provide a quick fix soon, like today. Appreciate if you can hold 1.7.1 a bit more, @bryanck.
@flyrain + @Acehaidrey We expect this test to fail right? Like we need to still add a fix to this pr?
That's correct!
Yes I can work on this in about one hour if that’s okay
Hey @sfc-gh-ygu @flyrain @RussellSpitzer @bryanck, I have gone ahead and updated this. Please take a look when you can; the test passes now, as do the other tests. Sorry for the delays: we had a water leak at home today and it has been a mess.
...5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestChangelogTable.java
Fixed! Sorry I missed that
On Wed, Nov 20, 2024 at 7:10 PM, Yufei Gu commented on spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestChangelogTable.java:
> + + " 'start-timestamp', '%d',"
+ + " 'end-timestamp', '%d'"
+ + " ),"
+ + " changelog_view => 'test_changelog_view'"
+ + ")",
+ catalogName, tableName, startTime, endTime);
+
+ // Query the changelog view
+ List<Object[]> results =
+ sql(
+ "SELECT * FROM test_changelog_view WHERE _change_type IN ('INSERT', 'DELETE') ORDER BY _change_ordinal");
+
+ // Verify no changes are returned since our window is after the inserts
+ assertThat(results).as("Num records must be zero").isEmpty();
+
+
nit: remove the extra empty line?
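For readers without the full diff: the quoted fragment looks like the tail of a `create_changelog_view` call. The sketch below shows how such a statement could be assembled. The opening of the statement (`CALL %s.system.create_changelog_view(table => ..., options => map(`) is not visible in the quote and is assumed here from the documented procedure signature. The class `ChangelogViewCallSketch` and the method `buildCall` are hypothetical names used only for illustration.

```java
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of the PR or of Iceberg. */
final class ChangelogViewCallSketch {
  private ChangelogViewCallSketch() {}

  /**
   * Builds a create_changelog_view call restricted to a timestamp window.
   * Timestamps are epoch milliseconds and are passed as quoted option values,
   * mirroring the '%d' placeholders in the quoted test fragment.
   */
  static String buildCall(
      String catalogName, String tableName, long startMillis, long endMillis, String viewName) {
    return String.format(
        Locale.ROOT,
        "CALL %s.system.create_changelog_view("
            + "table => '%s', "
            + "options => map('start-timestamp', '%d', 'end-timestamp', '%d'), "
            + "changelog_view => '%s')",
        catalogName,
        tableName,
        startMillis,
        endMillis,
        viewName);
  }

  public static void main(String[] args) {
    // Example values are made up; a real test would use the table's actual names and times.
    System.out.println(buildCall("spark_catalog", "db.tbl", 100L, 200L, "test_changelog_view"));
  }
}
```

In the test itself the resulting statement is run through the suite's `sql(...)` helper and the view is then queried, as the quoted lines show.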
LGTM overall, left minor comments.
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java
Thank you @flyrain for all the help here. I took your advice and think it looks cleaner this way, if you don't mind taking one more look.
+1 pending tests
Thank you, I had to fix a formatting issue, so I pushed another update.
The bug occurred because we were not checking the timestamp of
Fix changelog table bug for start time older than current snapshot
Problem
After upgrading to Iceberg 1.5 and Spark 3.5.1, the create_changelog_view procedure occasionally returns records that were inserted outside (before) the specified time range. Specifically:
Changes
Added a unit test testChangelogViewOutsideTimeRange to TestChangelogTable that:
Testing
To invoke the unit test:
./gradlew :iceberg-spark:iceberg-spark-extensions-3.5_2.12:test --tests "org.apache.iceberg.spark.extensions.TestChangelogTable"
Related Issue
TBD
Additional Context
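As a rough illustration of the behavior the new test checks (a time window that starts after the last commit must yield no changes), here is a minimal sketch in plain Java. It does not use Iceberg classes and is not the logic in SparkScanBuilder. `ChangelogWindowSketch` and `commitsInWindow` are hypothetical names, and treating both bounds as inclusive is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Iceberg. */
final class ChangelogWindowSketch {
  private ChangelogWindowSketch() {}

  /**
   * Returns the commit timestamps (epoch millis) that fall inside the window.
   * Both bounds are treated as inclusive, which is an assumption of this sketch.
   */
  static List<Long> commitsInWindow(
      List<Long> commitTimestampsMillis, long startMillis, long endMillis) {
    if (startMillis > endMillis) {
      throw new IllegalArgumentException("start must not be after end");
    }
    List<Long> inWindow = new ArrayList<>();
    for (long commitMillis : commitTimestampsMillis) {
      if (commitMillis >= startMillis && commitMillis <= endMillis) {
        inWindow.add(commitMillis);
      }
    }
    return inWindow;
  }

  public static void main(String[] args) {
    List<Long> commits = List.of(1_000L, 2_000L, 3_000L);
    // Window lies entirely after the last commit, so no changes should be reported.
    System.out.println(commitsInWindow(commits, 4_000L, 5_000L));
    // Window overlaps the last two commits.
    System.out.println(commitsInWindow(commits, 1_500L, 3_000L));
  }
}
```

The first call models the scenario in the test: all inserts happened before the requested window, so the result should be empty rather than falling back to the existing snapshots.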