Table merges fail when CDF is enabled due to column mismatches #2908
Comments
A more minimal example of the issue:

import polars as pl
from deltalake import DeltaTable

df = pl.DataFrame(
    {
        "id": [1, 2],
        "date": [1, 2],
    },
    schema={
        # setting data types to be equal fixes the error, i.e. int & int or date & date
        "id": pl.Int64,
        "date": pl.Date,
    },
)
table = df.to_arrow()
dt = DeltaTable.create(
    table_uri="union_error",
    schema=table.schema,
    mode="overwrite",
    partition_by=["id"],  # taking out partitioning fixes the error
    configuration={
        "delta.enableChangeDataFeed": "true",  # false fixes the error
    },
)
dt.merge(
    source=table,
    predicate="s.id = t.id",
    source_alias="s",
    target_alias="t",
).when_not_matched_insert_all().execute()

Running this gives the error
Modifications to the script that allow it to run without error:
The script also works fine if you have two
So the issue seems to be some weird mix of Date, partitioning, and CDF. 🤔
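One way to pin down which mix of the three factors triggers the failure is to try every on/off combination and record the outcome. The following is a minimal sketch, not part of the original report: `build_variants` and `run_matrix` are hypothetical helper names, and `attempt` is assumed to be a caller-supplied function that wraps the create-and-merge steps from the script above using the given variant (for example, mapping `"Date"` to `pl.Date` and passing `partition_by` and `configuration` to `DeltaTable.create`).

```python
from itertools import product


def build_variants():
    """Enumerate every on/off combination of the three suspected factors.

    The dictionary keys are illustrative; the caller decides how to map
    them onto the schema and the table-creation arguments.
    """
    variants = []
    for use_date, partitioned, cdf in product([False, True], repeat=3):
        variants.append(
            {
                # dtype of the "date" column: Date vs. the same dtype as "id"
                "date_dtype": "Date" if use_date else "Int64",
                # partition on "id" or not at all
                "partition_by": ["id"] if partitioned else None,
                # Change Data Feed on or off
                "configuration": {
                    "delta.enableChangeDataFeed": "true" if cdf else "false",
                },
            }
        )
    return variants


def run_matrix(attempt):
    """Call attempt(variant) for each variant and record whether it raised.

    `attempt` is a hypothetical callable supplied by the caller; it should
    create the table and run the merge for one variant.
    """
    results = []
    for variant in build_variants():
        try:
            attempt(variant)
            outcome = "ok"
        except Exception as exc:  # keep going so every combination is tried
            outcome = f"error: {exc}"
        results.append((variant, outcome))
    return results
```

Each `attempt` call should use a fresh `table_uri` so that one variant does not reuse a table created by another.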
Looks like this is the same issue as #2832
Environment
Delta-rs version: 0.20.0
Binding: Python
Bug
What happened:
Enabling CDF results in
_internal.DeltaError: Generic DeltaTable error: Error during planning: UNION
What you expected to happen:
How to reproduce it:
More details:
Turning off the delta.enableChangeDataFeed configuration makes the merge succeed, for reasons that are not yet clear.