write_deltalake with rust engine fails when mode is append and overwrite schema is enabled #2553

Closed
rtyler opened this issue May 29, 2024 · 0 comments · Fixed by #2554
Labels
binding/python Issues for the Python package bug Something isn't working

rtyler commented May 29, 2024

Environment

Delta-rs version: current

Binding: python


Bug

What happened:

When switching a Python writer to use the Rust engine (while testing #2486), I discovered that the following code raises an error only with the Rust engine; the PyArrow engine accepts it:

@pytest.mark.parametrize("engine", ["pyarrow", "rust"])
def test_roundtrip_with_overwrite_schema(
    tmp_path: pathlib.Path, sample_data: pa.Table, engine: Literal["pyarrow", "rust"]
):
    write_deltalake(tmp_path, sample_data, mode='append', overwrite_schema=True, engine=engine)

Fails with:

_internal.DeltaError: Generic DeltaTable error: Schema overwrite not supported for Append

What you expected to happen:

I would expect the engine change to be seamless.

More details:

🤕

@rtyler rtyler added bug Something isn't working binding/python Issues for the Python package labels May 29, 2024
@rtyler rtyler added this to the Change Data Capture Support milestone May 29, 2024
@rtyler rtyler self-assigned this May 29, 2024
rtyler added a commit to rtyler/delta-rs that referenced this issue May 29, 2024
…rect behavior

Uses of mode='append' and overwrite_schema=True lead to inconsistent
behavior between Rust and PyArrow engines for write_deltalake. In the
PyArrow case the parameter is quietly omitted so users may experience
unexpected behavior since schemas will not actually be overwritten.

Users of this parameter set most likely want schema_mode='merge', which
would allow for schema evolution on appends to a Delta Table.

Fixes delta-io#2553
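
For illustration only, here is a minimal sketch of what engine-independent argument checking could look like, so that both engines treat mode='append' with overwrite_schema=True the same way. The helper name resolve_schema_mode and its error message are hypothetical and are not part of deltalake; the actual fix in the referenced commit may be implemented differently.

```python
from typing import Optional


def resolve_schema_mode(
    mode: str,
    overwrite_schema: bool = False,
    schema_mode: Optional[str] = None,
) -> Optional[str]:
    """Hypothetical helper (not part of deltalake).

    Validates the schema-related arguments before any engine is chosen,
    so the result cannot differ between the PyArrow and Rust engines.
    """
    if overwrite_schema:
        if mode == "append":
            # Reject the combination up front instead of silently
            # ignoring the flag in one engine and erroring in the other.
            raise ValueError(
                "overwrite_schema=True is not supported with mode='append'; "
                "schema_mode='merge' may be what you want"
            )
        # Assumption: the legacy flag maps onto schema_mode='overwrite'.
        return "overwrite"
    return schema_mode
```

With a check like this, an append that needs schema evolution would pass schema_mode='merge' instead of overwrite_schema=True, which is the alternative the commit message suggests.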
rtyler added a commit that referenced this issue May 29, 2024
…rect behavior

Fixes #2553