What happened:
When attempting a write_deltalake in overwrite mode with a simple predicate ("<column> = <value>"), everything works as expected if the value is an integer: any existing records under the same label are overwritten. If, however, the value is a string (see code below), write_deltalake (called through polars' write_delta) fails with "Invalid comparison operation: Utf8 == LargeUtf8":
Traceback (most recent call last):
File "<python-input-11>", line 1, in <module>
test()
~~~~^^
File "<python-input-3>", line 5, in test
d.write_delta(tempdir / 'test.delta', mode='overwrite', delta_write_options=dict(
~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
engine='rust',
^^^^^^^^^^^^^^
predicate='C = \'a\''
^^^^^^^^^^^^^^^^^^^^^
))
^^
File "/workspaces/debian-2/.venv/lib/python3.13/site-packages/polars/dataframe/frame.py", line 4305, in write_delta
write_deltalake(
~~~~~~~~~~~~~~~^
table_or_uri=target,
^^^^^^^^^^^^^^^^^^^^
...<5 lines>...
**delta_write_options,
^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/workspaces/debian-2/.venv/lib/python3.13/site-packages/deltalake/writer.py", line 323, in write_deltalake
write_deltalake_rust(
~~~~~~~~~~~~~~~~~~~~^
table_uri=table_uri,
^^^^^^^^^^^^^^^^^^^^
...<13 lines>...
post_commithook_properties=post_commithook_properties,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
ValueError: Invalid comparison operation: Utf8 == LargeUtf8
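For reference, a minimal reproduction sketch reconstructed from the call visible in the traceback. The sample data, the column V, the function name reproduce, and the initial append that creates the table are assumptions for illustration; only the table name, the mode, and the delta_write_options values are taken from the traceback. It assumes polars and deltalake 0.24.0 are installed, and it has not been confirmed that this exact sequence triggers the error.

```python
import tempfile
from pathlib import Path

# Options copied from the write_delta call shown in the traceback.
DELTA_WRITE_OPTIONS = {"engine": "rust", "predicate": "C = 'a'"}


def reproduce(tempdir: Path) -> None:
    # Imported lazily so this module can be loaded without polars installed.
    import polars as pl

    # Assumed sample data: the frame used in the original report is not shown.
    d = pl.DataFrame({"C": ["a", "a", "b"], "V": [1, 2, 3]})

    # Assumption: create the table first so the overwrite has rows to replace.
    d.write_delta(tempdir / "test.delta", mode="append")

    # Overwrite only the rows where C = 'a' (string-valued predicate).
    d.filter(pl.col("C") == "a").write_delta(
        tempdir / "test.delta",
        mode="overwrite",
        delta_write_options=DELTA_WRITE_OPTIONS,
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        reproduce(Path(tmp))
```

Replacing the string column with an integer column and the predicate with something like C = 1 corresponds to the case that is reported to work.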
What you expected to happen:
For predicate to also accept strings, or at least for the error message to be less cryptic.
More details: Use case: compiling periodic, discrete data dumps from an operational system into a central deltatable for further analysis. To avoid duplicate records when the same data-dump-file is accidentally processed twice, we added a column to the deltatable with the name of the input file, and tried "overwrite" mode together with the "predicate" argument instead of "append".
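A rough sketch of how the predicate for this use case might be built. The helper name source_file_predicate and the column name source_file are hypothetical, and doubling embedded single quotes follows the usual SQL convention, which is assumed (not verified here) to be what the predicate parser expects.

```python
def source_file_predicate(file_name: str, column: str = "source_file") -> str:
    """Build an equality predicate that selects the rows loaded from one input file.

    Hypothetical helper: the column name is an assumption, and embedded single
    quotes are doubled as in standard SQL string literals.
    """
    escaped = file_name.replace("'", "''")
    return f"{column} = '{escaped}'"


# Example: the options that would be passed as delta_write_options
# together with mode="overwrite" when re-processing a dump file.
options = {"engine": "rust", "predicate": source_file_predicate("dump_2025-01-31.csv")}
print(options["predicate"])
```

The resulting string is a string-valued predicate of the same shape as the one in the traceback, so it hits the same Utf8 == LargeUtf8 error on 0.24.0.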
This is quite strange; I thought I had updated all the places where we do this type of expression coercion. I will dive a bit deeper into this over the weekend.
Environment
Delta-rs version: 0.24.0
Binding: python
Environment: