NIF panic with list of struct #1011
Comments
Hi @maennchen! We appreciate you putting our struct dtype logic through its paces :P I'll take a look at this today or tomorrow. Related, I want to add a property test like this one but for The generators would be similar, but the goal would be to see if we get a panic. Hopefully such a test would catch more bugs like these up front.
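As a rough illustration of the property-test idea above (sketched in Python with only the standard library, not the Elixir test itself): generate random rows for a list-of-struct column, hand them to a builder, and report the first input that raises. `gen_rows`, `find_failure`, and the `build` callback are hypothetical names, not part of Explorer or Polars.

```python
import random


def gen_rows(rng, max_rows=4, max_len=3):
    """Generate rows for a list-of-struct column: each row is a list of dicts.

    Rows may be empty, so inputs such as [[], [{"field": "a"}]] can come up.
    """
    rows = []
    for _ in range(rng.randint(1, max_rows)):
        length = rng.randint(0, max_len)
        rows.append(
            [{"field": rng.choice(["a", "b", "example"])} for _ in range(length)]
        )
    return rows


def find_failure(build, trials=200, seed=0):
    """Return the first generated input for which build(rows) raises, else None.

    BaseException is caught on the assumption that a Rust panic surfaces as an
    exception outside the Exception hierarchy (as pyo3's PanicException does).
    A segmentation fault cannot be caught this way.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        rows = gen_rows(rng)
        try:
            build(rows)
        except (KeyboardInterrupt, SystemExit):
            raise
        except BaseException:
            return rows
    return None
```

Here `build` would be whatever constructs and inspects the dataframe under test. A fixed seed keeps a failing input reproducible between runs.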
Progress so far
I've boiled down the problem to this:
dtypes = [{"col", {:list, {:struct, [{"field", :category}]}}}]
content = %{"field" => "example"}
works = [[content], []]
fails = [[], [content]]
DF.new([{"col", works}], dtypes: dtypes) |> dbg #=> works
DF.new([{"col", fails}], dtypes: dtypes) |> dbg #=> fails with panic
The actual panic occurs here: explorer/native/explorer/src/encoding.rs, line 738 in 07f80fb
The
But the real problem stems from how we built the series originally: explorer/native/explorer/src/series/from_list.rs, lines 428 to 431 in 07f80fb
Somehow, the result of
This seems like a bug in Polars to me. But maybe
EDIT: Managed to reproduce this in Python. I think this is a Polars bug.
>>> import polars as pl
>>> dtype = pl.datatypes.List(pl.datatypes.Struct({"field": pl.datatypes.Categorical()}))
>>> pl.DataFrame({"col": [[{"field": "example"}], []]}, schema={"col": dtype})
shape: (2, 1)
┌─────────────────┐
│ col             │
│ ---             │
│ list[struct[1]] │
╞═════════════════╡
│ [{"example"}]   │
│ []              │
└─────────────────┘
>>> pl.DataFrame({"col": [[], [{"field": "example"}]]}, schema={"col": dtype})
thread '<unnamed>' panicked at /Users/runner/work/polars/polars/crates/polars-arrow/src/array/binview/mod.rs:308:9:
assertion failed: i < self.len()
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/to/.venv/lib/python3.12/site-packages/polars/dataframe/frame.py", line 1149, in __repr__
    return self.__str__()
           ^^^^^^^^^^^^^^
  File "/path/to/.venv/lib/python3.12/site-packages/polars/dataframe/frame.py", line 1146, in __str__
    return self._df.as_str()
           ^^^^^^^^^^^^^^^^^
pyo3_runtime.PanicException: assertion failed: i < self.len()
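The traceback above points at `__repr__`, which suggests the dataframe is constructed without error and the panic only surfaces when it is rendered. A minimal sketch for telling the two stages apart, using only the standard library; `failing_stage` and its `construct`/`render` callbacks are hypothetical names introduced for illustration:

```python
def failing_stage(construct, render=repr):
    """Return "construct", "render", or None, depending on where an error surfaces.

    BaseException is caught on the assumption that the panic arrives as
    pyo3's PanicException, which sits outside the Exception hierarchy.
    """
    try:
        value = construct()
    except (KeyboardInterrupt, SystemExit):
        raise
    except BaseException:
        return "construct"
    try:
        render(value)
    except (KeyboardInterrupt, SystemExit):
        raise
    except BaseException:
        return "render"
    return None
```

With Polars installed, passing a zero-argument function that builds the failing dataframe would be expected to return "render", going by the traceback; this has not been checked across Polars versions.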
* Allow specifying the dtype generator directly
  Before we could only specify the leaf part of the dtype tree generator. This change lets us alternately specify the whole thing.
* Remove all `:skip` tags
  If we exclude `:category`, we can avoid #1011. Note: there may still be issues with `:category` but we won't be able to find them until #1011 is resolved.
Code
Expected
A working Dataframe
Actual
Note: This error happens while inspecting the result. However, if I pass the result on to `Explorer.DataFrame.dump_ndjson/1`, I also get a segmentation fault.
Context
In production, the error looks slightly different, but I'm unable to reproduce the exact error without sharing the confidential information involved. I can however provide the error without stacktrace:
Since the part with `assertion failed: i < self.len()` is the same, I think the provided reproduction should represent the error sufficiently.