DataFrame.sort() raises PanicException: 'arg_sort operation not supported for dtype list[i64]'
#10047
Comments
Just wanted to mention that fixing this would be very helpful for a use case that I have.
@sjt-motif You could convert it to a struct as a temporary workaround:
df.select(pl.col("a").sort_by(pl.col("a").list.to_struct("max_width")))
# shape: (7, 1)
# ┌───────────┐
# │ a │
# │ --- │
# │ list[i64] │
# ╞═══════════╡
# │ [] │
# │ [0] │
# │ [1] │
# │ [1, 0] │
# │ [1, 1] │
# │ [1, 2] │
# │ [2, 3, 5] │
# └───────────┘
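For anyone wondering why this produces the order above: with "max_width", shorter lists are padded with nulls up to the longest list, and the struct is then compared field by field. Below is a minimal pure-Python sketch of that idea (not Polars code). The helper names and the input order are made up for illustration, and placing the padding before real values is an assumption chosen to match the output shown above.

```python
def pad_to_width(values, width):
    # Pad a list with None so every row has the same number of "fields".
    return tuple(values) + (None,) * (width - len(values))


def null_first_key(row):
    # Map None to (0, 0) and a real value v to (1, v), so padding
    # always compares lower than any real value at the same position.
    return tuple((0, 0) if v is None else (1, v) for v in row)


def sort_lists(lists):
    # Hypothetical helper: sort lists by their padded, field-by-field key.
    width = max((len(x) for x in lists), default=0)
    return sorted(lists, key=lambda x: null_first_key(pad_to_width(x, width)))


# Same rows as in the output above, in an arbitrary (assumed) input order.
data = [[1, 2], [], [2, 3, 5], [1], [1, 0], [0], [1, 1]]
result = sort_lists(data)
print(result)
```

On this data the result is the same as Python's built-in ordering of lists, where a list sorts before any longer list it is a prefix of.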
@cmdlineluser Thanks! It works great except the
That's quite easy for me to work around though. Thanks!
Hm yeah, not sure why. I guess this is closer to what you want:
(df.with_columns(sort_by = pl.col("a").list.to_struct("max_width"))
.sort("sort_by", descending=True, nulls_last=True))
# shape: (7, 2)
# ┌───────────┬──────────────────┐
# │ a ┆ sort_by │
# │ --- ┆ --- │
# │ list[i64] ┆ struct[3] │
# ╞═══════════╪══════════════════╡
# │ [2, 3, 5] ┆ {2,3,5} │
# │ [1, 2] ┆ {1,2,null} │
# │ [1, 1] ┆ {1,1,null} │
# │ [1, 0] ┆ {1,0,null} │
# │ [1] ┆ {1,null,null} │
# │ [0] ┆ {0,null,null} │
# │ [] ┆ {null,null,null} │
# └───────────┴──────────────────┘
Thanks @cmdlineluser, I basically did that, but dumber (manually find max length by
Your way is wayyyy shorter lol.
If this one gets fixed, then assert_frame_equal will also work for
@trinebrockhoff Even though it's related, perhaps that deserves its own issue? That particular use-case seems like something that warrants a higher priority. I also didn't realize at the time of my previous comment that you can pass expressions directly to df.sort():
df.sort(
pl.col("a").list.to_struct("max_width"),
descending=True,
nulls_last=True
)
# shape: (7, 1)
# ┌───────────┐
# │ a │
# │ --- │
# │ list[i64] │
# ╞═══════════╡
# │ [2, 3, 5] │
# │ [1, 2] │
# │ [1, 1] │
# │ [1, 0] │
# │ [1] │
# │ [0] │
# │ [] │
# └───────────┘
It also doesn't seem to be supported for lists of structs:
schema = {"value": pl.List(pl.Struct({"a": pl.Int32}))}
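A rough pure-Python sketch (not Polars code) of what descending=True with nulls_last=True appears to do here, judging from the output above: pad each list with None up to the longest length, compare position by position with larger values first, and push the padding to the end. The function name, its parameters, and the input order are hypothetical, and it assumes integer elements so that negation can reverse the order.

```python
def sort_padded(lists, descending=False, nulls_last=False):
    # Hypothetical helper mimicking a field-by-field sort on padded lists.
    width = max((len(x) for x in lists), default=0)

    def key(values):
        padded = list(values) + [None] * (width - len(values))
        out = []
        for v in padded:
            if v is None:
                # The rank decides whether padding lands before (-1)
                # or after (1) every real value at this position.
                out.append((1 if nulls_last else -1, 0))
            else:
                # Negate integers to get descending order from an
                # ascending sort without touching the padding rank.
                out.append((0, -v if descending else v))
        return tuple(out)

    return sorted(lists, key=key)


# Same rows as in the output above, in an arbitrary (assumed) input order.
data = [[1, 2], [], [2, 3, 5], [1], [1, 0], [0], [1, 1]]
print(sort_padded(data, descending=True, nulls_last=True))
```

In this sketch, dropping nulls_last while keeping descending=True moves the empty list to the front and puts [1] ahead of [1, 2], which is why the two flags are worth setting together.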
data = {"value": [[{"a": 1}, {"a": 2}]]}
df = pl.DataFrame(data, schema=schema)
df.sort("value")
Resulting in InvalidOperationError: `sort_with` operation not supported for dtype `list[struct[1]]`
It would be really nice if
@maxzw Yeah, it seems that example also causes a panic in group_by:
df.group_by("value").all()
thread '<unnamed>' panicked at crates/polars-core/src/frame/group_by/into_groups.rs:296:52:
PanicException: called `Result::unwrap()` on an `Err` value: ComputeError(ErrString("cannot sort column of dtype `list[struct[1]]`"))
Similar issue?
This seems similar:
Input dataframe is like this, from
Problem description
Using df.sort() with a list[i64] column raises an error pointing to:
https://github.com/pola-rs/polars/blob/master/polars/polars-core/src/series/series_trait.rs#L390-L399
Also mentioned as a part of #7777
Thanks!
Polars version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of Polars.
Reproducible example
Expected behavior