Struct is_null, is_not_null and drop_nulls inconsistent behavior #9443

Closed
2 tasks done
paladin158 opened this issue Jun 19, 2023 · 2 comments · Fixed by #13921
Labels: A-dtype-struct (Area: struct data type), accepted (Ready for implementation), bug (Something isn't working), P-high (Priority: high), python (Related to Python Polars)

Comments

@paladin158

paladin158 commented Jun 19, 2023

Polars version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of Polars.

Issue description

It seems that drop_nulls behaves inconsistently between Python v0.17.15 and v0.18.3, and that the behavior of is_null differs from Rust v0.30.0.

Reproducible example

import polars as pl

df = pl.DataFrame({"a": [1, 2, None], "b": [1, 2, 3]})

df.select(pl.struct(["a", "b"]).drop_nulls())
### results
## py v0.17.15
# {1,1}
# {2,2}

## py v0.18.3
# {1,1}
# {2,2}
# {null,3}

## rust v0.30.0
# {1,1}
# {2,2}

s = df.select(pl.struct(["a", "b"])).to_series()

s.is_null()
### results
## py v0.17.15
# false false false

## py v0.18.3
# false false false

## rust v0.30.0
# false false false

s.is_not_null()
### results
## py v0.17.15
# true true false

## py v0.18.3
# true true false

## rust v0.30.0
# true true false

s.drop_nulls()
### results
## py v0.17.15
# {1,1}
# {2,2}

## py v0.18.3
# {1,1}
# {2,2}
# {null,3}

## rust v0.30.0
# {1,1}
# {2,2}

Expected behavior

All three should behave like Rust v0.30.0: drop_nulls should drop a struct row if at least one of its field values is null.

I am not sure whether is_null and is_not_null should be exact opposites of each other.
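
The question comes down to two properties one might expect to hold together: is_not_null is the negation of is_null, and drop_nulls keeps exactly the rows that are not null. Here is a minimal pure-Python sketch that checks both against the results reported in the example. It makes no Polars calls, and the helper names are hypothetical.

```python
# Hypothetical helpers: they only compare the outputs reported in this issue.

def masks_are_complementary(is_null, is_not_null):
    """True if is_not_null is the element-wise negation of is_null."""
    return all(a != b for a, b in zip(is_null, is_not_null))


def drop_agrees_with(kept_mask, rows_kept):
    """True if drop_nulls kept as many rows as kept_mask marks as kept."""
    return sum(kept_mask) == rows_kept


# Masks copied from the report; they are identical in all three versions.
is_null = [False, False, False]
is_not_null = [True, True, False]

# Number of rows that drop_nulls kept in each version, per the report.
rows_kept = {"py v0.17.15": 2, "py v0.18.3": 3, "rust v0.30.0": 2}

print("complementary:", masks_are_complementary(is_null, is_not_null))
for version, kept in rows_kept.items():
    follows_is_null = drop_agrees_with([not x for x in is_null], kept)
    follows_is_not_null = drop_agrees_with(is_not_null, kept)
    print(version, "follows is_null:", follows_is_null,
          "| follows is_not_null:", follows_is_not_null)
```

On these reported results, the two masks disagree on the third row. drop_nulls in py v0.18.3 agrees with is_null, while py v0.17.15 and rust v0.30.0 agree with is_not_null.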

Installed versions

Polars v0.17.15
Polars v0.18.3
@paladin158 paladin158 added bug Something isn't working python Related to Python Polars labels Jun 19, 2023
@paladin158 paladin158 changed the title Struct drop_nulls behavior Struct is_null and drop_nulls behavior Jun 19, 2023
@paladin158 paladin158 changed the title Struct is_null and drop_nulls behavior Struct is_null and drop_nulls inconsistent behavior Jun 19, 2023
@paladin158 paladin158 changed the title Struct is_null and drop_nulls inconsistent behavior Struct is_null, is_not_null and drop_nulls inconsistent behavior Jun 20, 2023
@ritchie46
Member

There is ambiguity here. In Polars, we count a struct row as null if all of its fields are null.
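
To make the ambiguity concrete, here is a small sketch of the two candidate rules in plain Python, with struct rows modeled as dicts. The function names are hypothetical and this is not the Polars implementation. It only sets the "all fields null" rule stated above next to the "any field null" rule that the report expected for drop_nulls.

```python
def null_if_all_fields_null(row):
    """Rule stated above: a struct row is null only when every field is null."""
    return all(value is None for value in row.values())


def null_if_any_field_null(row):
    """Alternative rule: a struct row is null as soon as one field is null."""
    return any(value is None for value in row.values())


# Same data as the reproducible example, one dict per struct row.
rows = [{"a": 1, "b": 1}, {"a": 2, "b": 2}, {"a": None, "b": 3}]

all_rule = [null_if_all_fields_null(r) for r in rows]
any_rule = [null_if_any_field_null(r) for r in rows]

print("all-fields rule:", all_rule)
print("any-field rule: ", any_rule)
```

Under the "all fields" rule the row {null,3} is not null, so a complementary is_not_null would be true for all three rows and drop_nulls would keep all three. Under the "any field" rule the third row would be dropped.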

@stinodego stinodego added needs triage Awaiting prioritization by a maintainer A-dtype-struct Area: struct data type labels Jan 13, 2024
@stinodego
Member

stinodego commented Jan 22, 2024

In current Polars, everything works correctly except for:

import polars as pl

df = pl.DataFrame({"a": [1, 2, None], "b": [1, 2, 3]})

s = df.select(pl.struct(["a", "b"])).to_series()

r = s.is_not_null()
print(r)
shape: (3,)
Series: 'a' [bool]
[
        true
        true
        false
]

@stinodego stinodego added P-high Priority: high and removed needs triage Awaiting prioritization by a maintainer labels Jan 22, 2024
@stinodego stinodego self-assigned this Jan 23, 2024
@c-peters c-peters added the accepted Ready for implementation label Jan 29, 2024
4 participants