Incorrect type when dividing literal series of UInt64 with Int64. #6805

Closed
2 tasks done
ritchie46 opened this issue Feb 11, 2023 · 3 comments · Fixed by #6841
Labels
bug Something isn't working

Comments

@ritchie46
Member

Polars version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of Polars.

Issue description

Floor division of a UInt64 literal Series by a Python integer results in a Float64 column, where UInt64 was expected.

Reproducible example

import polars as pl

s = pl.Series(values=[1], dtype=pl.UInt64)
pl.select(pl.lit(s) // 2)

shape: (1, 1)
┌─────┐
│     │
│ --- │
│ f64 │
╞═════╡
│ 0.0 │
└─────┘

Expected behavior

shape: (1, 1)
┌─────┐
│     │
│ --- │
│ u64 │
╞═════╡
│ 0   │
└─────┘

ritchie46 added the bug label Feb 11, 2023
@ghuls
Collaborator

ghuls commented Feb 13, 2023

It is not limited to just //. Addition, multiplication, and subtraction with a Python integer also return f64:

In [11]: pl.select(pl.lit(s) + 0)
Out[11]: 
shape: (1, 1)
┌─────┐
│     │
│ --- │
│ f64 │
╞═════╡
│ 1.0 │
└─────┘

In [12]: pl.select(pl.lit(s) * 0)
Out[12]: 
shape: (1, 1)
┌─────┐
│     │
│ --- │
│ f64 │
╞═════╡
│ 0.0 │
└─────┘

In [13]: pl.select(pl.lit(s) - 0)
Out[13]: 
shape: (1, 1)
┌─────┐
│     │
│ --- │
│ f64 │
╞═════╡
│ 1.0 │
└─────┘

@ghuls
Collaborator

ghuls commented Feb 13, 2023

Somehow the Python integer is not cast to the same dtype as the series. It is treated as pl.Int64 instead, so the operation becomes pl.UInt64 and pl.Int64 -> pl.Float64, which is the same conversion that happens when you take the supertype of a pl.UInt64 series and a pl.Int64 series:

In [21]: s1 = pl.Series(values=[1], dtype=pl.UInt64)

In [22]: s2 = pl.Series(values=[1], dtype=pl.Int64)

In [23]: s2 + 1
Out[23]: 
shape: (1,)
Series: '' [i64]
[
	2
]

In [24]: s1 + 1
Out[24]: 
shape: (1,)
Series: '' [u64]
[
	2
]

In [25]: s1 + s2
Out[25]: 
shape: (1,)
Series: '' [f64]
[
	2.0
]

It works when split into two chained statements:

In [27]: pl.select(pl.lit(s).alias("a")).select(pl.col("a")  + 0)
Out[27]: 
shape: (1, 1)
┌─────┐
│ a   │
│ --- │
│ u64 │
╞═════╡
│ 1   │
└─────┘
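
The jump to f64 in `s1 + s2` above follows from the value ranges: no integer type of 64 bits or fewer covers both the UInt64 and Int64 ranges. The sketch below is a simplified, hypothetical model of that reasoning and is not Polars' actual supertype code. The `(signed, bits)` tuples and the `int_supertype` name are assumptions made only for illustration.

```python
# Hypothetical, simplified model of integer supertype resolution.
# A dtype is modelled as a (signed, bits) tuple, not a real Polars dtype.
SIGNED_WIDTHS = (8, 16, 32, 64)

UINT32 = (False, 32)
UINT64 = (False, 64)
INT64 = (True, 64)


def int_supertype(a, b):
    """Return a dtype tuple able to hold both inputs, or "Float64" if none can."""
    (a_signed, a_bits), (b_signed, b_bits) = a, b
    if a_signed == b_signed:
        # Same signedness: the wider type holds both ranges.
        return (a_signed, max(a_bits, b_bits))
    signed_bits = a_bits if a_signed else b_bits
    unsigned_bits = b_bits if a_signed else a_bits
    # Mixed signedness: a signed type needs strictly more bits than the
    # unsigned operand to represent its full range.
    for bits in SIGNED_WIDTHS:
        if bits >= signed_bits and bits > unsigned_bits:
            return (True, bits)
    # Nothing integral fits, so fall back to a float (assumed fallback).
    return "Float64"


print(int_supertype(UINT64, INT64))  # Float64
print(int_supertype(UINT32, INT64))  # (True, 64)
```

Under this model, a Python integer that is assumed to be Int64 pushes any UInt64 expression to Float64, which matches the outputs reported above.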

@ritchie46
Member Author

Yes, I think I need to check the type_coercion optimizer to see if the literal fits into the column's dtype.
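
A minimal sketch of what such a "does the literal fit" check could look like, written in plain Python rather than as the actual optimizer code. The names `INT_RANGES`, `literal_fits`, and `coerced_dtype` are hypothetical, and the fallback to Float64 is an assumption based on the behaviour reported in this issue, not on the eventual fix.

```python
# Hypothetical sketch: keep the column dtype when an integer literal fits in it.
# Dtypes are plain strings here, not real Polars dtype objects.
INT_RANGES = {
    "UInt64": (0, 2**64 - 1),
    "Int64": (-(2**63), 2**63 - 1),
}


def literal_fits(value: int, dtype: str) -> bool:
    """Return True if the integer literal is representable in the given dtype."""
    low, high = INT_RANGES[dtype]
    return low <= value <= high


def coerced_dtype(column_dtype: str, literal: int) -> str:
    """Pick the result dtype for `column <op> literal` under this sketch."""
    if literal_fits(literal, column_dtype):
        # The literal can be cast to the column dtype, so no supertype is needed.
        return column_dtype
    # Assumed fallback: the supertype reported in this issue.
    return "Float64"


print(coerced_dtype("UInt64", 2))   # UInt64
print(coerced_dtype("UInt64", -1))  # Float64
```

With a check like this, `pl.lit(s) // 2` would keep the u64 dtype from the expected output, while a negative literal would still need a wider type.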
