Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(python): typing for Series methods that can return None #6690

Conversation

MatveyF
Copy link
Contributor

@MatveyF MatveyF commented Feb 5, 2023

Resolves #6623

Fix the type hints for some Series methods that return None.

Also Resolves this

In [4]: pl.Series([None]).cast(pl.Datetime).dt.median()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In [4], line 1
----> 1 pl.Series([None]).cast(pl.Datetime).dt.median()

File ~/tmp/.venv/lib/python3.8/site-packages/polars/internals/series/datetime.py:94, in DateTimeNameSpace.median(self)
     74 """
     75 Return median as python DateTime.
     76
   (...)
     91
     92 """
     93 s = pli.wrap_s(self._s)
---> 94 out = int(s.median())
     95 return _to_python_datetime(out, s.dtype, s.time_unit)

TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

This will now return None instead of raising an error.

Notes:

  • methods affected are min(), max(), mean(), median(), and quantile()
  • added unit tests for median() & mean() when used with datetime

@github-actions github-actions bot added fix Bug fix python Related to Python Polars labels Feb 5, 2023
@alexander-beedie
Copy link
Collaborator

alexander-beedie commented Feb 6, 2023

Nice; FYI, as the tests have to run on python 3.7, you can't use the walrus operator := (which requires >= 3.8). If you can fix that, everything else looks good ;)

Copy link
Collaborator

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @MatveyF

Sorry to jump in, just wanted to point out that this would break the following:

result = pl.date_range(datetime(1969, 12, 31), datetime(1970, 1, 2), "1d").dt.median()
assert result == datetime(1970, 1, 1)

You probably need to explicitly check if out is not None rather than just if out

@MatveyF MatveyF force-pushed the fix(python)/typing-Series-methods-return-None branch from 4c966d8 to e4da7d2 Compare February 6, 2023 13:29
@MarcoGorelli
Copy link
Collaborator

Nice! I think you also fixed a bug here - the following fails on master

In [4]: pl.Series([None]).cast(pl.Datetime).dt.median()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In [4], line 1
----> 1 pl.Series([None]).cast(pl.Datetime).dt.median()

File ~/tmp/.venv/lib/python3.8/site-packages/polars/internals/series/datetime.py:94, in DateTimeNameSpace.median(self)
     74 """
     75 Return median as python DateTime.
     76
   (...)
     91
     92 """
     93 s = pli.wrap_s(self._s)
---> 94 out = int(s.median())
     95 return _to_python_datetime(out, s.dtype, s.time_unit)

TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

but works in your branch. Might be worth adding it as a test case to make sure nobody accidentally brings it back in the future?

…f a Series of `None` was casted to `pl.Datetime` first.
Copy link
Collaborator

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No objections, over to maintainers (test could be parametrised, instead of having two separate tests, something like

@pytest.mark.parametrize(
    ('value', 'expected'),
    [(dt.datetime(1970, 1, 1), (dt.datetime(1970, 1, 1)], [None, None])
)
def test_median(value, expected) -> None:
    result = pl.Series([value]).cast(pl.Datetime).dt.median()
    assert result == expected

, but I might be nitpicking here, feel free to ignore me 😄 )

@ritchie46
Copy link
Member

Thanks!

@ritchie46 ritchie46 merged commit 0ed60b3 into pola-rs:master Feb 7, 2023
Vincenthays pushed a commit to Vincenthays/polars that referenced this pull request Feb 9, 2023
@MatveyF MatveyF deleted the fix(python)/typing-Series-methods-return-None branch February 17, 2023 10:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix Bug fix python Related to Python Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Typing: pl.Series().min(), max(), probably others can also return None
4 participants