-
-
Notifications
You must be signed in to change notification settings - Fork 18k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix TypeError when merging categorical dates #16986
Changes from 9 commits
8843c10
48e7163
218da66
a81933d
b82d117
5e8e23b
3cc5c24
04d5404
21a35a0
1ea1977
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,7 @@ | ||
# pylint: disable=E1103 | ||
|
||
import pytest | ||
from datetime import datetime | ||
from datetime import datetime, date | ||
from numpy.random import randn | ||
from numpy import nan | ||
import numpy as np | ||
|
@@ -1515,6 +1515,40 @@ def test_self_join_multiple_categories(self): | |
|
||
assert_frame_equal(result, df) | ||
|
||
def test_dtype_on_categorical_dates(self): | ||
# GH 16900 | ||
# dates should not be coerced to ints | ||
|
||
df = pd.DataFrame( | ||
[[date(2001, 1, 1), 1.1], | ||
[date(2001, 1, 2), 1.3]], | ||
columns=['date', 'num2'] | ||
) | ||
df['date'] = df['date'].astype('category') | ||
|
||
df2 = pd.DataFrame( | ||
[[date(2001, 1, 1), 1.3], | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. need testing on inner as well There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. use parametrize instead of duplicating code here There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. if you can do this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this has been updated as per your previous comment:
did you want it changed to use parametrize instead? |
||
[date(2001, 1, 3), 1.4]], | ||
columns=['date', 'num4'] | ||
) | ||
df2['date'] = df2['date'].astype('category') | ||
|
||
expected_outer = pd.DataFrame([ | ||
[pd.Timestamp('2001-01-01'), 1.1, 1.3], | ||
[pd.Timestamp('2001-01-02'), 1.3, np.nan], | ||
[pd.Timestamp('2001-01-03'), np.nan, 1.4]], | ||
columns=['date', 'num2', 'num4'] | ||
) | ||
result_outer = pd.merge(df, df2, how='outer', on=['date']) | ||
assert_frame_equal(result_outer, expected_outer) | ||
|
||
expected_inner = pd.DataFrame( | ||
[[pd.Timestamp('2001-01-01'), 1.1, 1.3]], | ||
columns=['date', 'num2', 'num4'] | ||
) | ||
result_inner = pd.merge(df, df2, how='inner', on=['date']) | ||
assert_frame_equal(result_inner, expected_inner) | ||
|
||
|
||
@pytest.fixture | ||
def left_df(): | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can do this above L898, e.g. something like
then use
lk_to
andrk_to
where you usedtyp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not in that elif block, but it could be done in the one above:
but that doesn't seem cleaner to me - if we spread the
lk_to
/rk_to
logic all over the method, then we make it much more difficult to debug compared with having all of the coercion logic in the block at the bottom where the coercion happens.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, pls add an instructive comment block here explaining what is going on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done + minor tweaks to avoid calling is_categorical_dtype repeatedly.