DEPR: dropping nuisance columns in DataFrame reductions #41480
looks good. can you add a sub-section in deprecations as this is fairly user visible. ping on green.
pandas/core/frame.py
Outdated
@@ -9800,6 +9800,21 @@ def _get_data() -> DataFrame:
# Even if we are object dtype, follow numpy and return
# float64, see test_apply_funcs_over_empty
out = out.astype(np.float64)

if numeric_only is None and out.shape[0] != df.shape[1]:
    # columns have been dropped
can you add the issue number here and below (this PR number is fine).
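For readers following the hunk above, here is a minimal sketch of the idea behind the `out.shape[0] != df.shape[1]` check: if the reduced result has fewer entries than the frame has columns, some column was dropped along the way. This is not the code in `pandas/core/frame.py`; the helper name `reduce_dropping_nuisance`, the sample frame, and the warning text are all made up for illustration, and it only emulates the `numeric_only=None` case.

```python
import warnings

import pandas as pd


def reduce_dropping_nuisance(df, func):
    """Hypothetical helper (not a pandas API): apply ``func`` column by
    column, skip columns on which it raises TypeError, and warn if any
    column was skipped."""
    results = {}
    for name in df.columns:
        try:
            results[name] = func(df[name])
        except TypeError:
            # A "nuisance" column: the reduction is not defined for it.
            continue
    out = pd.Series(results, dtype="float64")
    # Same comparison as in the hunk: fewer results than columns means
    # at least one column was dropped.
    if len(out) != df.shape[1]:
        # Illustrative wording only, not the message pandas emits.
        warnings.warn(
            "Dropping nuisance columns in reductions is deprecated; "
            "select only valid columns before calling the reduction.",
            FutureWarning,
            stacklevel=2,
        )
    return out


# Made-up data: one numeric column and one string column.
df = pd.DataFrame({"a": [1, 2, 3], "name": ["x", "y", "z"]})
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = reduce_dropping_nuisance(df, lambda col: col.mean())
```

Here `result` should contain only the `a` column, and `caught` should hold the `FutureWarning` raised because `name` was skipped.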
added a whatsnew subsection. this is actually just the first half of the note I'm about to push for #41475. the smart money says I've made a mess of the rst conventions regarding
doc/source/whatsnew/v1.3.0.rst
Outdated
Deprecated Dropping Nuisance Columns in DataFrame Reductions and DataFrameGroupBy Operations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When calling a reduction (.min, .max, .sum, ...) on a :class:`DataFrame` with
- When calling a reduction (.min, .max, .sum, ...) on a :class:`DataFrame` with
+ The default of calling a reduction (.min, .max, .sum, ...) on a :class:`DataFrame` with
``numeric_only=None`` will silently ignore and drop from the result nuisance columns, e.g. a string column in a .mean() reduction.
doc/source/whatsnew/v1.3.0.rst
Outdated
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When calling a reduction (.min, .max, .sum, ...) on a :class:`DataFrame` with
``numeric_only=None`` (the default), columns on which the reduction raises ``TypeError``
are silently ignored and dropped from the result. This behavior is deprecated.
Start a new paragraph with 'This behavior is deprecated'
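As a rough illustration of what the whatsnew note asks of users, the snippet below shows two ways to avoid relying on the deprecated default: pass `numeric_only=True` explicitly, or select the valid columns before reducing. The frame and its column names are invented for the example; this is a sketch, not text from the whatsnew entry.

```python
import pandas as pd

# Made-up frame with two numeric columns and one string ("nuisance") column.
df = pd.DataFrame(
    {"a": [1, 2, 3], "b": [4.0, 5.0, 6.0], "name": ["x", "y", "z"]}
)

# Option 1: ask for numeric columns only, explicitly.
explicit = df.mean(numeric_only=True)

# Option 2: select the columns the reduction is valid for, then reduce.
selected = df[["a", "b"]].mean()
```

Both calls state which columns take part in the reduction, so neither depends on columns being dropped silently.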
lgtm. can you rebase once more and ping on green.
ping
…0 and enabling tests

### What changes were proposed in this pull request?

This PR proposes to match the behavior with pandas 2.0.0 and above for stat functions, such as `sum`, `quantile`, `prod`, etc. See pandas-dev/pandas#41480 and pandas-dev/pandas#47500 for more detail.

### Why are the changes needed?

To match the behavior to latest pandas.

### Does this PR introduce _any_ user-facing change?

Yes, the behaviors for stat funcs are now matched with pandas 2.0.0 and above.

### How was this patch tested?

Enabling & updating the existing UTs.

Closes #42526 from itholic/pandas_stat.

Authored-by: itholic <haejoon.lee@databricks.com>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
Discussed on this week's call